Exploiting exotic LHC datasets for long-lived new particle searches

Motivated by the expectation that new physics may manifest itself in the form of very heavy new particles, most of the operation time of the LHC is devoted to $pp$ collisions at the highest achievable energies and collision rates. The large collision rates imply tight trigger requirements, including high thresholds on the final-state particles' transverse momenta $p_{T}$, and an intrinsic background in the form of particle pileup produced by different collisions occurring during the same bunch crossing. This strategy is potentially sub-optimal for several well-motivated new physics models in which new particles are not particularly heavy and can escape the online selection criteria of the multi-purpose LHC experiments due to their light mass and small coupling. A solution may be offered by complementary datasets that are routinely collected by the LHC experiments. These include heavy ion collisions, low-pileup runs for precision physics, and the so-called 'parking' and 'scouting' datasets. While some of them are motivated by other physics goals, they all share the use of mild $p_{T}$ thresholds at the trigger level. In this study, we assess the relative merits of these datasets for a representative model whose particularly clean signature features long-lived resonances yielding displaced dimuon vertices. We compare the reach across those datasets for a simple analysis, simulating LHC data in Run 2 and Run 3 conditions with the Delphes simulation. We show that the scouting and parking datasets, which afford low-$p_{T}$ trigger thresholds by only using partial detector information and by delaying the event reconstruction, respectively, have a reach comparable to the standard $pp$ dataset with conventional thresholds. We also show that heavy ion and low-pileup datasets are far less competitive for this signature.


Introduction
Since the inception of the Large Hadron Collider (LHC), the experiments at its ring have collected a tremendous amount of data and utilised them to search for hints of physics beyond the Standard Model (BSM). Most of the run time at the LHC and the data stored by its main experiments are dedicated to high-energy proton-proton (pp) collisions. However, a fraction of time and storage are also devoted to special datasets motivated by specific physics goals. Examples include heavy ion runs (nucleus-nucleus or proton-nucleus collisions), motivated by the study of the quark-gluon plasma and other high-energy nuclear physics phenomena, and low-luminosity pp runs designed to provide clean events for precise measurements of some Standard Model (SM) parameters for which no large amounts of data are necessary.
Any dataset, regardless of its nature, is filtered by triggers, i.e. sets of selection criteria that are applied online, based on partial detector information in order to provide a sufficiently rapid decision, since storing all collision events is impossible for a hadron collider such as the LHC. Both ATLAS and CMS can store events on tape at a maximum rate of the order of one thousand events per second, shared across a large number of trigger paths. Therefore, the bandwidth allocated to each trigger is limited, with a larger bandwidth allocated to the triggers that give more sensitivity to the studies assigned higher priority by the collaborations that operate the experiments.
As the LHC luminosity increases, there are two basic strategies at the trigger level to cope with the increased collision rate within a fixed bandwidth: tightening the selection criteria, in particular the p_T thresholds, or keeping the same thresholds but recording only a subset of the events that would otherwise pass the trigger selection. The latter strategy is called prescaling. The former strategy is traditionally favoured in BSM searches, based on the expectation that any new particles would be massive and therefore decay into high-p_T final-state particles or generate a large transverse momentum imbalance E_T^miss when particles remain undetected. This expectation is supported by the most popular models of new physics, such as supersymmetry, extra dimensions, and gauge unification models.
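Prescaling is simple enough to state in code; the following minimal sketch is purely illustrative (the function and event list are not from any experiment's software):

```python
def prescale(events, factor):
    """Keep every `factor`-th event passing the trigger, so the recorded
    rate is reduced by 1/factor while the selection itself is unchanged."""
    return events[::factor]

# A prescale factor of 100 turns 1000 trigger-accepted events into 10 recorded ones.
recorded = prescale(list(range(1000)), 100)
print(len(recorded))  # 10
```

The selection efficiency for any given signal is unchanged; only the effective integrated luminosity of the prescaled trigger path is reduced.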
Trigger thresholds of the LHC experiments are carefully optimised such that the primary stored pp dataset is sensitive to as many new physics signals as possible. However, it has been suggested that new physics could be produced in collisions at the LHC but then be disregarded by the trigger requirements; see e.g. [ -] and references therein. This is especially true for BSM models that predict soft decays, i.e. which do not lead to events with large-p_T particles or large E_T^miss. Examples include axion-like particles (ALPs), heavy neutral leptons (HNLs), or new gauge bosons, each with masses of the order of a few GeV [ -]. While the authors of reference [ ] argue for new ways of triggering for the standard dataset, we argue that some non-standard datasets already employ triggers with sufficiently low thresholds. We therefore explore whether those exotic datasets are promising in this respect and whether enhancing the resources allocated to them in future runs would be worthwhile.
This article reports a comparative study of the prospects of constraining this kind of BSM physics using the standard pp collisions with large pileup, a low-pileup pp sample, heavy ion collisions, and pp collisions saved with the scouting and parking approaches. We rely on simulated data, where the Delphes fast detector simulation [ ] is used to emulate the detector effects of a generic LHC multi-purpose detector in Run 2 and Run 3 conditions.
This article is organised as follows. Section  presents a specific category of BSM signals, predicted in so-called hidden sector models, which are notoriously difficult to detect with the multi-purpose LHC experiments, and elaborates on a particular signature (displaced dimuons) that is found in several of those models and is experimentally clean. The alternative datasets collected by the LHC experiments are described in section . Section  provides details on the simulation of signal and background events under the conditions corresponding to the various datasets considered. Section  describes a simple data-analysis strategy for identifying the benchmark signature, whose results are presented in section . Finally, we summarise the lessons learned in section . The predictions for the sensitivities achievable during Run 3 are collected in appendix A, and the code used for the generation of signal and backgrounds is given in appendix B.

Low scale hidden sectors
Many models predict new feebly interacting light degrees of freedom, such as ALPs, HNLs, and additional gauge bosons, potentially acting as messengers to an extensive hidden sector [ , ]. Despite their small masses, they may have evaded detection until now due to their small couplings, which can induce tiny production cross sections and unusually long lifetimes. However, they might appear in high-luminosity experiments or dedicated low-energy collider datasets. Therefore, the exotic datasets considered here present an exciting opportunity to search for manifestations of such models. For this study, we use a benchmark model with a scalar particle and adjust its coupling such that it is long-lived and therefore decays at a secondary vertex. While the coupling strength in this specific simplified model is not particularly well-motivated on its own, the signature of a displaced decay with soft tracks generated by this simple model is a common prediction of more complicated models and a worthwhile target for dedicated searches. We therefore treat this simplified benchmark model as a proxy for a class of BSM models featuring such a signature.

Benchmark model
Additional particles with spin zero can solve significant problems in particle physics, such as the strong CP problem of quantum chromodynamics (QCD). One famous example is the axion originating in the spontaneous breaking of the anomalous global U(1) Peccei-Quinn symmetry [ ]. The associated (pseudo) Nambu-Goldstone boson (PNGB) has a fixed relation between coupling strength, mass, and the symmetry-breaking scale. However, here we consider more generic pseudoscalar particles, generally called ALPs, which are not subject to this restriction.
Effective field theories (EFTs) provide a model-independent approach to parameterising the effects of potential new physics in low-energy data [ ]. One example is the Standard Model effective field theory (SMEFT) [ ], built in terms of towers of operators constructed out of the SM fields and respecting the SM symmetries, ordered by their mass dimension. It is possible to add the interactions of generic PNGBs to the SMEFT [ ]. The resulting model has been published in the form of the FeynRules [ ] model file ALPsEFT [ ]. The part of the ALP EFT Lagrangian relevant to this study involves the ALP field a with mass m_a and decay constant f_a. The coefficient of its coupling to the gluon field strength tensor G_µν and its dual G̃_µν ≡ ε_µνρσ G^ρσ/2 is c_G (CGtil in the model file), while the coefficient of its coupling, after electroweak symmetry breaking, to a fermion f with mass m_f is c_aφ (CaPhi). In simple models, the decay constant f_a is generically much larger than the electroweak scale; for our purposes, it suffices to choose it such that this new physics scale is beyond the reach of current experiments, f_a ∼ O(TeV). The values of the couplings in the vicinity of the parameter space we are interested in are constrained by prior experiments. For example, for ALP masses m_a ≲ 60 MeV, a limit on the ALP-gluon coupling of order |4c_G/f_a| ≲ 10⁻⁵ GeV⁻¹ has been set, and for masses 1 MeV ≲ m_a ≲ 3 GeV, a limit on the ALP-fermion interaction of the order of |c_aφ|/f_a < (10⁻⁸–10⁻⁶) GeV⁻¹ has been set [ ].
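In a common ALP-EFT convention, the interaction terms just described take (up to normalisation choices, which may differ in the ALPsEFT implementation) the schematic form

```latex
\mathcal{L}_a \supset \frac{1}{2}\,\partial_\mu a\,\partial^\mu a
  - \frac{m_a^2}{2}\,a^2
  - c_G\,\frac{a}{f_a}\,G^{A}_{\mu\nu}\tilde{G}^{A\,\mu\nu}
  - i\,c_{a\phi}\,\frac{m_f}{f_a}\,a\,\bar{f}\gamma_5 f\,,
\qquad
\tilde{G}^{\mu\nu} \equiv \tfrac{1}{2}\,\epsilon^{\mu\nu\rho\sigma}G_{\rho\sigma}\,.
```

This is a sketch of the standard form of the gluon and fermion couplings, not necessarily the exact operator basis used in the model file.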

Datasets
In the following, we introduce the LHC datasets compared in this study. We present the actual conditions for those datasets during LHC Run 2, which took place between 2015 and 2018, and those that can be realistically expected for the next few years during the recently started Run 3. For the latter, it is important to stress that the resources allocated to non-standard datasets, meaning the running time, the bandwidth, and the data volume, depend on decisions based on the community consensus regarding the scientific priorities of the experiments and on negotiations between several analysis teams. Consequently, for Run 3, any estimate of the future amount of events is essentially an educated guess. Therefore, the predictions of this study for future datasets are presented in a form that is easily scalable to different amounts of data.

Standard pp dataset
During most of the LHC running time, the aim is to maximise the delivered instantaneous luminosity. The proton beams cross every 25 ns, and several pp collisions occur during the same bunch crossing, leading to the pileup of several uncorrelated collisions in a single recorded event. This pileup is a nuisance for the measurements, as it obfuscates the interpretation of the events, particularly the kinematic reconstruction of decay chains.
During Run 2, each of the two multi-purpose experiments, ATLAS and CMS, accumulated about 140 fb⁻¹ at a centre-of-mass (CM) energy of √s = 13 TeV. While the average pileup for both experiments was about 35 interactions per recorded event, there were large differences in the pileup profile between different data-taking periods [ , ]. Typical muon selection thresholds for single-muon and dimuon triggers can be found e.g. in [ , ]. The goal for the recently started Run 3 is to accumulate about twice that amount of data at the slightly larger collision energy of √s = 13.6 TeV. Various accelerator parameters will be tuned towards that end. One of the side effects will be an increase in the average pileup, which will roughly double with respect to Run 2. Although considerably lower than the very high pileup conditions expected during the high-luminosity phase of the Large Hadron Collider (HL-LHC) [ ], i.e. from Run 4 onward, such an increase of the average pileup in Run 3 may lead to a slight degradation of the performance of the particle identification and reconstruction algorithms of the experiments; for example, it may affect the discriminating power of lepton isolation and the identification of displaced vertices. For simplicity, we approximate the pileup profile of Run 2 by generating the corresponding Monte Carlo (MC) sample with a mean of 35 pileup interactions per event at √s = 13 TeV. Similarly, to emulate Run 3 data, we generate an MC sample with a mean of 70 interactions per event at √s = 13.6 TeV.
Low-pileup dataset
For a short period during every run, the LHC is deliberately operated at very low instantaneous luminosity to ensure low-pileup conditions. Such an experimentally clean dataset can be optimally exploited for measurements that are not limited by statistical uncertainty but are critically affected by detector systematics. A prime example is the determination of the W boson mass, a critical input to any SM consistency check. In the context of our study, which concerns long-lived particles (LLPs), the absence of pileup brings the specific advantage of unambiguous identification of the primary interaction vertex and precise reconstruction of secondary vertices.
The Run 2 low-pileup dataset at √s = 13 TeV mostly consists of about 0.2 fb⁻¹ collected in a few days towards the end of Run 2, with an average pileup of two collisions per bunch crossing [ ]. Given the negligible degradation of detector performance at such a low pileup, we approximate these conditions by simulating a dataset without any pileup. An additional 0.3 fb⁻¹ of low-pileup data was collected at √s = 5.02 TeV during Run 2, primarily to serve as reference data for heavy ion studies, but is not considered here. In this study, we assume muon p_T thresholds of 17 and 8 GeV for the single-muon and dimuon triggers, respectively, as applied by CMS in its low-pileup runs [ ]. It is not easy to foresee the amount of low-pileup data that the multi-purpose experiments will be willing to accumulate in Run 3. The priority of this kind of data may have risen after the recent publication of the legacy CDF measurement of the W mass [ ], which is in tension with previous measurements and therefore urgently demands updates from the LHC experiments. In this study, we assume 0.5 fb⁻¹ in Run 3, i.e. the same amount of low-pileup data as during the entire Run 2, but wholly at high energy, which in this case means √s = 13.6 TeV. This assumption is arbitrary and should be understood as an order-of-magnitude estimate.

Scouting dataset
The concept of 'scouting' was introduced by the CMS experiment in Run 1 [ -]. After a pilot scouting run, it has been operated regularly in CMS. During Run 2, it has also been used by ATLAS [ , ] (where it is called 'trigger-level analysis') and LHCb [ ] (where it is known as the 'turbo stream').
Scouting is based on assigning a fraction of the bandwidth to a stream of data with reduced event content. The name comes from the possibility of scouting those data very early for the presence of striking signs of new physics (e.g. narrow resonances) that do not require the full power of a holistic offline analysis to be identified and that would otherwise be filtered away by the tight standard triggers. Scouting events are acquired in parallel with standard pp data. Therefore, they share the same accelerator conditions, particularly the amount of pileup. Only a minimal amount of high-level information from the online reconstruction is stored for the selected events. This implies that the reconstruction of the physics objects is less precise in this dataset than in the standard pp one, as the online reconstruction algorithms are optimised for speed rather than for resolution or other performance metrics. However, this strategy permits a larger fraction of events to be stored and analysed, allowing, in particular, looser trigger thresholds.
The LHC experiments have used the scouting data to search for dijet [ , ] and dimuon [ ] resonances, as well as LLPs [ ]. Similarly to the scouting-based search reported in reference [ ], this study is based on a two-muon final-state signature; we therefore assume the same Run 2 integrated luminosity as utilised in that publication. Although the CMS scouting stream was taking data during the entire Run 2, a high-rate dimuon trigger appropriate for the type of analysis studied in this paper was employed only in 2017 and 2018, with a low dimuon p_T threshold, accumulating 96.6 fb⁻¹ [ ]. For Run 3, CMS reverts to the original scouting design entirely based on particle-flow objects, increases the bandwidth, and makes the data format more offline-like. In our study, instead of doubling the Run 2 dataset as assumed for the standard pp dataset, we assume a factor of three increase. This accounts for the following factors:
• The dimuon trigger appropriate for this final state was not applied in all sub-periods of Run 2, while it is likely to be used from start to end in Run 3 and future data acquisitions.
• Since scouting is now a more established concept whose usefulness is more broadly appreciated than just a few years ago, a larger bandwidth is now allocated to it.
However, such an increase of usable scouting data from Run 2 to Run 3 is, to a large extent, an educated guess, as the bandwidth allocated to scouting is a tuneable parameter to be fixed according to community priorities.

Parking dataset
The 'parking' concept consists of storing a fraction of the triggered data without running the prompt reconstruction algorithms. The event reconstruction is delayed to later periods without data taking, when the experiments' computing resources are not used at full capacity. The CMS experiment applied this concept for the first time with about twelve billion events collected in 2018, of which roughly ten billion are estimated to contain b-quarks [ ]; these were processed after the end of Run 2. This dataset was specifically called 'b-parking' because it was motivated by b-physics goals and therefore designed such that it mostly contained events with b-quarks, with a set of triggers that most notably included a non-isolated single-muon path with a low p_T threshold. In principle, parking can be optimised for other physics studies, and additional trigger paths following the same underlying approach of loosening the selection with respect to the standard LHC dataset are being considered.
In this study, we work with a simplified definition of a b-parking dataset, which assumes that all the events are collected through a single-muon path with displacement but without requiring the muons to be isolated, meaning they do not have to be separated from other particles in the same event.
The number of events to be collected by parking during Run 3 is potentially much larger than in Run 2, but difficult to foresee, as it depends on future available computing resources and priorities within the LHC experiments. For Run 3, we scale the number of events by a factor of four, assuming that for every year of running it will be possible to park as many events as in the b-parking dataset of 2018. Since resources are saturated at the start of an LHC fill, when the instantaneous luminosity and hence the pileup are highest, events start to be parked only when the instantaneous luminosity falls below a specific value. Therefore, pileup is, in principle, milder in the parking dataset than in the standard pp one. This effect is challenging to model reliably, so for simplicity we conservatively assume that the average pileup is identical to that of the standard pp dataset.

Heavy ion dataset
Besides the standard pp collisions, the LHC experiments collect heavy ion data. While primarily collected in PbPb collisions, heavy ion data can also be gathered from pPb ones. Moreover, some amount of OO and pO data is scheduled for Run 3, and several other options are being discussed for the HL-LHC [ , ]. Typically, one month of every year is allocated to heavy ion operations, compared to six or seven months of pp collisions in a typical LHC year. The primary purpose of these special runs is to deepen our understanding of the quark-gluon plasma known to have permeated the early universe. Related topics in high-energy nuclear physics are also addressed with the same data. Furthermore, it has recently been realised that the heavy ion dataset constitutes a suitable environment for certain new physics searches, for which analyses based on these data can outperform the pp-based ones [ , ]. For example, searches for magnetic monopoles [ -] and ALPs [ -] have been carried out. It has also been proposed to search for LLPs in heavy ion collisions [ , ], as well as in electron-ion ones [ ]. Additionally, attention has been given to searches for dark photons [ ], strangelets [ , ], and sphalerons [ ], as well as to studying the g − 2 of the τ lepton [ , ].
Thanks to the relatively low event rate, dimuon events can be selected at the trigger level without any explicit p_T threshold (see e.g. [ , ]), which roughly corresponds to an implicit threshold of 3 GeV arising from the maximum track curvature that still permits a muon to cross the detector.

Table: Comparison of the p_T thresholds of the single-muon and dimuon triggers simulated in this study, together with the integrated luminosities collected during LHC Run 2 and those expected after Run 3 for all datasets considered in this paper, and the main approximation applied in the simulation. Note that the integrated luminosity of the parking dataset is estimated from the number of events collected by CMS in Run 2 after comparison with the b-quark cross section calculated using Pythia.

Dataset      single-µ p_T [GeV]   dimuon p_T [GeV]   Run 2 [fb⁻¹]   Run 3 [fb⁻¹]   Approximation
Low-pileup   17                   8                  0.2            0.5            Zero pileup
Heavy ion    -                    3                  1.6 × 10⁻⁶     9.6 × 10⁻⁶     Only PbPb
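The implicit threshold can be estimated from the bending of a charged track in the solenoidal field: a muon reaches a radial distance R only if its bending radius exceeds R/2, i.e. p_T > 0.3·B·R/2 (p_T in GeV, B in tesla, R in metres). The numbers below (B = 3.8 T, as in CMS, and an illustrative radius) are assumptions for the sketch:

```python
def min_pt_to_reach(radius_m, b_field_t=3.8):
    """Minimum transverse momentum (GeV) for a charged track to reach a given
    radial distance in a solenoidal field: the bending radius r = pT / (0.3 B)
    must exceed radius/2 for the track to escape rather than curl up."""
    return 0.3 * b_field_t * radius_m / 2.0

# Reaching the muon chambers at an illustrative radius of ~5 m requires a few GeV.
print(round(min_pt_to_reach(5.0), 2))  # 2.85
```

This back-of-the-envelope estimate reproduces the few-GeV scale of the implicit threshold quoted above.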
The main limitation of the heavy ion datasets is that their maximum instantaneous luminosity is orders of magnitude lower than that of pp collisions. This is due to disruptive electromagnetic effects that depend on large powers of the atomic number Z and therefore penalise heavy ion collisions much more than light ion ones. This is discussed at length e.g. in [ ], while the LLP search proposed in [ , ] scans over several nuclear species to identify the optimal one taking those effects into account. Moreover, the maximum achievable CM energy per nucleon is smaller than in standard pp operations, since the acceleration depends on Z while the inertia depends on the mass number A. The CM energy per nucleon was 5.02 TeV for PbPb collisions during Run 2. At the time of writing, the LHC experiments are discussing the CM energy to be used in Run 3, in particular whether to keep the same CM energy as during Run 2 or to take data at a higher energy. The theoretical maximum that the LHC can potentially reach is 5.5 TeV.
Two advantages of heavy ion data partially compensate for these limitations: the multiplicity of partonic interactions scales approximately as A², which in the case of ²⁰⁸₈₂Pb beams implies an enhancement factor of ∼ 4 × 10⁴, and the triggers are much looser because of the low instantaneous luminosity. Moreover, heavy ion collisions provide a low-pileup environment, although the track multiplicity is much higher than in individual pp collisions. This difference in track multiplicity is largely compensated once pp collisions include O(100) pileup interactions per event, as calculated in [ ]. The large track multiplicity does not dramatically affect the analysis performance for clean final states with muons, whereas the low vertex multiplicity is a genuine advantage if the signal muons are predicted to be displaced.
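For ²⁰⁸₈₂Pb, the scaling factors quoted in the text are simple to evaluate (the Z⁴ factor is the one relevant for γγ-initiated processes):

```python
A, Z = 208, 82  # mass number and atomic number of lead

# Partonic-interaction multiplicity scales roughly as A^2 ...
print(f"A^2 = {A**2:.1e}")  # 4.3e+04, the enhancement quoted in the text
# ... while gamma-gamma initiated cross sections scale as Z^4.
print(f"Z^4 = {Z**4:.1e}")  # 4.5e+07
```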
The enhancement is even more spectacular for processes initiated by γγ interactions, as in that case the cross sections scale as Z⁴. This effect has been exploited in the searches presented in [ , , ].

Figure: Feynman diagram of the pp-initiated ALP signal process introduced in section . The ALP constitutes a simple example of the light, feebly interacting, and long-lived new-particle signatures considered in this paper.
For this study, we do not consider the impact of data collected during the collisions of other ions.
For the datasets presented in this section, table  summarises the lowest thresholds for the single-muon and dimuon triggers, together with the integrated luminosity collected during Run 2 and the one assumed for Run 3. Furthermore, we give the main approximation applied when simulating each dataset.

Monte Carlo simulation
In the following, we present the details of the MC simulations of the signal and background events for all the datasets, and subsequently discuss the simulation of the detector effects. All adjustments to the simulation cards are collected in appendix B.

Signal simulation
The signal events are generated using the FeynRules [ ] model file ALPsEFT [ ]. The signal process consists of an ALP produced in the s-channel and decaying to two muons, as shown by the Feynman diagram in figure , and is generated using MadGraph5_aMC@NLO [ ].
The incoming b-quarks are chosen to be massless and are included in the parton distribution functions (PDFs) through the five-flavour scheme (5FS). The Wilson coefficients of the operators that modify the interaction of the ALP with fermions and gluons are chosen to be c_G = c_aφ = 10⁻⁵, while the other coefficients of the model are set to zero. The small effective coupling ensures that the ALP is long-lived and thus decays at a displaced vertex. The partial widths of the ALP into its different decay channels are calculated automatically by MadGraph, and we do not impose any generation-level cuts.
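The small coupling translates into a long lifetime through the total width Γ computed by MadGraph. A quick conversion to the proper decay length (the width value below is purely illustrative, not the one of our benchmark point):

```python
HBAR_C_GEV_M = 1.973269804e-16  # hbar * c in GeV * m

def ctau_m(total_width_gev):
    """Proper decay length c*tau in metres for a particle of total width Gamma (GeV)."""
    return HBAR_C_GEV_M / total_width_gev

# An illustrative width of 1e-15 GeV gives c*tau ~ 0.2 m, i.e. decays
# well inside the tracker volume, producing displaced vertices.
print(f"{ctau_m(1e-15):.3f} m")  # 0.197 m
```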

Proton collisions
The pp collisions are simulated at CM energies of √s = 13 TeV and √s = 13.6 TeV for the Run 2 and Run 3 samples, respectively, using a PDF from the LHAPDF [ ] set containing the NNPDF next-to-leading-order (NLO) global fit with α_s(m_Z) = 0.119 [ ].
Heavy ion
The heavy ion collisions are simulated at CM energies per nucleon of √s = 5.02 TeV and √s = 5.5 TeV for the Run 2 and Run 3 samples, respectively, using a PDF from the LHAPDF set encoding the EPPS nuclear PDF, based on the CT proton PDF at NLO with running α_s, for ²⁰⁸₈₂Pb [ , ]. Since MadGraph simulates the cross section for a single nucleon, the resulting cross section has to be scaled up by a factor of A², e.g. 208² in the case of lead.
Background simulation
Background events are simulated using Pythia [ ] by generating b-jets via QCD in gg and qq̄ initial states above a minimum p_T threshold. This phase space corresponds to a p_T requirement on the generated B-mesons. We only consider background events that contain at least one b-quark and one anti-b-quark with final-state muons. The background generation is further optimised by imposing additional generation-level cuts:
• An upper limit on the vertex radius of the dimuon system, r_v < 2.5 mm
• A minimum threshold on the muons' transverse momenta, p_T(µ) > 2 GeV
• A lower limit on the invariant mass of the dimuon system, m_µµ > 0.5 GeV
• A lower limit on the muons' transverse impact parameter, |d_0^µ| > 0.5 mm
The last cut reflects that we are primarily interested in a background environment characterised by low p_T and high displacement, and is inspired by the selection of [ ], which was optimised for the even harsher pileup conditions expected at the HL-LHC.
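The generation-level cuts listed above can be sketched as a simple filter; the event representation (plain tuples) is hypothetical, while the cut values are those of the text:

```python
def passes_generation_cuts(rv_mm, mumu_mass_gev, muons):
    """Generation-level background filter.
    rv_mm: radius of the dimuon vertex in mm
    mumu_mass_gev: invariant mass of the dimuon system in GeV
    muons: the two muons as (pt_gev, abs_d0_mm) tuples
    (the d0 cut is applied to both muons here)."""
    return (
        rv_mm < 2.5
        and mumu_mass_gev > 0.5
        and all(pt > 2.0 for pt, _ in muons)
        and all(d0 > 0.5 for _, d0 in muons)
    )

# A soft but displaced dimuon passes; a prompt one fails.
print(passes_generation_cuts(1.0, 1.2, [(3.0, 0.8), (2.5, 1.1)]))  # True
print(passes_generation_cuts(1.0, 1.2, [(3.0, 0.1), (2.5, 1.1)]))  # False
```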
Parking
For the parking dataset, we generate gg/qq̄ → bb̄ events in Pythia, requiring at least one muon or anti-muon per generated event. The events are then passed through our analysis code, which removes ∼50% of the generated events, and the cross section obtained from Pythia is scaled accordingly. The resulting cross section can be used to translate the number of events into an equivalent integrated luminosity: the 10¹⁰ b-parking events stored during Run 2 [ ] correspond to an equivalent luminosity of 48.8 fb⁻¹, see also table .
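The conversion from a number of parked events to an equivalent integrated luminosity follows from L_eq = N / (σ · ε), where ε is the fraction of generated events surviving the analysis filter. A sketch with placeholder numbers (the cross section below is hypothetical, not the Pythia value used in the study):

```python
def equivalent_luminosity_fb(n_events, sigma_fb, efficiency=1.0):
    """Integrated luminosity (fb^-1) to which n_events correspond, given an
    effective cross section sigma_fb (fb) and a selection efficiency."""
    return n_events / (sigma_fb * efficiency)

# Hypothetical effective bb(->mu) cross section; with ~1e10 parked events and
# a ~50% filter efficiency the result is of the same order as quoted in the text.
sigma_bb_mu_fb = 4.0e8
print(f"{equivalent_luminosity_fb(1.0e10, sigma_bb_mu_fb, 0.5):.1f} fb^-1")  # 50.0 fb^-1
```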
Heavy ion
For the heavy ion background simulation, we use the Angantyr model integrated into Pythia [ ]. Extrapolating the dynamics of pp collisions to those of nuclei, this model builds up complete hadronic final states in high-energy nuclear collisions.

Detector simulation
The LHC detector effects are simulated with Delphes [ ] using a modified Delphes card based on the standard CMS card provided with the software package. For the purposes of this study, the differences between the ATLAS and CMS detectors are not considered essential; we therefore generalise our conclusions to both multi-purpose LHC detectors.
Delphes allows the effect of pileup to be simulated, and the user can control, via the Delphes cards that define the parameters of the detector simulation, the emulation of detector nuisances such as inefficiencies, misidentification, and loss of precision; these are adapted to the different scenarios considered in this comparative study.

Pileup
The pileup events are added to the hard events during the detector simulation in Delphes using the built-in pileup card, which relies on Pythia's ability to generate soft QCD events. Up to some rare SM processes, these events represent the total cross section at a hadron collider. Out of a reservoir of 5 × 10⁴ minimum bias events, we add for each hard event an average of 35 pileup events for the Run 2 simulations and 70 pileup events for the Run 3 simulations.
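The mixing procedure amounts to overlaying, for each hard-scatter event, a Poisson-distributed number of soft events drawn from the pregenerated reservoir. A minimal sketch (the event objects are placeholder strings, not Delphes data structures):

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method for Poisson-distributed counts."""
    limit, product, n = math.exp(-mean), 1.0, 0
    while True:
        product *= rng.random()
        if product <= limit:
            return n
        n += 1

def mix_pileup(hard_event, reservoir, mean_pileup, rng):
    """Overlay a Poisson-distributed number of minimum bias events,
    drawn with replacement from the reservoir, onto one hard event."""
    n_pu = sample_poisson(mean_pileup, rng)
    return [hard_event] + [rng.choice(reservoir) for _ in range(n_pu)]

rng = random.Random(2024)
reservoir = [f"mb_{i}" for i in range(50_000)]  # stands in for the 5e4 soft QCD events
event = mix_pileup("hard", reservoir, mean_pileup=35, rng=rng)
print(len(event) - 1)  # number of overlaid pileup collisions, fluctuating around 35
```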
Scouting
As discussed in section , the quality of the event reconstruction in the scouting dataset is reduced compared to the usual pp runs. Therefore, for the simulation of the scouting data, we have degraded the momentum resolution of muons according to the formula given in appendix B, which is inspired by the degradation observed in CMS scouting data with dimuon events [ ]. In that CMS study, the degradation of the m_µµ resolution depended significantly on the p_T and |η| of the two muons. The p_T resolution of muons with p_T < 50 GeV takes one value in the central barrel region of the detector, defined by |η| < 0.9, and a worse one in the end caps of the muon system, defined by |η| > 1.2; a smooth interpolation between the two values has been implemented in our simulation. A cut of |η| < 1.9 was applied in reference [ ] because higher |η| values lead to a much less pure sample of muon candidates, and there is not enough information at the scouting level to clean the dataset; consequently, we apply the same cut in our analysis. Although these details are taken from a CMS publication, we assume that similar constraints would apply in any LHC analysis. Finally, we also impose a threshold of p_T > 3 GeV at the analysis level to account for the geometrical acceptance of the detector, which is discussed in detail in section .
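The η-dependent degradation can be modelled by a resolution function that is flat in the barrel and in the end caps and interpolates linearly in between; the 1% and 2% values below are placeholders for the detector-specific numbers, not the ones used in the study:

```python
import random

def scouting_pt_resolution(abs_eta, res_barrel=0.01, res_endcap=0.02):
    """Relative muon pT resolution: flat below |eta| = 0.9 and above 1.2,
    linear interpolation in the transition region between the two."""
    if abs_eta < 0.9:
        return res_barrel
    if abs_eta > 1.2:
        return res_endcap
    frac = (abs_eta - 0.9) / (1.2 - 0.9)
    return res_barrel + frac * (res_endcap - res_barrel)

def smear_pt(pt, abs_eta, rng):
    """Apply Gaussian smearing with the eta-dependent relative resolution."""
    return rng.gauss(pt, pt * scouting_pt_resolution(abs_eta))

rng = random.Random(1)
print(round(scouting_pt_resolution(1.05), 4))  # 0.015, halfway through the transition
```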
Parking
For the parking dataset, we use the track smearing module of Delphes to calculate the uncertainty on the track impact parameter. The latter is used to define the impact parameter significance (IPS), which is utilised in the parking analysis, as described below.

Analysis
The baseline output of Delphes allows identifying only the primary vertex of a given event.
Since we are primarily interested in long-lived topologies and displaced vertices, we have implemented a vertexing algorithm for reconstructing the dimuon system arising from the LLP decay. This algorithm takes as input only the tracks identified as muon candidates by the dedicated identification module in Delphes, which emulates both inefficiencies and fakes. It first identifies potential dimuon system candidates and then imposes the p_T trigger thresholds collected in table . A further refinement is attained by tightening the selection cuts used in the background simulation to r_v < 2 mm and |d_0^µ| > 1 mm. Additional requirements are needed to simulate the scouting and parking datasets.
Minimum bias events are defined experimentally as the most inclusive data the experiment can trigger on. In Delphes, they are defined by a set of Pythia generator cards that comprise inelastic pp collisions with a large cross section [ , ].

Scouting Given that higher values of the pseudorapidity |η| yield far fewer clean muon candidates, rejecting forward candidates is necessary for a clean reconstruction of the dimuon system, as mentioned in section . We therefore impose a cut of |η µ | < 1.9 on both muon tracks, as in reference [ ].
Parking For the simulation of the parking sample, and in line with the CMS level-muon trigger logic [ ], we impose a similar requirement, namely a cut of |η µ | < 1.5 on at least one muon candidate. To further improve the trigger purity, at least one muon candidate is required to pass a threshold on the track IPS, i.e. |d 0 /d err 0 | > 6 [ ], where d err 0 is the error associated with the d 0 measurement, as calculated by Delphes.
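The parking trigger emulation can be sketched as follows; the dictionary layout of the muon candidates is again an assumption for illustration:

```python
def passes_parking_trigger(mu1, mu2, ips_min=6.0, eta_max=1.5):
    """Sketch of the parking requirements described in the text:
    at least one muon within |eta| < 1.5 and at least one muon with
    impact-parameter significance |d0/d0_err| > 6.  Muons are dicts
    with 'eta', 'd0' [mm] and 'd0_err' [mm] keys (assumed layout)."""
    def ips(m):
        return abs(m["d0"] / m["d0_err"])
    central = any(abs(m["eta"]) < eta_max for m in (mu1, mu2))
    displaced = any(ips(m) > ips_min for m in (mu1, mu2))
    return central and displaced
```

Note that the two conditions need not be satisfied by the same muon in this sketch; whether the experiment requires that is a detail of the actual trigger logic.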
During the analysis, the two muon tracks of the dimuon system are sorted according to their displacement in terms of the transverse impact parameter |d µ 0 |, and the dimuon candidate containing the most displaced track is selected. This strategy is motivated by our interest in topologies with highly displaced muon tracks.

Trigger scenarios
The analysis was performed for three trigger scenarios: the single muon, dimuon, and hybrid trigger scenarios. The single muon and hybrid trigger scenarios both require a single muon candidate passing the single muon p T threshold. The difference between these triggers is that the single muon trigger requires only one muon to pass the displacement cut on the transverse impact parameter, i.e. |d 0 | > 1 mm, whereas the hybrid trigger imposes this requirement on both muon candidates. The dimuon trigger scenario requires both muon candidates to pass the dimuon p T threshold and the displacement cut mentioned above. We have checked that the number of events with more than one dimuon candidate is negligible and have therefore not specified which potential dimuon candidate must be taken in such a case. The two trigger thresholds for the different datasets are collected in table , and the three trigger scenarios are summarised in table .
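The three trigger scenarios can be summarised in a short selection function; the dictionary layout of the muon candidates is an assumption for illustration:

```python
def passes_trigger(mu1, mu2, scenario, single_mu_pt, dimuon_pt, d0_min=1.0):
    """Sketch of the three trigger scenarios described in the text.
    'single': one muon above the single muon pT threshold, at least one
              displaced track (|d0| > 1 mm);
    'hybrid': one muon above the single muon pT threshold, both tracks
              displaced;
    'dimuon': both muons above the dimuon pT threshold, both tracks
              displaced.  Muons are dicts with 'pt' [GeV] and 'd0' [mm]."""
    muons = (mu1, mu2)
    n_displaced = sum(abs(m["d0"]) > d0_min for m in muons)
    if scenario == "single":
        return any(m["pt"] > single_mu_pt for m in muons) and n_displaced >= 1
    if scenario == "hybrid":
        return any(m["pt"] > single_mu_pt for m in muons) and n_displaced == 2
    if scenario == "dimuon":
        return all(m["pt"] > dimuon_pt for m in muons) and n_displaced == 2
    raise ValueError(f"unknown trigger scenario: {scenario}")
```

The actual thresholds `single_mu_pt` and `dimuon_pt` depend on the dataset and are those collected in the corresponding table.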
We conclude this section by presenting the signal distributions of the dimuon mass and the track impact parameter for nine different mass points in the scouting dataset, using the hybrid trigger thresholds, in figures a and a, respectively. These plots serve as an assessment of the efficiency of our vertexing algorithm. The background distributions of the same two observables for all the datasets, using the hybrid trigger thresholds, are given in figures b and b. In the background distributions, in line with our expectation, the 'onia' peak is apparent at around GeV, corresponding to the J/ψ meson resonances.

Results and discussion
Since we do not intend to investigate a particular new physics model, but rather to comment on the comparative potential to discover new physics signals across the different datasets collected at the LHC, we refrain from giving an absolute unit for the significance. The relevant information encoded in our results is the relative strength for discovering a displaced low-p T signature, compared between the different datasets and trigger scenarios. The signal significance is defined as Z = s/ √ s + b, where s and b are the numbers of signal and background events predicted in the dimuon mass window, respectively. They are obtained from the MC samples as s = f s s MC and b = f b b MC ,

                  Single muon   Hybrid        Dimuon
p T threshold     Single muon   Single muon   Dimuon
Displaced muons   1             2             2

Table : Summary of the p T threshold imposed on the muon candidates as well as the number of displaced muons required in the three trigger scenarios used in this analysis. The single muon and dimuon p T thresholds for each of the datasets are given in table . Since only the dimuon trigger applies to the scouting and heavy ion datasets, we omit these datasets here.
where s MC and b MC are the numbers of signal and background events counted in the corresponding MC samples that fall within the dimuon mass window. The scale factors f s and f b are defined by the luminosity ratios L data /L MC . The MC luminosity is obtained by normalising the number of generated MC events to the process cross section; for example, L bkg MC = N bkg gen /σ bkg , where N bkg gen is the number of generated MC background events, chosen in our case to be 5 × 10 5 , and σ bkg is the cross section calculated by Pythia. The data luminosity L data is the fixed integrated luminosity, given in table for each type of dataset. Although negligible, the error on Z due to the finite size of the MC samples is propagated and reported in the plots, where the uncertainties on the numbers of signal and background events, δs and δb, are taken to be the Poisson errors √ s MC and √ b MC , respectively.
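Under these definitions, the significance and its MC statistical uncertainty can be computed as in the following sketch, which applies standard Gaussian error propagation to Z = s/√(s + b):

```python
from math import sqrt

def significance(s_mc, b_mc, lumi_data, lumi_s_mc, lumi_b_mc):
    """MC counts are scaled by f = L_data / L_MC, the significance is
    Z = s / sqrt(s + b), and the MC statistical uncertainty is
    propagated assuming Poisson errors sqrt(s_MC) and sqrt(b_MC)."""
    f_s = lumi_data / lumi_s_mc
    f_b = lumi_data / lumi_b_mc
    s, b = f_s * s_mc, f_b * b_mc
    z = s / sqrt(s + b)
    # error propagation: dZ/ds = (s + 2b) / (2 (s+b)^{3/2}),
    #                    dZ/db = -s / (2 (s+b)^{3/2})
    dz_ds = (s + 2 * b) / (2 * (s + b) ** 1.5)
    dz_db = -s / (2 * (s + b) ** 1.5)
    delta_s = f_s * sqrt(s_mc)   # Poisson MC error, scaled to data lumi
    delta_b = f_b * sqrt(b_mc)
    delta_z = sqrt((dz_ds * delta_s) ** 2 + (dz_db * delta_b) ** 2)
    return z, delta_z
```

The scaling of the Poisson errors by f s and f b in this sketch is our assumption about how the propagation is carried out; the text only specifies the Poisson errors on the raw MC counts.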
We present the comparison between the relevant trigger scenarios for three of the five datasets for Run in figure . We omit the scouting and heavy ion datasets since only the dimuon trigger scenario applies to these datasets. The results for Run are given separately in figure of appendix A.
For m µµ ≥ 7 GeV the single muon trigger scenario leads to the highest significance, followed by the hybrid trigger scenario, which renders the dimuon trigger scenario the least competitive for these masses. For signals with a large invariant mass m µµ , this observation highlights the importance of the high p T threshold that the single muon and hybrid trigger scenarios impose on the muon candidates, compared to the dimuon trigger scenario. On the other hand, for m µµ ≤ 6 GeV the hybrid trigger scenario proves the most competitive, favouring the combination of the high p T threshold with the displacement requirement on both muon tracks.
At this point, it is worth recalling that both the single muon and the hybrid trigger scenarios impose the same p T threshold on the muon candidates. Nevertheless, the former requires one displaced muon track, while the latter requires both tracks to be displaced. This explains why the hybrid trigger scenario is favoured over the single muon trigger scenario in the low-mass region: the resonance decay in the low-mass region is long-lived and induces a large displacement of both tracks. In comparison, the decays in the high-mass region are more prompt, and requiring the displacement of both muon tracks there induces a loss of statistics, since events in which one of the two tracks is produced promptly are discarded.
Having discussed the roles of the different trigger scenarios in reconstructing the dimuon signal, we now present figure to compare the potential of each dataset in accessing such a signal. It is clear from figure that the low-pileup and heavy ion datasets are the least competitive in accessing the dimuon signal. This is indeed expected due to the small amount of collected data compared to the other three datasets. For the dimuon trigger scenario, the scouting dataset partially outperforms the conventional LHC pp dataset. We attribute this to the low p T thresholds of the scouting triggers, together with the large amount of collected data. Additionally, the parking dataset is competitive with the standard pp dataset in the high-mass region.
We conclude this section by summarising in figure all the LHC Run results presented in figures and together with the complete overview of the Run results. A detailed presentation of the Run results is given in appendix A. Here we remark that the different expected scaling for the Run luminosities enables the scouting and parking datasets to outperform the standard pp datasets in the high-mass region.

Summary and conclusion
We have compared different types of datasets collected at the LHC experiments using a simple model featuring a displaced vertex signature. We have shown that the scouting and parking datasets can be competitive with the standard LHC runs for such signatures in the mass region under consideration. In contrast, the low-pileup and heavy ion datasets are limited to considerably smaller significance.
We stress that although these results have been obtained for one particular signature of a specific model, the general conclusions may be generalised to some extent. We caution the reader that many assumptions have been made in this study and that Delphes is not meant to be an accurate detector simulation. Therefore, some subtleties ignored in this paper may alter the relative ranking of some of the datasets considered. However, our results indicate that scouting and parking can be very promising strategies for a large category of signals characterised by low particle masses. It may therefore be worthwhile to allocate significantly larger bandwidth to those datasets than in previous runs. Moreover, our results encourage further exploitation of the unconventional datasets that have already been collected during previous LHC runs. We hope to see new studies that valorise these data, which may also serve as a testing ground for new ideas towards optimising the corresponding triggers for future runs.

Figure : Comparison of the significance to discover the displaced low-p T signal in arbitrary units as a function of the dimuon invariant mass. The comparison is shown for three of the five datasets: parking (a), standard pp (b), and low-pileup (c) for the three different trigger scenarios during Run . Since only the dimuon trigger applies to the scouting and heavy ion datasets, we omit these datasets here.

Acknowledgments
First of all, we thank Marco Drewes for contributing to the initial idea of this paper and for critically accompanying its development. Furthermore, we thank Olivier Mattelaer for the original implementation of, and continuous support for, heavy ion collision simulation in MadGraph. Pavel Demin and Michele Selvaggi's assistance with the Delphes software was pivotal in completing this project. We thank Christian Bierlich for discussions concerning Pythia, particularly on using the Angantyr model. We are grateful for early discussions with Simon Knapen, Steven Lowette, and Hardik Routray. HF thanks Ken Mimasu for the technical discussions on the ALPsEFT model. We received essential clarifications about scouting and parking in CMS from Greg Landsberg and Maurizio Pierini. Finally, we thank the IT team of CP for their assistance with the supercomputing facilities. This work was supported in part by the Walloon Region.

A Run results
This section presents the results for Run of the LHC. In figure , we show the comparison between the trigger scenarios for each of the datasets; in figure , we show the comparison between the datasets for each of the trigger scenarios. The corresponding integrated luminosities of the different types of datasets are given in table .

B Codes and cards
This section presents the code used to generate the events for this analysis.
Signal The ALP signal presented in figure is generated in the s-channel and decayed to two muons using MadGraph:

    define p = g u c d s b u~ c~ d~ s~ b~
    generate p p > ax > mu+ mu-
    set mb 0
    set ymb 0

In order to use the five-flavour scheme, the mass of the bottom quark and its Yukawa coupling are set to zero.