Search for light long-lived neutral particles that decay to collimated pairs of leptons or light hadrons in $pp$ collisions at $\sqrt{s}=13$ TeV with the ATLAS detector

A search for light long-lived neutral particles with masses in the $O$(MeV-GeV) range is presented. The analysis targets the production of long-lived dark photons in the decay of a Higgs boson produced via gluon-gluon fusion or in association with a $W$ boson. Events that contain displaced collimated Standard Model fermions reconstructed in the calorimeter or muon spectrometer are selected in 139 fb$^{-1}$ of $\sqrt{s} = 13$ TeV $pp$ collision data collected by the ATLAS detector at the LHC. Background estimates for contributions from Standard Model processes and instrumental effects are extracted from data. The observed event yields are consistent with the expected background. Exclusion limits are reported on the production cross-section times branching fraction as a function of the mean proper decay length $c\tau$ of the dark photon, or as a function of the dark-photon mass and kinetic mixing parameter that quantifies the coupling between the Standard Model and potential hidden (dark) sectors. A Higgs boson branching fraction above 1% is excluded at 95% CL for a Higgs boson decaying into two dark photons for dark-photon mean proper decay lengths between 10 mm and 250 mm and dark photons with masses between 0.4 GeV and 2 GeV.


Introduction
beam-dump and fixed-target experiments [28][29][30][31][32][33][34][35][36][37][38], + − colliders [39-47], electron and muon anomalous magnetic moment measurements [48][49][50] and astrophysical observations [51,52]. Given the various constraints, a displaced dark-photon decay with a kinetic mixing parameter < 10 −5 is allowed for d masses greater than 10 MeV. In this paper, the sensitivity to displaced DPJs is significantly higher than in the previous ATLAS search [14], due to the higher integrated luminosity of the dataset and to new analysis methods. The new analysis methods include updated signal-region selection criteria and DPJ reconstruction techniques. Multivariate techniques based on convolutional neural networks, exploiting the three-dimensional representations of calorimeter energy deposits associated with jets, and a dense neural network, using muon track information, are employed to indentify DPJ candidates. The addition of a new set of event selection criteria, targeting events where the dark photons arise from the decay of a scalar particle produced in association with a boson, significantly increases the sensitivity in scenarios with d mean proper decay lengths below (above) 3 (40) mm.
designed to target the displaced d decays, described in detail in Section 5.1, or single-lepton triggers, with requirements on the identification, isolation, and T of the leptons to maintain efficiency across the full momentum range while controlling the trigger rates [57,58].
During collisions data-taking the LHC circulates two counter-rotating proton beams constructed from bunches of protons. However, following LHC injection not all bunch slots are filled with protons, with the number of unfilled bunches depending on the LHC filling scheme [59]. An empty bunch crossing takes place when neither beam is filled with protons and each empty bunch is separated from filled bunches by at least five empty bunches on each side. A dataset enriched in cosmic-ray muon background is collected during empty bunch crossings (the 'cosmic dataset') and used for the background estimation.
Monte Carlo (MC) simulated event samples are used to model the BSM signals. Signal samples modelling the production of dark photons via a 125 GeV Higgs portal were generated with M -G 5_ MC@NLO 2.2.3 [60] interfaced to P 8.186 [61] for the parton showering (PS) and hadronisation. The matrix-element calculation was performed at tree level. The parton distribution function (PDF) set used for the generation is NNPDF2.3 [62]. Higgs boson production via gluon-gluon fusion and in association with a boson is included. The predicted Standard Model cross-sections for these two processes, assuming = 125.09 GeV, are respectively 48.61 pb [63,64] and 1.369 pb [65,66]. A second set of signal samples was generated, modelling the production via gluon-gluon fusion of a high-mass (800 GeV) Higgs-like scalar mediator with the same decay modes as in the 125 GeV mass case. Effects of higher-order QCD corrections on the T of the Higgs boson, evaluated using a reweighting procedure [67], change the signal selection efficiency by less than the MC statistical accuracy and are therefore neglected. A dark photon with a mass d up to a few GeV that mixes kinetically with the SM photon will decay into leptons or light quarks, with branching fractions that depend on its mass [6,8,9]. Taking as an example a dark-photon mass of 0.4 GeV, the d decay branching ratios are expected to be 45% + − , 45% + − , and 10%¯ [6]. The mean proper decay length of the d is treated as a free parameter. In the generated samples, was chosen such that, accounting for the boost of the d , a large fraction of the decays occur inside the sensitive ATLAS detector volume (i.e. up to 7 m in radius and 13 m along the -axis from the centre of the detector). The decays of the Higgs boson into dark photons through dark fermions or directly into two dark photons were simulated at matrix-element level during the generation. In the FRVZ model, the mass of d was chosen to be small relative to the Higgs boson mass, and far from the kinematic threshold at HLSP + d = d . The values of these masses have a negligible impact on the analysis results. In the HAHM, the Higgs boson decays directly into two dark photons.
The SM background is estimated using data-driven techniques, with MC simulated events used to aid in the validation, the evaluation of uncertainties, and the training of dedicated multivariate classifiers. SM processes that could be potential sources of background include multi-jet, W+jets, Z+jets,¯, single-topquark, WW, WZ, and ZZ production. The multi-jet samples were generated with P 8.210 [68] using the same set of tuned parameter values (tune) and PDF as for the signal samples. Samples of W+jets, Z+jets, WW, WZ, and ZZ events were generated using S 2.2.1 [69] with the NNPDF3.0 [70] PDF set. Single-top and¯MC samples were generated using P B v2 [71] and P 8.230 with the A14 tune [72] for parton showering and hadronisation, and the NNPDF2.3 set of PDFs.
Finally, MC samples of / → events were generated and used to evaluate systematic uncertainties for muon trigger and reconstruction efficiencies. The MC samples were generated using P -8+P ++ [73] with the A14 tune for parton showering and hadronisation, and the CTEQ6L1 [74,75] PDF set.
All generated Monte Carlo events were processed through a full simulation of the ATLAS detector geometry and response [76] using the G 4 [77] toolkit. The simulation includes multiple interactions per bunch crossing (pile-up), as well as the detector response to interactions in bunch crossings before and after the one producing the hard interaction. In order to model the effects of pile-up, simulated inclusive events were overlaid on each generated hard-scatter event and reweighted to match the conditions of the 2015-2018 data sample. The inclusive events were simulated with P 8.210 with the A3 tune [78] and the NNPDF2.3 set of PDFs.

Event reconstruction
Candidate events are required to have a reconstructed vertex [79] with at least two associated tracks with transverse momentum ( T ) larger than 500 MeV that are consistent with originating from the beam collision region in the -plane. The primary vertex in the event is the vertex with the highest sum of squared transverse momenta of associated tracks.
Electron candidates are reconstructed from isolated electromagnetic calorimeter energy deposits matched to ID tracks. They are required to have | | < 2.47, a transverse momentum T > 20 GeV, and to satisfy the 'TightLHElectron' requirement defined in Ref. [80], which is based on a likelihood evaluated using measurements of shower shapes in the calorimeter and track properties in the ID as input variables. Candidates within the transition region between the barrel and endcap electromagnetic calorimeters, 1.37 < | | < 1.52, are not considered.
Muon candidates are reconstructed in the region | | < 2.5 from MS tracks matching ID tracks. They are required to have T > 20 GeV and satisfy the 'medium' identification requirements defined in Ref. [81]. These requirements are based on the number of hits in the different ID and MS subsystems, and on the ratio of the charge and momentum ( / ) measured in the ID and MS divided by the sum in quadrature of their corresponding uncertainties.
Isolation criteria are applied to electrons and muons. The scalar sum of the T of ID tracks within a variable-size cone around the electron (muon), must be less than 15% of the lepton T , excluding tracks associated with the lepton. The track isolation cone size for electrons (muons) Δ = √︁ (Δ ) 2 + (Δ ) 2 is given by the smaller of Δ = 10 GeV/ T and Δ = 0.2 (0.3). In addition, the sum of the transverse energy of clusters of calorimeter cells [82] in a cone of Δ = 0.2 around the electron (muon) must be less than 20% (30%) of the lepton T , excluding clusters associated with the lepton.
Jets are reconstructed from three-dimensional energy clusters in the calorimeter [83] using the anti-jet clustering algorithm [84,85] with a radius parameter = 0.4. Only jet candidates with T > 20 GeV and | | < 4.9 are considered. Jets are calibrated using MC simulation, with corrections obtained from in situ techniques [86]. To reduce the effects of pile-up, jets with T < 120 GeV and | | < 2.5 are required to have a significant fraction of their associated tracks compatible with originating from the primary vertex, as defined by the jet vertex tagger [87] (JVT). This requirement reduces the fraction of jets from pile-up to 1%, with an efficiency for hard-scatter jets of about 90%. Jets not satisfying basic quality criteria designed to reject detector noise and non-collision backgrounds [88] are then discarded.
Jets containing -hadrons ( -tagging) are identified with a multivariate discriminant that makes use of track properties [89,90]. Jets are considered to be -tagged if they fulfil a requirement that has 70% average efficiency for jets containing -hadrons in simulated¯events. The rejection factors for light-quark and gluon jets, jets containing -hadrons, and hadronically decaying -leptons in simulated¯events are approximately 301, 38, and 8, respectively.
Simulated events are corrected for differences from collision data in -tagging efficiencies and -tagging mis-tag rates [90][91][92]. Corrections are also applied to account for minor differences between data and MC simulation in the single-lepton trigger, reconstruction, identification and isolation efficiencies [93,94].
Jet candidates within an angular distance Δ = √︁ (Δ ) 2 + (Δ ) 2 = 0.2 of a lepton candidate are discarded. Remaining lepton candidates within Δ = min{0.4, 0.04 + T ( )/(10 GeV)} of a jet are then discarded to suppress bottom and charm hadron decays. When considering muons, if the jet has fewer than three associated tracks, the muon is retained and the jet is discarded instead to avoid inefficiencies for high-energy muons undergoing significant energy loss in the calorimeter. Finally, any electron candidate sharing an ID track with a remaining muon candidate is also removed.
The missing transverse momentum vector p miss T , whose magnitude is denoted by miss T , is defined as the negative vector sum of the transverse momenta of all identified electrons, muons and jets, plus an additional soft term. The soft term is constructed from all tracks that originate from the primary vertex but are not associated with any identified lepton or jet. In this way, the miss T is adjusted for the best calibration of leptons and jets, while contributions from pile-up interactions are suppressed through their exclusion from the soft term [80,81].
A displaced d decay producing a pair of muons is expected to leave two or more collimated stand-alone MS tracks, referred to as a muonic dark-photon jet DPJ. Stand-alone MS tracks [95] are reconstructed in the pseudorapidity region | | ≥ 0.1 and formed by requiring at least two matched segments in the MS, and are fit with a primary vertex constraint. Candidates with pseudorapidity in the range 1.0 ≤ | | ≤ 1.1 are rejected to avoid the transition region of the MS between barrel and endcap. Moreover, only stand-alone MS tracks in the pseudorapidity interval | | < 2.4, corresponding to the ID coverage, are selected to allow an isolation variable based on ID tracks to be computed. Stand-alone MS tracks are required to not match any prompt muon candidate, in order to discard muons coming from the main interaction vertex.
A d decaying into a displaced electron or quark pair leads to energy deposits in the calorimeters reconstructed as a single jet with low EM fraction (EMF), defined as the ratio of the energy deposited in the EM calorimeter to the total jet energy. Jets with EMF below 0.4 are referred to as calorimeter dark-photon jet caloDPJ. MC simulations show that DPJs containing two dark photons both decaying into an electron or quark pair are reconstructed as a single jet. Low-EM-fraction jets are reconstructed and calibrated with the same algorithms as described previously. They are, however, only considered if they have T > 20 GeV and lie within | | < 2.5. Furthermore, to retain high efficiency for the targeted signals, they are required to satisfy quality criteria looser than those in the main jet selection and no JVT selection is applied. To avoid selecting events where most of the energy associated with a jet could have been produced by localised noise, events are rejected if the leading jet has >90% of its energy associated with a single constituent cluster or layer within the LAr calorimeter. Potential background from noise bursts [96] in the LAr calorimeter is rejected via a veto on events where the leading jet's largest energy deposit is located in the EM calorimeter endcap, where most noise bursts occur. Noise-induced jets in the hadron calorimeters are removed by imposing the BadLoose cleaning selection, described in Ref. [88], without the cuts on the fraction of jet energy deposited within the electromagnetic calorimeter and the jet charged-fraction. These cleaning requirements reject approximately 0.8% of low-EM-fraction jets in the signal samples used in this analysis. Finally, the time caloDPJ associated with a caloDPJ, measured as the energy-weighted average of the timing for each calorimeter cell related to the jet and corrected by the corresponding time-of-flight from the interaction point, is required to fall within a window of 4 ns around zero. CaloDPJs from cosmic-ray muons and beam-induced backgrounds (BIB) would have a different jet-timing distribution than collision products.
Dedicated algorithms for the identification of DPJs have been developed to target d decays into DPJs or caloDPJs. These algorithms are described in Sections 4.1 and 4.2 respectively.

Muonic dark-photon jets
Muonic dark-photon jets are reconstructed using the Cambridge-Aachen clustering algorithm [97] that combines all the selected stand-alone MS tracks lying within a cone of fixed size in ( , ) space. The algorithm starts from the highest-T stand-alone MS track and searches for additional stand-alone MS tracks within a Δ = 0.4 cone around the initial track's momentum vector. If a second stand-alone MS track is found in the cone, the axis of the cone is rotated to the vector sum of the momenta of the two tracks, and the search is repeated until no additional tracks are found in the cone. The DPJs are required to have at least two MS tracks and are discarded if a jet is found within Δ = 0.4 of a DPJ, to ensure orthogonality between reconstructed DPJ types.
Cosmic-ray muons that cross the detector in time coincidence with a interaction constitute the main source of background to the DPJ. The cosmic dataset is used to study this background and to train a dense neural network (DNN), referred to as the cosmic-ray tagger, to discriminate signal DPJs from the DPJ candidates that originate from the cosmic-ray background. For optimal training, a balanced mixture of all available MC signal samples is used. The DNN is implemented using Keras with the Tensorflow backend [98] and it is trained to classify each stand-alone MS track constituting a DPJ using the following quantities: the longitudinal impact parameter 0 , the track angular direction (in and ), and the timing measurements from the MS. The time measurements in different stations of the MS, when available, allow the muon's direction of flight to be identified. A parametric training method is used to only consider the timing measurements when they are available. The neural network has three dense hidden layers, with 32, 64 and 128 neurons respectively. An output layer with a sigmoidal activation function returns a binary classification score between 0 and 1, which can be interpreted as the probability for a DPJ constituent track to originate from a d decay, and example distributions are shown in Figure 2. The figure also illustrates the small dependency on the specific choice of signal model. A DPJ is accepted if all its stand-alone MS track constituents have a cosmic-ray tagger output score > 0.5. This selection was optimised to retain a high signal efficiency: signal DPJs are selected with an efficiency above 95% for transverse decay lengths up to 5 m and for d T larger than 20 GeV and with a background rejection of 90%.
The DPJ reconstruction efficiency is shown in Figure 3, for DPJ objects that satisfy the cosmic-ray tagger selection, as a function of the and transverse momentum of the dark photon in a few benchmark signal scenarios. The efficiency mainly depends on the opening angle between the muons from the d decay, thus models with a small generated and a large dark-photon boost will be more efficiently reconstructed. A drop in efficiency is expected for d decays which occur after the middle layer of the MS (6 m radius in the barrel region), where muons can no longer be reconstructed.

Calorimeter dark-photon jets
All the low-EM-fraction jets satisfying the selection described in Section 4 are considered as calorimeter dark-photon-jet candidates. Candidates in the transition region between the barrel calorimeters and the endcap cryostat are removed by requiring the fraction of energy in the Tile Gap scintillators to be less than 10% of the total jet energy.
The main source of background for caloDPJs is jet production in multi-jet events. In order to reduce this background, a dedicated discriminator (QCD tagger), based on a convolutional neural network implemented using Keras with the Tensorflow backend, is used to assign a score to each caloDPJ in the event. The training of the neural network is based on simulated events and exploits caloDPJs reconstructed in signal and multi-jet MC events. The QCD tagger inputs are three-dimensional representations of energy deposits associated with the jet. The energy deposits are defined by collections of calorimeter cell clusters used in jet reconstruction [82]. Each collection has ( , ) coordinates and holds information about the total amount of energy deposited in each of the calorimeter samplings.
The ( , ) space around the jet axis is divided into a 15 × 15 grid, centred on the jet axis and corresponding to an × = 0.9 × 0.9 area, so that each cell cluster within this 2D grid corresponds to a 0.06 × 0.06 portion of the ( , ) space. A third axis is added to this grid, to take into account the EM and HCAL calorimeters' samplings as an additional coordinate. The resulting 3D grids are composed of cell clusters which contain the total energy released at the corresponding ( , ) coordinates and calorimeter sampling. In order to exploit the full calorimeter volume, three 3D grids are produced: one accounting for the barrel samplings, one for the tile calorimeter's extended barrel and one for the endcap. These are then used as input to a convolutional neural network. The output layer of the neural network has a sigmoidal activation function resulting in a binary classification in the [0, 1] range, as shown in Figure 4(a).
CaloDPJs are defined to be candidates if they have a QCD tagger score larger than 0.5. This selection was optimised to retain a high signal efficiency and corresponds to a background rejection of approximately 94%. The QCD tagger has a signal efficiency above 95% for above 2.5 m, which decreases to about 60% for d decaying within the ID. The efficiency was also found to be independent of the d T for a T larger than 20 GeV.
Muons arising from BIB and crossing the detector longitudinally, at radial distances > 2 m, can deposit energy in the HCAL by radiative losses, which can be reconstructed as caloDPJs owing to the resulting low EM fraction. To reduce the residual contamination from misidentified caloDPJs from BIB, a dedicated per-jet tagger (BIB tagger) was developed. This tagger uses the same strategy as the QCD tagger, exploiting the topology of the energy deposits to classify the candidate jets. The BIB tagger shares the network architecture of the QCD tagger. To remove any possible energy dependence and rely only on geometrical information, the input tensors are preprocessed by substituting the cluster energies with a fixed value. The jet ( , ) coordinates are passed as inputs to the dense neural network, in addition to the convolution layers' output. CaloDPJs from a BIB-enriched data selection and from all available MC signal samples are used as input for the training. The BIB-enriched data selection uses a dedicated calorimeter trigger exploiting cell timing and position to select BIB events. This trigger's performance was validated on BIB events occurring during empty bunch crossings. CaloDPJs are accepted if the output score of the BIB tagger, shown in Figure 4(b), is larger than 0.2, corresponding to a BIB rejection rate of 68%. This value is chosen in order to keep the signal efficiency above 80%, irrespective of the signal scenario, for events entering the analysis selections.
The caloDPJ reconstruction efficiency is shown in Figure 5 for caloDPJ candidates that satisfy the QCD tagger selection but no BIB tagger requirement, as a function of the transverse decay length and transverse momentum, for several benchmark signal samples. A d decaying into hadrons or electrons is reconstructed only within the HCAL volume (at radii of 2.28-4.25 m in the barrel region).

Event selection and background estimation
The events are classified into the two exclusive search categories, gluon-gluon fusion (ggF) and WH associated production, based on their charged-lepton multiplicity. The ggF category targets the Higgs boson production mode where events have no charged-lepton candidates satisfying the selections described in Section 4. The WH category targets Higgs boson production in association with a boson and requires exactly one charged lepton. At least one DPJ satisfying the selection criteria described in Section 4 is required in the selected events. If more than two DPJs are reconstructed, the one with the highest transverse momentum, called the 'leading DPJ', and the one farthest in Δ from the leading one, called the 'far DPJ', are used to classify the event. Each search category further separates the events into different orthogonal search channels based on the numbers of DPJs and caloDPJs, resulting in a total of six signal regions (SRs) that were optimised for the best discovery sensitivity and are described in detail in Section 5.1 and Section 5.2.
The main sources of background for the DPJ signals are cosmic-ray muons for the DPJs and jet production for both the caloDPJs and DPJs. Cosmic-ray muon background for the DPJs is strongly reduced after the event selection, and the main residual background is found to consist of hadronic or electromagnetic showers which reach the MS after not being fully contained in the calorimeter (punch-through jets) and originate from multi-jet events. A subdominant background for both DPJ types is BIB. All background contributions are estimated from data. For the different decay modes considered, dedicated sets of the following discriminating variables are used to separate the BSM signal from the backgrounds or in the definition of regions used to aid the estimation of the background contributions: • jj : the invariant mass of the two leading jets.
• |Δ caloDPJs |: the absolute time difference between a pair of caloDPJs. This quantity is useful for further rejecting contributions from cosmic-ray muons and BIB, as these caloDPJ candidates do not originate from a single interaction vertex.
• JVT: the JVT score of a caloDPJ can be used to reject candidates that are likely to originate from the primary proton-proton interaction vertex.
• Δ DPJ : the azimuthal angle between the leading DPJ and far DPJ. Signal events are expected to contain anti-aligned DPJs.
• Δ =0.5 T : the scalar sum of the transverse momenta of all tracks within a Δ = 0.5 cone around the direction of the DPJ momentum vector. Displaced DPJs are expected to have very little nearby track activity in the ID.
• max( T ): the largest of the Δ =0.5 T values for the two DPJs in an event. • QCD tagger: the product of the QCD tagger scores of the two DPJs, or the single QCD tagger score when only one caloDPJ is available.
• T : the transverse mass in events, computed as T = where Δ is the azimuthal angle between the missing transverse momentum vector and the lepton, is exploited to reduce possible background contributions from multi-jet processes with a misidentified or non-prompt lepton.
• caloDPJ width ( ): the T -weighted sum of the Δ between each energy cluster of the jet and the jet axis. Jets from DPJs are expected to be narrower, on average, than ordinary jets since they are produced just before or inside the calorimeters.
• min(Δ ): smallest azimuthal angular separation between a selected DPJ and the p miss T vector.
• min(QCD tagger): the minimum QCD tagger score, computed for up to two caloDPJs in an event.
The detailed selection requirements, together with the dedicated methods for the estimation of the residual background contributions, are described in Section 5.1 for the ggF selection and in Section 5.2 for the WH selection.

Gluon-gluon fusion selection
In this selection, events are accepted if they satisfy one or more of three dedicated triggers targeting displaced objects. The trigger strategy is analogous to that described in Ref. [14]. It comprises two MS-based triggers and one calorimeter-based trigger.
The first MS-based trigger, referred to as tri-muon MS-only [57], accepts events with at least three muons with T ≥ 6 GeV reconstructed using only MS information. The second MS-based trigger, referred to as muon narrow-scan [99], requires a muon candidate from the L1 trigger with T ≥ 20 GeV to be confirmed by the HLT using only MS information. At the HLT a 'scan' is then performed in a cone of Δ = 0.5 around this muon, looking for a second muon reconstructed using only MS information. The T requirement on the second muon was increased from 6 GeV to 15 GeV during the course of the 2015-2016 data taking and kept constant afterwards. Both muons were required to be not matched to any track in the ID, and track isolation was required for the leading muon. During the 2017-2018 data taking, the narrow-scan trigger was extended to take advantage of the L1 topological trigger [100] by requiring partially matched topological items. The leading L1 muon candidate was then combined with an HLT object: either an unmatched muon, a jet, or transverse momentum imbalance. The last two were added to target events with one DPJ and one caloDPJ. The narrow-scan triggers have efficiencies of 75% and 40%, independent of the dark photon mass, for the 2015-2016 and 2017-2018 versions, respectively, when considering events with d decays into muon pairs with | | < 1.0 and below 6 m.
A L1 calorimeter trigger called the CalRatio trigger [99] is designed to select narrow jets produced in the hadronic calorimeter. During 2015-2016, the CalRatio L1 trigger selected narrow jets with transverse energy T > 60 GeV in a 0.2 × 0.2 (Δ × Δ ) region of the EM and HCAL. During 2017-2018 data taking, the CalRatio trigger instead used a L1 topological trigger [100] to select jets in the HCAL that have T > 30 GeV and are isolated from energy deposits in the EM calorimeter with T > 3 GeV within Δ = 0.2 around the most energetic HCAL energy deposit. At the HLT, jets from both of these L1 selections were required to have T ≥ 30 GeV, | | < 2.4 and EM fraction < 0.06. These jets were also required to have no tracks with T ≥ 2 GeV within Δ = 0.2 of the jet axis, and to satisfy BIB-suppression requirements on calorimeter cell timing and position. The CalRatio triggers used in 2015-2016 and 2017-2018 are found to have constant efficiencies of approximately 5% and 20%, respectively, for FRVZ signal events where the d is produced in the decay of a 125 GeV Higgs boson and decays into + − orw ithin | | < 1.0. These efficiencies rise to 25% and 30%, on average, for signal events in HAHM models, or to 55% and 45%, on average, when assuming d production via an 800 GeV scalar mediator.
Events where the two leading jets have an invariant mass jj > 1 TeV and miss T > 225 GeV are removed to allow for a statistically independent study of vector-boson fusion (VBF) production modes in the future.
The events satisfying the requirements listed above are further separated into orthogonal search channels based on the DPJ types. Events with pairs of DPJs (caloDPJs) are targeted by SR ggF 2 (SR ggF 2c ), while events with one DPJ of each kind are targeted by SR ggF c+ . Each of these SRs includes additional selection requirements, as summarised in Table 1 together with the triggers employed. Events with a single reconstructed DPJ are not considered for the ggF selection as the constraints they provide are not competitive due to the large background yields. The impact of beam-induced background is suppressed by requiring caloDPJ candidates to pass the BIB tagger selection and, only in the SR ggF 2c where two caloDPJs are reconstructed, to have a time difference |Δ caloDPJs | smaller than 2.5 ns. In SR ggF 2c and SR ggF c+ , the leading and far DPJs are further required to have an azimuthal angular difference Δ DPJ larger than 0.5. Multi-jet background events are rejected by requiring the DPJs to satisfy ID isolation criteria. DPJ candidates are required to have a Δ =0.5 T below 4.5 GeV. - The main sources of background after the SR selections are punch-through jets from rare multi-jet events for SR ggF 2 and multi-jet production for SR ggF 2c and SR ggF c+ . The second leading background for all SRs is the cosmic-ray muon background.
In order to estimate the background contribution to each SR, a data-driven 'ABCD' method is used. This method relies on the assumption that the distribution of background events can be factorised in the plane of two uncorrelated variables so that it is subdivided into four regions: A, B, C, and D. The number of background events in the SR can be evaluated as A = D × B / C . Any possible signal leakage outside the SR region is accounted for by using a modified ABCD method that simultaneously fits the signal and background events, taking into account potential signal contamination in the ABCD estimate.
The planes used in the estimation of the backgrounds for the three ggF SRs are formed by max( T ) and either Δ DPJ in SR ggF 2 , or QCD tagger in SR ggF 2c and SR ggF c+ . The definition of the three sets of ABCD control regions are given in Table 2. In the ABCD convention, region A is the SR and regions B, C and D are respectively the control regions CRB, CRC and CRD.
The cosmic-ray muon background contribution is estimated first, using the cosmic dataset. The ratio of the number of filled bunch crossings to the number of empty bunch crossings is used to scale the number of events in the cosmic dataset to that in the collision data. The value of this correction depends on the trigger used in the SR selection and ranges from a minimum of 2.2 for the CalRatio trigger in 2015-2016 to a maximum of 3.9 for the tri-muon MS-only trigger in 2018. Only three regions were found to have a non-negligible cosmic-ray muon background contribution: 7 ± 5 events in SR ggF 2 , 7 ± 4 events in SR ggF 2c and 14 ± 7 events in CRD ggF 2c . Since the events forming the cosmic dataset are collected in empty bunch crossings, there is no significant activity in the ID, which implies that the T is 0 for all reconstructed DPJs. For this reason, events populate only the SRs and CRD. These estimated cosmic-ray muon backgrounds are subtracted from the observed data in the main dataset before performing the ABCD procedure.
Events with BIB energy deposits are very likely to have two caloDPJs which can contribute to the background. The possible contribution from misidentified caloDPJs was studied using the BIB-enriched dataset. This selection is thus orthogonal to the signal regions. The efficiency of the SR and CR selection requirements when applied to BIB-enriched events was found to be less than 2.3 · 10 −4 , reducing the BIB contamination to a negligible contribution. The BIB background is hence neglected. Figure 6 shows the distribution of data events in the ABCD plane for the three search channels, together with the expected distribution for a benchmark FRVZ model assuming a 125 GeV Higgs boson and a decay branching fraction to the dark sector of 10%. The acceptance times efficiency for the FRVZ signal processes after applying all selection criteria is 0.74%, 0.02% and 0.15%, in the case of a d mass of 400 MeV and a generated of 50 mm, for the signal regions SR ggF 2 , SR ggF 2c and SR ggF c+ respectively.
The background estimation procedure was validated by applying the ABCD method in subregions constructed in combinations of the regions CRB, CRC and CRD, for varying choices of the subregion boundaries. For each validation test, a new set of (ABCD) test regions was defined within the union of either CRB and CRC or CRC and CRD. The test region boundaries were varied in discrete steps after dividing the allowed range into ten equal-size intervals. The validation was performed for each of the SR ggF 2 , SR ggF 2c and SR ggF c+ selections. The expected background yields in the resulting test SRs range between 15 and 800 events. An additional validation of the method was performed by checking the closure on orthogonal selections, which were obtained either by inverting the cosmic-ray tagger cut or by selecting different ranges of QCD tagger score. These selections were chosen in order to have negligible signal contributions. In all the channels, the observed yields were found to agree within one standard deviation with the expected yields. The Pearson linear correlation coefficient between the two variables defining the ABCD plane in all subregions and in the validation regions was found to be less than 3%. The signal leakage in regions CRB, CRC and CRD, was found to be less than 10% of the total signal in the ABCD selection for all signal scenarios considered in the analysis.

WH associated production selection
In the WH selection, candidate events are required to pass single-electron or single-muon triggers. The offline requirements on the T , identification and isolation of the lepton are tighter than those applied online, so as to be on the trigger efficiency plateau [54].
The selection requires one electron or muon with T ≥ 27 GeV, miss T ≥ 30 GeV, T ≥ 40 GeV, and no additional lepton with T ≥ 10 GeV. Events where the two leading jets have an invariant mass jj ≥ 1 TeV are removed to allow for a future statistically independent study of VBF production modes. Contributions from single-top and¯processes are reduced by requiring three or fewer jets with T ≥ 30 GeV and no -tagged jets in each event. After these selections the background is dominated by W+jets events.
Events are then separated into orthogonal search channels based on the numbers of DPJs and caloDPJs. Events with exactly one caloDPJ are targeted by SR WH c , and events with two or more caloDPJs by SR WH 2c . The mixed channel with one DPJ of each type is targeted by SR WH c+ . Channels with only DPJs are not considered for the WH selection as the constraints they provide are not competitive with those from other signal production modes. The SR requirements are summarised in Table 3. In all SRs, caloDPJ candidates are required to have a JVT score below 0.6 to suppress jets originating from the interaction point. Events belonging to SR WH c , where one d is reconstructed, are required to have T ≥ 120 GeV and a caloDPJ with T ≥ 30 GeV to further reduce the initially overwhelming W+jets background. In this SR, the second d is expected to decay outside the detector volume and yield a higher miss T . CaloDPJ candidates are required to have a width ≤ 0.08 in SR WH c and ≤ 0.1 in SR WH c+ . In SR WH 2c the leading (far) caloDPJ candidate is required to have ≤ 0.1 (0.15). These selections are optimised to reject hadronic jets and maximise the discovery significance. Single-lepton trigger ( , ) yes yes yes After the selection the main background is due to W+jets events where rare QCD phenomena give rise to in-flight decays and punch-through jets that mimic signal DPJs. The residual W+jets background in the signal region is estimated with a data-driven ABCD method, the same as used for the ggF selection and explained in detail in Section 5.1. The non-collision background from cosmic-ray muons and BIB was found to be negligible for this selection.
The ABCD planes used in the estimation of the backgrounds for the three WH SRs are based on min(Δ ) and min(QCD tagger). These variables provide good discrimination between signal and background, and are highly uncorrelated in W+jets events. The definitions of the three sets of ABCD regions are given in Table 4.
All the SRs lie at high QCD tagger score and small azimuthal angular separation between the p miss T and the nearest DPJ, where most of the signal events are expected.
The background estimation procedure was validated by applying the ABCD method in subregions constructed in combinations of the regions CRB, CRC and CRD, for various choices of the subregion boundaries, as done for the ggF selection. The expected background yields in the resulting test SRs range between 80 and 1550 events. Agreement within one standard deviation between data and the predicted yields was observed in most regions, with discrepancies of up to two standard deviations appearing in the for values of min(QCD tagger) close to 0.9. An additional validation of the method was performed by checking the closure of the ABCD procedure on orthogonal selections, which were obtained by selecting different ranges of QCD tagger score. The validation ranges extend to a minimum of 0.6. In all the channels, the observed yields were found to agree within one standard deviation with the expected yields. The Pearson linear correlation coefficient between the two variables defining the ABCD plane in all subregions and in the validation regions was found to be less than 2%. Figure 7 shows the distribution of data events in the ABCD plane for the three search channels, together with the expected distribution for a benchmark FRVZ model assuming a 125 GeV Higgs boson. The acceptance times efficiency for the FRVZ signal processes after applying all selection criteria is 0.12% in the case of a d mass of 400 MeV and of 50 mm in SR WH c+ , and 0.05% and 0.26% in SR WH 2c and SR WH c , respectively, in the case of a d mass of 100 MeV and of 15 mm. The signal leakage in the CRs is accounted for in the fit and can be as high as 25% for CRB, and up to 15% for CRD and CRC. However, the stability of the fit is not affected by the potential signal leakage because the expected signal yields are negligible in comparison with the SM background in those regions.  Figures (a, b,

Systematic uncertainties
The overall uncertainty in the SR yields is dominated by the statistical uncertainty. Nevertheless, potential sources of experimental uncertainty are considered for the background estimates and the simulated signal yields.
The statistical uncertainties of the observed yields in regions CRB, CRC, and CRD are propagated to the background expectation obtained from the ABCD method. An additional uncertainty is assigned to account for the size of the cosmic dataset and scaled by the ratio of the number of empty bunch crossings to the number of filled bunch crossings while the triggers were active. This uncertainty is estimated to be as large as 80% of the expected cosmic-ray muon background, but has no impact on the final results because of the limited contribution from this specific background.
Experimental uncertainties in the reconstruction and simulation of the signal events are considered. For jets, these include jet energy resolution (JER) and jet energy scale (JES) uncertainties from the standard calibration scheme [86], which amount to up to 3% of the expected signal yields. An extra JES uncertainty is applied to take into account a possible dependence on the low-EM-fraction selection. It is estimated following the same procedure as used in the 2015-2016 dark-photon jet search [14] and is about 3%.
For muons, a systematic uncertainty in the singled reconstruction efficiency is evaluated using a tag-andprobe method applied to / → events in data and simulation, as / → decays were found to behave similarly to d decays in simulated events. The / reconstruction efficiency is evaluated in both data and simulation as a function of the opening angle Δ between the two muons. For low Δ values, the efficiency decreases due to the difficulty of reconstructing two tracks with small angular separation in the MS. The difference in / → reconstruction efficiency between simulation and data in the Δ interval between 0 and 0.06, where the DPJ samples are concentrated, amounts to 9.6% and is taken as the uncertainty. This uncertainty includes a subdominant contribution of approximately 0.1% on the matching of MS and ID tracks.
The systematic uncertainties in the efficiency of the two dedicated muon triggers are evaluated using a tag-and-probe method applied to / → events in data and simulation. The difference between the trigger efficiency in data and that in simulation is evaluated as a function of the opening angle between the two muons. The difference in the region Δ < 0.05, corresponding to the Δ expected for signal, is taken as the uncertainty and is found to be 6% for the muon narrow-scan trigger and 5.8% for the tri-muon MS-only trigger.
The systematic uncertainty in the CalRatio trigger efficiency is taken from Ref. [101] and recomputed for each signal sample. It is estimated to be up to 4% for the heavy (800 GeV) scalar-mediator signal samples, and up to 30% for the 125 GeV Higgs signal samples.
The uncertainty associated with the MC modelling of the input variables used for the cosmic-ray tagger training is evaluated by comparing the distributions of the tagger's output scores for → events in data and MC samples. Due to the high boost of low-mass dark photons and the high T of signal muons, the muon 0 and timing distributions are similar to those of prompt muons from → decays. Muons from boson decays were reconstructed using only MS information. The ratio of the two score distributions is computed and applied as a binned scale factor to the muons used for DPJ reconstruction. The resulting relative variation of the final signal yield in the SR is taken as the systematic uncertainty. The same procedure is used for the QCD and BIB taggers, where the ratios of data to simulated distributions are computed from data and MC samples of multi-jet events after selecting signal-like events with reconstructed caloDPJs but failing the analysis trigger selections. In-depth modelling studies of jet calorimeter-cell clusters, upon which the taggers' training images are built, show no significant mis-modelling.
A pile-up modelling uncertainty is assigned to account for the difference between the predicted and measured inelastic cross-sections [102].
Finally, an uncertainty in the computed total integrated luminosity used to rescale the expected number of signal events is considered. The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [103], obtained using the LUCID-2 detector [104] for the primary luminosity measurements.
A summary of the experimental systematic uncertainties taken into account for the signal samples in this analysis is shown in Figure 8, assuming signal production via a 125 GeV Higgs boson. The average uncertainties are representative for the bulk of the signal samples, with small variations of the order of a few percent being observed as a function of the d mass. The uncertainty related to the efficiency of the CalRatio trigger was found to have the largest variation, ranging between 15% and 26% with a linear dependence on the d mass.

Results and interpretations
The data-driven background estimate in each SR is obtained by performing a maximum-likelihood fit to the yields in the four (i.e. ABCD) regions in data. The fitted likelihood function is formed from a product of Poisson functions, one for each of the SR, CRB, CRC, and CRD regions, describing signal and background expectations. The ABCD ansatz is introduced as nuisance parameters in the background component of the expected yield in each region. The likelihood-based ABCD fit is robust against control regions with a small number of events and takes into account possible signal contamination in the control regions. All systematic uncertainties described in Section 6 are included in the fit as nuisance parameters, parameterised with Gaussian probability density functions that multiply the fit likelihood. They are assumed to be uncorrelated across regions. An alternative correlation model, where the uncertainties are assumed to be fully correlated across regions, has a negligible impact on the results. The mean value of the Gaussian probability distribution function is constrained by the nominal value of the parameter and the variance is defined by the 68% confidence interval of the systematic uncertainty associated with the parameter.
The observed and expected numbers of events in the signal regions are summarised in Table 5. The reported yields are extrapolated by the fit assuming no signal, and with unblinded data in all ABCD regions in the fit. When comparing the results with a likelihood fit using blinded data in the SR, the background yields are found to be consistent at the percent level except for SR ggF 2 , where an 11% change is observed with a relative uncertainty reduction by 32%. For the ggF selection results, the estimated cosmic-ray muon yields are subtracted from each of the ABCD regions before using the ABCD method to estimate the multi-jet background yield.
The results of the search are used to set upper limits on the production cross-section times branching Table 5: Observed and expected yields in the ABCD regions. The total uncertainty in the background expectation is computed by the ABCD fit to unblinded data. In the ggF selection regions the estimated contribution from cosmic-ray muons is subtracted from each of the ABCD regions before the ABCD fit, and added back post-fit (as shown in this fraction, × , of the process → → 2 d + , with assumed to be undetected, as a function of the d mean proper decay length . The upper limit on the signal strength is obtained with the CL s method [105] with the asymptotic calculator [106], considering the background and the predicted signal yields from simulation in the four ABCD regions. The validity of the asymptotic approximation is checked against a full calculation using pseudo-experiments, and the CL s values of the two methods typically agree within 1%, and maximally within 5%. Model-dependent limits are computed for the various signal scenarios considered in the analysis. The hypothesis tests take account of the expected signal yield and its uncertainties in the CRs and SRs and are performed with a likelihood fit to the observed data. A summary of the observed limits for scenarios assuming the production of a 125 GeV Higgs boson that can decay into hidden-sector particles, resulting in two d in an FRVZ model, is shown in Figures 9 and 10 for the ggF and WH selections, respectively. The sensitivity of each search channel is reported in a separate subfigure. The sensitivity of the 2 and c+ selections to d masses less than twice the muon mass is due to signal decays into a displaced electron or quark pair within the muon spectrometer volume which are reconstructed as DPJs. For pairs of dark photons decaying into DPJs, branching fractions larger than 1% are excluded at 95% confidence level (CL) if the dark photons have a mean proper decay length between 10 mm and 250 mm and a mass between 0.4 GeV and 2 GeV. When considering dark-photon decays into pairs of caloDPJs, branching fractions larger than 10% are excluded at 95% CL for mean proper decay lengths between 2 mm and 3 mm and masses between 17 MeV and 50 MeV. Figure 11 reports the observed limits for scenarios assuming the production of a 125 GeV Higgs boson that can decay into two d in the HAHM model. The sensitivity of each search channel in the ggF selection is reported in a separate subfigure. The sensitivities in the WH selection are significantly lower because the CalRatio trigger is less efficient for those events. For pairs of dark photons decaying into DPJs, branching fractions larger than 1% are excluded at 95% CL if the dark photons have a mean proper decay length between 4 mm and 200 mm and a mass between 0.4 GeV and 2 GeV. When considering dark-photon decays into pairs of caloDPJs, branching fractions larger than 10% are excluded at 95% CL for mean proper decay lengths between 1.5 mm and 8 mm and masses between 17 MeV and 100 MeV.
Alternative signal scenarios were also considered in the context of FRVZ models. The sensitivity to scenarios where a pair of dark photons is produced in the decay of an 800 GeV scalar mediator is driven by the selection efficiency being higher than for the decay of the 125 GeV Higgs boson because of the harder dark-photon energy spectrum. The signal region targeting decays into one caloDPJ and one DPJ in the ggF selection was found to be the most sensitive: branching fractions above 10% are excluded at 95% CL if the dark photons have a mean proper decay length between 6 mm and 30 mm and a mass between 400 MeV and 2 GeV. In the case of models leading to a total of four dark photons in the final state, the sensitivity is lower than for the → 2 d + process because the large boost of dark-photon pairs from the d decays results in caloDPJs overlapping with DPJs and failing the object-level selections. The signal region targeting decays into one caloDPJ and one DPJ in the ggF selection was found to be the most sensitive: branching fractions above 10% are excluded at 95% CL for dark photons with a mean proper decay length between 20 mm and 50 mm and a mass between 400 MeV and 1 GeV. In order to set limits on the → → 2 d + process as a function of the kinetic mixing parameter and d mass, the best-performing search channels, based on their expected sensitivity, are taken into account. In the case of FRVZ models, the mutually exclusive search channels targeting dark-photon decays into caloDPJs are statistically combined to maximise the exclusion power. The combined fit considers a product of the likelihood functions of the individual search channels with independent parameters, but with a common Upper limits at 90% CL 3 are also set, in the context of the FRVZ model and the HAHM vector-portal model, in terms of kinetic mixing parameter and d mass and presented in Figure 13 for different Higgs decay branching fractions into d , ranging from 0.1% to 10%. The limits are interpolated between different masses by branching fraction variations [9] as a function of the d mass, corrected by a linear interpolation of the signal efficiency between adjacent available MC signal samples. For d mass below twice the muon mass, no coverage from DPJ signal regions is expected, motivating a significant drop in sensitivity. The combination of the caloDPJ signal regions provides the most stringent limit in this mass range. In the mass range above twice the muon mass, only the best-performing channel for each d mass probed is considered. The structures observed in this mass range depend on the expected d branching ratio, where decays to QCD resonances can significantly alter the sensitivities of the different search channels.
Exclusion regions from previous ATLAS searches for dark-photon jets are shown for the displaced-decay search [14] and for the complementary prompt-decay search [17]. The contributions from the caloDPJ channels allow ATLAS to set limits for < 0.1 GeV for the first time.
This search has significantly higher sensitivity than the previous ATLAS search for light long-lived neutral particles [14], reaching a 0.1% branching ratio for a 125 GeV Higgs boson decaying into dark photons. The solid black curve shows the observed exclusion limit from their statistical combination, and is almost identical to the expected limit. The green and yellow bands represent ±1 and ±2 uncertainty in the expected limit. The hatched band denotes the region in which the branching ratio of → 2 d + is larger than unity.   Figure 13: The 90% CL exclusion contours of the branching ratio ( ) for the decay → 2 d + as a function of the d mass and the kinetic mixing parameter for (a) the FRVZ model and (b) the HAHM model. These limits are obtained assuming branching fractions between 0.1% and 10% for Higgs boson decays resulting in dark photons. For d mass below twice the muon mass, the combination of the caloDPJ signal regions is used. The figure also shows regions excluded by the previous ATLAS searches for jets from displaced [14] (orange line) and prompt [17] (red line) decays of dark photons.

Conclusion
This paper presents a search for light long-lived neutral particles which decay into collimated pairs of fermions in the ATLAS detector at the LHC. Data collected from 139 fb −1 of √ = 13 TeV collisions during 2015-2018 were analysed for evidence of long-lived dark photons from Higgs boson decays and were found to be consistent with the background prediction. Upper limits on the Higgs boson branching fraction to dark photons as a function of their mass and mean proper decay length are reported, assuming SM Higgs boson production cross-sections. Branching fractions above 1% are excluded at 95% CL for Higgs boson decays to two dark photons with mean proper decay length between 10 mm and 250 mm and mass between 0.4 and 2 GeV. This analysis has significantly higher sensitivity than previous ATLAS searches for light long-lived neutral particles that decay to collimated pairs of fermions. The higher sensitivity is due to the addition of an event selection targeting long-lived dark photons from decays of Higgs bosons produced in association with a boson and to the use of a larger dataset and new methodologies for reconstructing dark-photon candidates and rejecting non-collision backgrounds.        [95] ATLAS Collaboration, Muon reconstruction performance of the ATLAS detector in proton-proton collision data at