Search for exotic decays of the Higgs boson into long-lived particles in $pp$ collisions at $\sqrt{s} = 13$ TeV using displaced vertices in the ATLAS inner detector

A novel search for exotic decays of the Higgs boson into pairs of long-lived neutral particles, each decaying into a bottom quark pair, is performed using 139 fb$^{-1}$ of $\sqrt{s} = 13$ TeV proton-proton collision data collected with the ATLAS detector at the LHC. Events consistent with the production of a Higgs boson in association with a leptonically decaying $Z$ boson are analysed. Long-lived particle (LLP) decays are reconstructed from inner-detector tracks as displaced vertices with high mass and track multiplicity relative to Standard Model processes. The analysis selection requires the presence of at least two displaced vertices, effectively suppressing Standard Model backgrounds. The residual background contribution is estimated using a data-driven technique. No excess over Standard Model predictions is observed, and upper limits are set on the branching ratio of the Higgs boson to LLPs. Branching ratios above 10% are excluded at 95% confidence level for LLP mean proper lifetimes $c\tau$ as small as 4 mm and as large as 100 mm. For LLP masses below 40 GeV, these results represent the most stringent constraint in this lifetime regime.


Introduction
The discovery of a new particle consistent with the Higgs boson at the Large Hadron Collider (LHC) in 2012 revealed the final missing piece of the Standard Model (SM) of particle physics [1, 2]. Since then, a physics programme has emerged which aims to scrutinize the properties of the Higgs boson and the nature of the Higgs sector. Assuming that new physics does not strengthen the couplings of the Higgs boson to $W$/$Z$ bosons, current experimental constraints allow for branching ratios of the Higgs boson to visible beyond-the-SM (BSM) states of up to 21% [3,4]. Further, many theories that aim to solve the shortcomings of the Standard Model point to the Higgs boson as a possible portal to new physics [5], with exotic Higgs boson decay modes being the primary phenomenological consequence and means of discovery [6][7][8].
Decays of the Higgs boson into particles with macroscopic mean proper lifetimes ($c\tau \gtrsim 100~\mu$m), known as long-lived particles (LLPs), are predicted by top-down theories and are further motivated by bottom-up considerations. Examples of top-down theories include those based on the concept of Neutral Naturalness, such as Folded SUSY [9,10] and the Quirky Little Higgs [11]. These models aim to solve the hierarchy problem by introducing quark partners that are not charged under the SM SU(3)$_C$ group, but instead under a new SU(3) gauge group [6]. While conventional solutions to the hierarchy problem with coloured quark partners, such as supersymmetry [12][13][14][15][16][17], are becoming increasingly constrained experimentally, the production cross section for such uncoloured quark partners is greatly reduced, easily evading experimental constraints while remaining discoverable at the LHC. If the quark partners are sufficiently heavy, glueballs of the new SU(3) group (SU(3)$_C$-neutral bound states composed of the new SU(3) mediators) would lie at the bottom of the BSM particle mass spectrum. Glueball masses in the range 10-60 GeV are preferred theoretically, and such glueballs could be produced in decays of the Higgs boson. Lying at the bottom of the BSM mass spectrum, the only decay mode available to these glueballs would be into SM fermions through an off-shell Higgs boson (with Yukawa-ordered branching ratios), causing decays into $b\bar{b}$ to dominate, with lifetimes anywhere in the micrometre to kilometre range [6].
Bottom-up scenarios, such as hidden valleys [18,19] and minimal scalar extensions of the SM, are also well motivated because the SM Higgs field admits the renormalizable portal interaction $|H|^{2}\,S^{\dagger}S$ with a new scalar field $S$. These simplified scenarios often arise naturally in the infrared limit of more complete theories such as Folded SUSY [9], and provide generic examples of scenarios in which the Higgs boson decays into LLPs.
This paper considers a simplified model in which a new electrically neutral pseudoscalar boson $a$ is produced in pairs through decays of the Higgs boson, $H \to aa$, with each $a$ boson subsequently decaying exclusively into $b\bar{b}$, as shown in Figure 1. The mean proper lifetime $c\tau$ and the mass $m_a$ of the $a$ boson are treated as free parameters. Aside from the $H \to aa$ interaction, the Higgs boson is assumed to be that of the SM.
Several searches for $H \to aa \to b\bar{b}b\bar{b}$ decays have been performed at the LHC, optimized for different regimes of $c\tau$. Searches based on standard $b$-tagging techniques were performed by ATLAS, with sensitivity for $c\tau \lesssim 1$ mm [20,21]. While these searches focused on the topology of associated $VH$ ($V = W, Z$) production of the Higgs boson, where leptonic $V$ decays provide a trigger object, other searches have targeted gluon-gluon fusion production of the Higgs boson, which has a much larger cross section. For longer $a$-boson lifetimes, standard $b$-tagging algorithms become inefficient compared to specialized reconstruction algorithms. The LHCb experiment performed a search for displaced vertices (DVs) resulting from LLP decays, obtaining sensitivity in the range $1~\mathrm{mm} \lesssim c\tau \lesssim 0.1~\mathrm{m}$ [22]. The CMS experiment also performed a DV-based search, yielding sensitivity in the range $1~\mathrm{mm} \lesssim c\tau \lesssim 1~\mathrm{m}$ for $m_a \geq 40$ GeV [23]. To identify LLP decays in the hadronic calorimeter, the ATLAS experiment selects jets that deposit anomalously more energy in the hadronic calorimeter than in the electromagnetic calorimeter (both at trigger level [24] and offline), providing sensitivity for $0.1~\mathrm{m} \lesssim c\tau \lesssim 10~\mathrm{m}$ [25]. Finally, ATLAS extends this sensitivity to $c\tau \approx 100$ m with searches for LLPs decaying within the muon spectrometer, by requiring a cluster of muon regions-of-interest from the first-level trigger [24] and a DV reconstructed from muon spectrometer tracklets [26,27].
The search presented here targets LLP decays within the ATLAS inner-detector volume, resulting in the presence of at least two DVs that may be reconstructed from inner-detector tracks. The specialized algorithms for reconstruction of LLP decays are computationally expensive and therefore cannot be used at trigger level or in the standard ATLAS reconstruction chain [28]. As a result, the $ZH$ associated production topology is probed in this analysis, where $Z \to \ell\ell$ ($\ell = e, \mu$) decays provide highly efficient trigger objects. Based on the results of the standard event reconstruction, a small subset of the data is selected (as described in Section 5) for reprocessing with dedicated tracking and vertexing algorithms that can reconstruct displaced LLP decays [29,30]. Finally, events are required to contain at least two DVs with high mass and track multiplicity relative to SM processes, effectively suppressing backgrounds. The residual background is estimated using a data-driven method and validated using an orthogonal signal-free data sample. This is the first search for Higgs boson decays into LLPs to make use of this analysis methodology. Although it is assumed that the $a$ boson is neutral and decays exclusively into $b\bar{b}$, these signal features are not explicitly exploited, so sensitivity is maintained for new-physics scenarios with charged intermediate states and/or other hadronic decay modes. With this strategy, sensitivity to LLPs with $2~\mathrm{mm} \lesssim c\tau \lesssim 0.2~\mathrm{m}$ is obtained.

Data and simulated events
This analysis uses 139 fb$^{-1}$ of $\sqrt{s} = 13$ TeV proton-proton collision data, collected by the ATLAS experiment from 2015 to 2018. Only data collected during stable beam conditions in which all detector subsystems were operational are considered [36]. Events were selected using the lowest-threshold unprescaled single-lepton ($e$, $\mu$) triggers available during each data-taking year [37,38]. Additional events selected with single-photon and diphoton triggers are used for validation of the background modelling strategy.
Simulated signal and background events were used to develop the analysis, optimize the event selection criteria, measure the signal selection efficiency, and assess sources of systematic uncertainty. The effect of multiple proton-proton interactions in the same or neighbouring bunch crossings (pile-up) was modelled by overlaying the hard-scatter process with simulated inelastic proton-proton scattering events generated with Pythia 8.186 [39].

Signal simulation
Samples of simulated events with a Higgs boson produced in association with a leptonically decaying $Z$ boson were generated using Powheg Box v2 [44][45][46][47]. The Powheg Box prediction is accurate to next-to-leading order (NLO) for $ZH$ + 1-jet production. Amplitudes for QCD virtual/loop corrections were constructed through the interface to the GoSam package [48]. The loop-induced $gg \to ZH$ process was generated separately at leading order with Powheg Box. In all cases, the PDF4LHC15nlo PDF set [49] was used. The simulated samples are normalized to SM cross sections calculated at NNLO in QCD with NLO electroweak corrections for $q\bar{q} \to ZH$ and at NLO and next-to-leading-logarithm accuracy in QCD for $gg \to ZH$ [50][51][52][53][54][55][56][57], neglecting effects from potential $Haa$ interactions.
The decay of the Higgs boson into two spin-0 $a$ bosons and the subsequent decay of each $a$ boson into a pair of $b$-quarks were simulated with Pythia 8.212 [39]. The coupling of the $a$ boson to $b$-quarks is assumed to be that of a pseudoscalar; however, the analysis is insensitive to the parity of the $a$ boson. Pythia 8.212 was also used for parton showering and hadronization, as well as underlying-event simulation (using the AZNLO set of tuned parameters [58] with the CTEQ6L1 PDF set). Masses of the $a$ boson in the range 16-55 GeV were considered, and statistically independent event samples were produced with $c\tau$ of 10 mm, 100 mm, and 1 m for each mass. These samples were reweighted to obtain samples corresponding to alternative lifetimes, where the weights were derived from the ratio of the lifetime probability density functions of the target and generated lifetimes.
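The lifetime reweighting admits a compact illustration. The sketch below is a minimal example rather than the ATLAS implementation: the per-event weight is the product of per-LLP weights, each given by the ratio of exponential proper-lifetime densities evaluated at the generated proper decay length; the function names and inputs are illustrative assumptions.

```python
import numpy as np

def lifetime_weight(ct_llp, ctau_gen, ctau_target):
    """Per-LLP weight to reinterpret a sample generated with mean proper decay
    length ctau_gen as one with ctau_target.

    ct_llp      : generated proper decay length c*t of the LLP [mm]
    ctau_gen    : mean proper decay length of the generated sample [mm]
    ctau_target : mean proper decay length of the target lifetime [mm]
    """
    # Ratio of exponential probability densities p(ct | ctau) = exp(-ct/ctau) / ctau
    num = np.exp(-ct_llp / ctau_target) / ctau_target
    den = np.exp(-ct_llp / ctau_gen) / ctau_gen
    return num / den

def event_weight(ct_llps, ctau_gen, ctau_target):
    """Event weight for an event with several LLPs (e.g. the two a bosons):
    the per-LLP weights multiply since the decays are independent."""
    return np.prod([lifetime_weight(ct, ctau_gen, ctau_target) for ct in ct_llps])

# Example: reweight an event from the ctau = 100 mm sample to ctau = 30 mm,
# where the two a bosons decayed at proper decay lengths of 40 mm and 150 mm.
w = event_weight([40.0, 150.0], ctau_gen=100.0, ctau_target=30.0)
print(f"event weight = {w:.3f}")
```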
The impact of using a different parton-shower model on the $p_T$ of the $ZH$ system was evaluated by comparing the nominal $ZH$ signal samples with an additional sample produced with the Powheg Box v2 generator using the NNPDF3.0nnlo set of PDFs [59] and interfaced with Herwig 7.2.1 [60,61] using the H7.1-Default set of tuned parameters [62] and the MMHT2014nnlo PDF set [63].

Background simulation
While the background modelling strategy used in this analysis is fully data-driven, simulated $Z$ + jets events are used to optimize the analysis selections and derive systematic uncertainties. The production of $Z$ + jets events was simulated with the Sherpa 2.2 generator.

Event reconstruction
Tracks are reconstructed from the collection of hits in the ID using a combinatorial Kalman filter [74]. The standard ATLAS track reconstruction algorithm is optimized for tracks that originate in the vicinity of the interaction point (IP), and is not efficient for reconstructing displaced tracks corresponding to charged particles produced in LLP decays (although it is generally insensitive to the delayed arrival time of the decay products). To reconstruct these tracks, a secondary large-radius tracking (LRT) algorithm is run on a subset of the data, taking as input the hits left over from the standard tracking procedure [29]. The LRT algorithm has looser requirements than the standard track reconstruction, particularly on the transverse and longitudinal impact parameters with respect to the primary vertex ($d_0$ and $z_0$). This algorithm is computationally expensive, but improves the overall reconstruction efficiency for hadrons produced at a radius $r$ = 50 (300) mm from about 20% (5%) to 90% (30%). Hadrons produced with $r$ > 300 mm traverse an insufficient number of tracker layers to be reconstructed.
Events are required to contain a primary vertex (PV) associated with the hard scatter. The PV must have at least two tracks, each with $p_T$ > 500 MeV. If more than one PV candidate satisfies these criteria, the candidate with the largest $\sum p_T^2$ is selected, where the sum extends over the tracks associated with the vertex. Electron candidates are reconstructed by matching clusters of energy in the calorimeters to reconstructed tracks [75]. Candidates are required to be within the fiducial region $|\eta| < 2.47$ and outside the calorimeter transition region ($1.37 < |\eta| < 1.52$). To reduce the background from non-prompt electrons and photon conversions, electrons must pass the Medium likelihood-based identification working point [75].
Photon candidates are similarly reconstructed from clusters of energy in the calorimeters. Candidates are required to have $|\eta| < 2.47$ and to satisfy the Loose identification criteria [75].
Muon candidates are reconstructed from track segments in the muon spectrometer and matched to ID tracks [76,77]. A combined track is then formed using the information from both detector subsystems.
Hadronic jets are reconstructed from topological clusters [78] of energy deposits in the calorimeter, using the anti-$k_t$ algorithm [79] with a radius parameter $R = 0.4$. Jets are calibrated using simulated events according to the procedure described in Refs. [80][81][82][83][84], followed by a residual calibration that accounts for the differences between data and simulation, and for the different detector response in the central ($|\eta| < 0.8$) and forward ($|\eta| > 0.8$) regions, through the in situ calibration techniques described in Refs. [85,86]. The jets considered in this analysis are required to have $p_T$ > 20 GeV and $|\eta| < 2.5$.
The DL1 $b$-tagging algorithm quantifies the likelihood that a jet originated from a $b$-flavour quark rather than from a gluon or a lighter-flavour ($u$, $d$, $s$, $c$) quark [87,88]. This algorithm exploits several features of $b$-jets, such as the lifetime, mass, and decay kinematics of $b$-hadrons, within a deep feed-forward neural network to distinguish between these hypotheses. Large values of the DL1 discriminant correspond to $b$-like jets.
Two additional track-based observables are used to discriminate between jets initiated by displaced partons resulting from LLP decays and jets initiated by prompt partons. The first observable [89] exploits the fact that jets with modest displacement from the IP will contain tracks that are incompatible with all PV candidates. Each jet is assigned a vertex fraction $f_{\mathrm{PV}_i} = \sum_{\mathrm{trk} \in \mathrm{PV}_i} p_{T}^{\mathrm{trk}} \big/ \sum_{\mathrm{trk}} p_{T}^{\mathrm{trk}}$ to test its compatibility with PV candidate $i$, where $\sum_{\mathrm{trk}} p_{T}^{\mathrm{trk}}$ is the scalar sum of the transverse momenta of all tracks geometrically matched to the jet (using a shrinking-cone algorithm) and $\sum_{\mathrm{trk} \in \mathrm{PV}_i} p_{T}^{\mathrm{trk}}$ is defined similarly, but considering only the subset of tracks matched to primary vertex candidate $i$. A track is considered matched to a PV candidate if $|d_0| < 0.5$ mm and $|z_0 \sin\theta| < 0.3$ mm. A standard QCD jet should have a large value of $f_{\mathrm{PV}_i}$ for the candidate PV at which it was produced, while jets originating from the decay of an LLP should have small values of $f_{\mathrm{PV}_i}$ for all PV candidates. Therefore, the maximum value over all PVs, $f_{\mathrm{PV}}^{\max} \equiv \max_i f_{\mathrm{PV}_i}$, is used to discriminate between signal and QCD jets, as well as to reject jets originating from pile-up interactions.
The second observable exploits the fact that jets which are more significantly displaced deposit energy in the calorimeters but contain few reconstructed tracks. The charged hadron fraction is therefore defined as $f_{\mathrm{ch}} = \sum_{\mathrm{trk}} p_{T}^{\mathrm{trk}} / p_{T}^{\mathrm{jet}}$, the ratio of the scalar sum of the transverse momenta of the tracks matched to the jet to the jet $p_T$, with displaced jets expected to have small values of $f_{\mathrm{ch}}$.
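As an illustration of how the two observables could be computed from the tracks matched to a jet, the following sketch applies the track-to-PV matching criteria quoted above. It is a simplified example under stated assumptions, and the exact ATLAS definitions (for instance, which tracks enter the charged hadron fraction) may differ in detail.

```python
import numpy as np

def jet_displacement_observables(jet_pt, trk_pt, trk_d0, trk_z0_sin_theta):
    """Illustrative calculation of f_PV^max and the charged hadron fraction f_ch for one jet.

    jet_pt           : jet transverse momentum [GeV]
    trk_pt           : pT of the tracks geometrically matched to the jet [GeV], shape (n_trk,)
    trk_d0           : d0 of each track w.r.t. each PV candidate [mm], shape (n_trk, n_pv)
    trk_z0_sin_theta : z0*sin(theta) of each track w.r.t. each PV candidate [mm], same shape
    """
    trk_pt = np.asarray(trk_pt, dtype=float)
    trk_d0 = np.asarray(trk_d0, dtype=float)
    trk_z0_sin_theta = np.asarray(trk_z0_sin_theta, dtype=float)
    sum_pt_jet = trk_pt.sum()

    # A track is matched to a PV candidate if |d0| < 0.5 mm and |z0 sin(theta)| < 0.3 mm
    matched = (np.abs(trk_d0) < 0.5) & (np.abs(trk_z0_sin_theta) < 0.3)

    # f_PV_i: fraction of the jet's track pT compatible with PV candidate i;
    # the discriminant is the maximum over all PV candidates
    if sum_pt_jet > 0:
        f_pv = (matched * trk_pt[:, None]).sum(axis=0) / sum_pt_jet
        f_pv_max = float(f_pv.max()) if f_pv.size else 0.0
    else:
        f_pv_max = 0.0

    # f_ch: scalar sum of the pT of tracks matched to the jet, divided by the jet pT
    # (significantly displaced jets have calorimeter energy but few reconstructed tracks)
    f_ch = sum_pt_jet / jet_pt if jet_pt > 0 else 0.0

    return f_pv_max, f_ch

# Example: a 60 GeV jet with three tracks tested against two PV candidates
f_pv_max, f_ch = jet_displacement_observables(
    jet_pt=60.0,
    trk_pt=[10.0, 5.0, 2.0],
    trk_d0=[[0.1, 3.0], [0.2, 2.5], [4.0, 0.05]],
    trk_z0_sin_theta=[[0.1, 0.4], [0.05, 0.5], [0.6, 0.1]],
)
```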

Displaced vertices
Displaced vertices are reconstructed from the combined collection of standard and LRT tracks [30]. The algorithm proceeds in several steps. First, vertex reconstruction is seeded by pairs of tracks preselected with a series of quality criteria, including hit-multiplicity requirements and $p_T$ > 1 GeV. Furthermore, at least one track in each pair must have $|d_0| > 2$ mm. The compatibility of each possible pair of preselected tracks is then assessed using the impact parameters of the tracks with respect to the estimated secondary vertex position, and pairs deemed loosely compatible are retained. The tracks are additionally required to pass a hit-pattern requirement which ensures that the hits associated with each track are compatible with a particle originating from the position of the seed vertex. These two-track seed vertices are then combined to form multi-track vertices. Nearby vertices are merged, and tracks, including those that failed the preselection, are ultimately attached to compatible vertices provided that the $\chi^2/n_{\mathrm{dof}}$ of the updated vertex fit is less than 20. The efficiency of this vertex reconstruction is similar for light-flavour and heavy-flavour decays of the LLP, once its strong dependence on the multiplicity of charged LLP decay products is taken into account [30].
Reconstructed DVs are then pruned of tracks which are only loosely associated with the vertex; only tracks with sufficiently small transverse and longitudinal impact parameters with respect to the DV position are retained. After track pruning, the vertex properties are recalculated.
A vertex preselection is applied to reject low-quality fake DVs and those resulting from interactions between high-momentum hadrons and detector material. Such interactions can produce a collimated spray of secondary hadrons, mimicking an LLP decay. Displaced vertices are required to fall within the geometric acceptance of the tracker, defined as $r < 300$ mm and $|z| < 300$ mm. DVs are further required to have $\chi^2/n_{\mathrm{dof}} < 5$. Finally, a material veto is applied to reject DVs whose position is compatible with the location of known detector elements [28].
Preselected DVs are dominated by the decay products of inelastic interactions between promptly produced particles and the detector material, and by the decays of SM particles with significant lifetime. As shown in Figure 2, most of these DVs are rejected by requiring that the number of tracks associated with the vertex, $n_{\mathrm{trk}}$, is greater than two. The remaining DVs result primarily from the random crossings of unrelated tracks in the vicinity of a metastable-hadron decay. Particularly in the case of large crossing angles, these random crossings lead to artificially large values of the reconstructed vertex invariant mass. To identify such vertices, tracks associated with DVs are removed one-by-one and the angular separation $\Delta R$ between the momentum of the removed track and the combined momentum of the remaining vertex tracks is calculated, with the observable $\Delta R_{\max}$ defined as the maximum of all such values. The reduced mass of the vertex is then calculated as the ratio of the reconstructed vertex invariant mass, $m_{\mathrm{DV}}$, to $\Delta R_{\max}$. As shown in Figure 2, the reduced mass discriminates strongly between signal and background DVs for all values of $m_a$ probed. To reject vertices with random track crossings, DVs are required to have $m_{\mathrm{DV}}/\Delta R_{\max} > 3$ GeV.
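The reduced-mass observable can be illustrated with a short sketch that computes $m_{\mathrm{DV}}/\Delta R_{\max}$ from the vertex tracks. The charged-pion mass hypothesis for the tracks and the use of $\Delta R$ as the angular separation are assumptions of the sketch, not statements about the ATLAS implementation.

```python
import numpy as np

M_PION = 0.13957  # GeV; tracks assigned the charged-pion mass hypothesis (assumption)

def four_vector(pt, eta, phi, m=M_PION):
    px, py = pt * np.cos(phi), pt * np.sin(phi)
    pz = pt * np.sinh(eta)
    e = np.sqrt(px**2 + py**2 + pz**2 + m**2)
    return np.array([e, px, py, pz])

def eta_phi(p4):
    _, px, py, pz = p4
    pt = np.hypot(px, py)
    return np.arcsinh(pz / pt), np.arctan2(py, px)

def delta_r(p4_a, p4_b):
    eta_a, phi_a = eta_phi(p4_a)
    eta_b, phi_b = eta_phi(p4_b)
    dphi = (phi_a - phi_b + np.pi) % (2 * np.pi) - np.pi
    return np.hypot(eta_a - eta_b, dphi)

def reduced_mass(tracks):
    """tracks: list of (pt [GeV], eta, phi) for the tracks associated with the DV."""
    p4s = [four_vector(*trk) for trk in tracks]
    total = np.sum(p4s, axis=0)

    # Invariant mass of the vertex from the summed track four-momenta
    m_dv = np.sqrt(max(total[0]**2 - np.sum(total[1:]**2), 0.0))

    # Remove each track in turn; take the largest angular separation between the
    # removed track and the combined momentum of the remaining tracks
    dr_max = max(delta_r(p4, total - p4) for p4 in p4s)

    return m_dv / dr_max

# Example: a three-track vertex; the selection requires m_DV / dR_max > 3 GeV
m_red = reduced_mass([(4.0, 0.3, 0.1), (2.5, 0.4, 0.2), (1.5, -0.2, 1.8)])
```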
To reduce the background contribution from heavy-flavour hadron decays, additional requirements are placed on the significance of the vertex radial position measurement, defined as the ratio of the vertex radial position to its uncertainty ($r/\sigma_r$), and at least one track in the vertex must have $|d_0| > 3$ mm with respect to the PV. Finally, to facilitate the modelling of the background, a one-to-one matching between jets and DVs is performed, requiring DVs to be within $\Delta R = 0.6$ of one of the four leading jets. The full set of vertex selection criteria is shown in Table 1.

Event selection
Events consistent with the $ZH$, $Z \to \ell\ell$ topology were first selected using the lowest-threshold unprescaled single-lepton ($e$, $\mu$) triggers operating during each data-taking year [37,38]. Events are then identified for reprocessing with the LRT and displaced-vertex reconstruction algorithms using an additional selection, referred to as a filter, which makes use of the results of the standard ATLAS reconstruction algorithms to identify jets consistent with originating from an LLP decay. The selection requires the presence of at least one jet with $p_T$ > 20 GeV, where at this stage the full jet calibration is not yet applied and the jet energy is calibrated at the electromagnetic scale [83]. At least one of the two leading central ($|\eta| < 2.1$) jets must satisfy $f_{\mathrm{ch}} < 0.045$ or $f_{\mathrm{PV}}^{\max} < 0.05$. This selection was optimized to provide high signal efficiency for a wide range of LLP proper lifetimes while maximizing background rejection. Finally, the filter requires the presence of a lepton from the $Z$ boson decay. The number of events passing this filter is sufficiently small to permit reconstruction with the computationally expensive algorithms for displaced tracks and vertices. A second filter is also used, which selects events passing a high-$p_T$ photon trigger, to facilitate the validation of the background estimation method.
After reconstruction with the LRT algorithm, the signal and control regions are defined by requiring events to contain two opposite-sign, same-flavour leptons. The leading (sub-leading) lepton is required to have $p_T$ > 27 (10) GeV. To select events with a reconstructed $Z$ boson, the dilepton invariant mass $m_{\ell\ell}$ is required to be in the range $66 < m_{\ell\ell} < 116$ GeV. Additionally, events are required to contain at least two jets. The trigger, filter, lepton, $m_{\ell\ell}$, and jet requirements comprise the event preselection. The signal region (SR) is then defined by requiring that events pass the preselection and contain at least two DVs passing the selections described in Section 4.1. To ensure robust modelling of the background, the DVs are required to be associated with different jets. Depending on the value of $m_a$, these criteria select signal events with a leptonically decaying $Z$ boson with an efficiency ranging from 0.4% to 0.6% (0.04% to 0.2%) for $c\tau$ = 10 (100) mm. A control region (CR) is defined in the same way as the SR, except that the DV multiplicity requirement is inverted, and is used to derive the background estimate. Despite the relatively low efficiency for signal events to satisfy the SR selection, signal contamination in the CR is negligible.
The background estimate is validated using an orthogonal $\gamma$+jets event selection. This validation region (VR) requires events to pass a high-$p_T$ photon trigger, to contain at least one photon with $p_T$ > 160 GeV or two photons with $p_T$ > 60 GeV, and to contain at least two jets. Additionally, events containing charged leptons are vetoed in the VR to remove potential biases from signal.

Background estimation
The probability for a jet to be matched to a DV within $\Delta R = 0.6$ is measured in the CR by taking the ratio of the number of jets matched to a DV to the total number of jets. This per-jet DV probability is observed to have a non-trivial dependence on the properties of the jet. The track multiplicity and density within a jet increase with the jet $p_T$ [90], causing this probability to increase correspondingly. Due to the lifetime of hadrons containing heavy-flavour quarks, the probability also depends strongly on the flavour of the parton that initiated the jet. Therefore, the per-jet DV probability $P_{\mathrm{DV}}(j)$ is parameterized as a function of the jet $p_T$ and the DL1 $b$-tagging discriminant, as shown in Figure 3.
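A minimal sketch of how such a probability map could be built from control-region jets is given below; the binning, inputs, and binomial uncertainties are illustrative assumptions rather than the ATLAS implementation.

```python
import numpy as np

def per_jet_dv_probability(jet_pt, jet_dl1, jet_has_dv, pt_bins, dl1_bins):
    """Per-jet DV probability map P_DV(pT, DL1): the fraction of CR jets matched
    to a DV, in bins of jet pT and DL1 discriminant, with binomial uncertainties."""
    jet_pt = np.asarray(jet_pt, dtype=float)
    jet_dl1 = np.asarray(jet_dl1, dtype=float)
    jet_has_dv = np.asarray(jet_has_dv, dtype=bool)

    # Numerator: jets matched to a DV; denominator: all jets in the CR
    matched, _, _ = np.histogram2d(jet_pt[jet_has_dv], jet_dl1[jet_has_dv],
                                   bins=(pt_bins, dl1_bins))
    total, _, _ = np.histogram2d(jet_pt, jet_dl1, bins=(pt_bins, dl1_bins))

    prob = np.divide(matched, total, out=np.zeros_like(matched), where=total > 0)
    # Simple binomial statistical uncertainty per bin
    err = np.divide(np.sqrt(prob * (1.0 - prob)), np.sqrt(total),
                    out=np.zeros_like(matched), where=total > 0)
    return prob, err
```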
Because the computation of the per-jet DV probability is restricted to a region containing fewer than two DVs, a bias of order $P_{\mathrm{DV}}^2$ is introduced. However, since $P_{\mathrm{DV}} \ll 1$, these higher-order corrections are negligible.
The probability that an event contains a given number of DVs among the four leading jets may then be computed from a multinomial distribution. For example, the probabilities that an event contains exactly zero or exactly one DV are given by
$$P_0 = \prod_{k} \bigl(1 - P_{\mathrm{DV}}(j_k)\bigr), \qquad P_1 = \sum_{i} P_{\mathrm{DV}}(j_i) \prod_{k \neq i} \bigl(1 - P_{\mathrm{DV}}(j_k)\bigr),$$
where the sum and products run over the four leading jets. If the event contains fewer than four jets, the probability is computed from all jets in the event. The probability for an event to fall in the SR is then $P_{\geq 2} = 1 - P_0 - P_1$. The background estimate is obtained by weighting preselected data events by $P_{\geq 2}$.
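The event weight follows directly from the per-jet probabilities, as in the following sketch (the per-jet probability values in the example are placeholders, not measured values):

```python
import numpy as np

def sr_probability(p_dv_jets):
    """P(>= 2 DVs) for one event, given per-jet DV probabilities for its (up to four) leading jets."""
    p = np.asarray(p_dv_jets, dtype=float)
    p0 = np.prod(1.0 - p)                           # no jet matched to a DV
    p1 = sum(p[i] * np.prod(np.delete(1.0 - p, i))  # exactly one jet matched
             for i in range(len(p)))
    return 1.0 - p0 - p1

# Example: an event whose four leading jets have per-jet DV probabilities read off the
# (pT, DL1) map; the values below are placeholders.
weight = sr_probability([0.012, 0.008, 0.003, 0.001])
print(f"P(>=2 DVs) = {weight:.2e}")
```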
The presence of signal contamination in the preselected data would increase the per-jet DV probability, as well as the population of events to which this probability is applied. Both of these effects would artificially increase the background estimate, potentially concealing the presence of signal in the SR. This effect was studied with a conservative assumption about the branching ratio $\mathcal{B}(H \to aa)$ and found to change the background prediction by less than 10%, negligible in comparison to the size of the injected signal.
To compute the uncertainty in the background estimate, pseudo-experiments are performed. Each pseudo-experiment is assigned a unique set of per-jet probabilities by varying each ($p_T$, DL1) bin within its statistical uncertainty, and the varied probabilities are used to predict the background yield. The mean of the pseudo-experiments determines the central value of the expected yield, with its statistical uncertainty given by the RMS of the pseudo-experiments. Using this method, a background estimate of 1.30 ± 0.08 events is obtained, where the uncertainty is statistical only.
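A simplified, self-contained version of this pseudo-experiment procedure is sketched below; the inputs, the Gaussian variation of each bin, and the example numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def p_at_least_two(p):
    """P(>= 2 DVs) for one event, given per-jet DV probabilities p (up to four jets)."""
    p = np.asarray(p, dtype=float)
    p0 = np.prod(1.0 - p)
    p1 = sum(p[i] * np.prod(np.delete(1.0 - p, i)) for i in range(len(p)))
    return 1.0 - p0 - p1

def background_estimate(p_dv, p_dv_err, jet_bins_per_event, n_toys=1000):
    """Toy-based SR background yield and its statistical uncertainty.

    p_dv, p_dv_err     : per-jet DV probabilities and uncertainties, one per (pT, DL1) bin
    jet_bins_per_event : for each preselected event, the (pT, DL1) bin indices of its jets
    """
    p_dv = np.asarray(p_dv, dtype=float)
    p_dv_err = np.asarray(p_dv_err, dtype=float)

    yields = []
    for _ in range(n_toys):
        # Vary each bin of the probability map within its statistical uncertainty
        p_toy = np.clip(rng.normal(p_dv, p_dv_err), 0.0, 1.0)
        # Sum P(>= 2 DVs) over all preselected events with the varied probabilities
        yields.append(sum(p_at_least_two(p_toy[bins]) for bins in jet_bins_per_event))

    yields = np.asarray(yields)
    return yields.mean(), yields.std()

# Example with a 3-bin probability map and two preselected events (placeholder numbers)
mean, rms = background_estimate(
    p_dv=[0.010, 0.005, 0.002], p_dv_err=[0.002, 0.001, 0.001],
    jet_bins_per_event=[[0, 1, 2, 2], [1, 2]],
)
```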
This method for computing the background prediction relies on the assumption that the per-jet DV probabilities are unaffected by the presence of multiple DVs in an event. To validate this assumption, the VR defined in Section 5 is used. The parameterization of the per-jet DV probabilities in terms of $p_T$ and DL1 allows for a robust validation of the method, despite the kinematic properties of the jets in this region differing from those in the SR. By comparing the predicted and observed numbers of events with at least two DVs in the VR, the core assumption of the background estimate may be tested, and any systematic differences may be taken into account as an additional uncertainty in the background estimate.
Following the same procedure as used in the CR, the per-jet DV probability is calculated using events in the VR with fewer than two DVs, and is used to predict the number of events with at least two DVs. Using this method, a mean of 19.9 ± 0.4 events is predicted and 23 events are observed, as shown in Figure 4, demonstrating that any jet-jet correlations are either negligible or captured by the parameterization in terms of jet $p_T$ and DL1.
The systematic uncertainty of the method used to obtain the background prediction is derived by comparing the numbers of predicted and observed events in the VR. Although the prediction and observation agree within uncertainties, the number of events available to test the background estimation method is limited. Consequently, a conservative systematic uncertainty, equal to the statistical uncertainty of the observed number of events in the VR, is assigned to the total expected number of background events in the SR. This amounts to a 21% uncertainty, or ±0.27 events. This 21% uncertainty also covers the sensitivity of the background estimate to possible signal contamination affecting the per-jet DV probabilities, as discussed above.
Additional studies were performed by considering the subset of VR events that contain a jet satisfying the $f_{\mathrm{PV}}^{\max}$ and/or $f_{\mathrm{ch}}$ requirements described in Section 5. This requirement reduces the statistical power of the VR, but makes it more representative of the SR. With the addition of this requirement, the per-jet DV probabilities are compatible with those derived in the control region, and the predicted number of events in the VR is in good agreement with the observed data.

Signal systematic uncertainties
While the analysis is dominated by the statistical uncertainty of the observed data yield, several sources of experimental and theoretical uncertainty are considered that could potentially affect the signal efficiency.
Experimental uncertainties originating from differences between data and simulation are considered for all physics objects used in the analysis. For standard objects, these include uncertainties in the reconstruction and identification efficiencies, as well as the energy calibrations. Efficiency scale factors are used to correct the modelling of electrons and muons in simulation relative to data. These comprise electron identification and muon reconstruction scale factors as well as the scale factors applied to correct for trigger efficiency differences between data and simulation [37, 38, 75, 77]. The associated uncertainties are computed by varying the scale factors up and down by one standard deviation and evaluating the impact on the final expected number of signal events. For jets, jet energy scale (JES) and jet energy resolution (JER) uncertainties are considered [91]. The applicability of the standard jet calibration scheme to the displaced jets considered in this analysis has been studied and found to be satisfactory [92]. Simulated events are reweighted such that the distribution of the average number of interactions per bunch crossing matches the distribution measured in data. Since the number of interactions per bunch crossing depends on the inelastic cross section, the difference between the predicted and measured cross sections [93] is propagated to the simulation by a systematic variation of the reweighted distribution.
The dominant systematic uncertainty in the expected signal yield originates from the non-standard reconstruction methods used. The uncertainty in the reconstruction of displaced vertices comes from the difference in performance of both the standard and LRT algorithms between data and simulation. For standard tracking, this uncertainty is known to be around 2% for tracks with small displacements from the IP [94]. To assess the uncertainty in the vertex reconstruction efficiency due to the modelling of the LRT algorithm, the rates of displaced vertices consistent with $K_S^0 \to \pi^+\pi^-$ decays are compared between data and $Z$+jets simulation. The uncertainty is estimated by examining the difference in $K_S^0$ yield between data and simulation. From the preselected events, candidate $K_S^0$ vertices are identified by requiring that the vertices pass the vertex preselection, have exactly two tracks of opposite charge, and have an invariant mass in the range 450-550 MeV. The simulation is normalized such that the number of $K_S^0$ vertices inside the beam pipe ($r < 23.5$ mm) is the same in data and simulation. This accounts for any differences in the $K_S^0$ yield between data and simulation in a region where the tracking uncertainties are well understood. The $p_T$ distributions of tracks from the $K_S^0$ vertices were found to be consistent between data and simulation. The $K_S^0$ vertex yields are then compared between data and the normalized simulation in five bins of $r$ ranging from zero to the radius of the last pixel layer (122 mm), beyond which there are significantly fewer vertices. The ratio of yields quantifies the discrepancy between data and simulation in the number of $K_S^0$ vertices reconstructed from tracks outside of the beam pipe. To obtain a per-track uncertainty, the square root of this ratio is taken, because $K_S^0$ vertices are two-track vertices. This uncertainty is then added in quadrature with the 2% uncertainty from the standard tracking pass.
To propagate this uncertainty to the signal yield, tracks are randomly removed from reconstructed vertices in an uncorrelated fashion, with a probability given by the per-track uncertainty corresponding to the radial position of the vertex. The tracks are removed following the same track-pruning procedure as described in Section 4.1, and the vertex properties are recalculated from the updated set of associated tracks. The yield of vertices passing the full signal vertex selection is then compared between the modified and nominal vertex collections as a function of $r$, and the ratio of yields provides a scale factor corresponding to the reconstruction efficiency of each LLP. A per-event efficiency correction is obtained from the product of the scale factors corresponding to each LLP. The difference between the efficiency computed with these scale factors applied and the nominal selection efficiency is taken as the systematic uncertainty. This uncertainty is largest for small values of $m_a$ and increases with the proper lifetime, up to a maximum value of 12%.
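The track-removal procedure can be sketched as follows; the pass_selection and per_track_unc_vs_r inputs are hypothetical placeholders for the full DV selection and the measured per-track uncertainty versus radius, and the sketch is not the ATLAS implementation.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def drop_tracks(vertex_tracks, vertex_r, per_track_unc_vs_r):
    """Randomly remove tracks from a DV, each with a probability given by the
    per-track tracking uncertainty evaluated at the vertex radius."""
    p_remove = per_track_unc_vs_r(vertex_r)
    keep = rng.random(len(vertex_tracks)) > p_remove
    return [trk for trk, kept in zip(vertex_tracks, keep) if kept]

def llp_scale_factor(vertices, pass_selection, per_track_unc_vs_r):
    """Ratio of DVs passing the full vertex selection after and before the random
    track removal, for a set of simulated LLP decay vertices given as (tracks, radius)."""
    n_nominal = sum(pass_selection(tracks, r) for tracks, r in vertices)
    n_modified = sum(pass_selection(drop_tracks(tracks, r, per_track_unc_vs_r), r)
                     for tracks, r in vertices)
    return n_modified / n_nominal if n_nominal else 1.0

# The per-event correction is the product of the scale factors of the LLPs in the event;
# the difference between the corrected and nominal signal efficiencies is taken as the
# systematic uncertainty.
```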
Uncertainties associated with the displaced-jet filter are also considered. The dominant contribution originates from the use of uncalibrated jets in the filter. This uncertainty is estimated by raising the jet $p_T$ threshold in the filter by 5 GeV, from 20 GeV to 25 GeV, and calculating the change in filter efficiency for the simulated signal samples. The choice of 5 GeV is motivated by the observed difference between data and simulation in the efficiency of the uncalibrated jet $p_T$ selection, studied as a function of the calibrated jet $p_T$.
A theoretical uncertainty of ±4% and ±25% is assumed for the total $q\bar{q} \to ZH$ and $gg \to ZH$ cross sections, respectively [50]. These uncertainties include the effects of varying the factorization and renormalization scales, the PDFs, and $\alpha_s$. To determine the uncertainty due to the choice of parton-shower model, the nominal Pythia 8 samples are compared with a reference Herwig 7 sample of $ZH$ production. The $p_T$ of the $ZH$ system is compared between the two samples and used to reweight events in the Pythia signal samples to the reference Herwig sample. The change in acceptance after reweighting is found to be of the order of 0.5% and is included as an additional systematic uncertainty in the signal modelling.
Finally, the expected number of signal events is subject to the uncertainty in the total integrated luminosity of the data sample used. The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [95], obtained using the LUCID-2 detector [96] for the primary luminosity measurements.
A summary of the systematic uncertainties is given in Table 2. With the exception of the uncertainties from the LRT algorithm and the displaced-jet filter, the systematic uncertainties are not observed to vary significantly with $m_a$ or $c\tau$, so the quoted values are computed from the weighted average across the signal samples. The uncertainty from the filter efficiency was found to depend on only one of the two signal parameters, so the quoted range corresponds to the observed variation with that parameter after averaging over the other. The uncertainty from the LRT algorithm was found to be correlated with both $m_a$ and $c\tau$, so the quoted range corresponds to the observed variation with both parameters (with the uncertainty increasing most strongly with the proper lifetime).

Results
The mean number of background events predicted in the SR is 1.30 ± 0.08 (stat.) ± 0.27 (syst.). Zero events are observed in the signal region, compatible with the background prediction at the 68% confidence level (CL), as shown in Figure 5. Upper limits at the 95% CL are set on the branching ratio of the Higgs boson to pairs of long-lived pseudoscalars, $\mathcal{B}(H \to aa \to b\bar{b}b\bar{b})$, for each signal mass hypothesis, following the CL$_s$ prescription [97] with a profile-likelihood-ratio test statistic. A Poisson probability term describing the total number of observed events is used, with the systematic uncertainties of the signal and background yields treated as nuisance parameters and assigned Gaussian constraints with the appropriate widths, as described in Section 7. Pseudo-experiments which sample the distribution of the profile likelihood ratio are generated to compute the $p$-value and derive the exclusion limits. The observed limits for all signal mass points as a function of $c\tau$ are shown in Figure 6.
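For illustration only, a strongly simplified single-bin counting version of such a limit calculation is sketched below. It marginalizes the nuisance parameters with Gaussian priors rather than profiling them, so it is not the procedure used in this analysis, and the yield_per_br input and example numbers are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

def cls_upper_limit(n_obs, b, b_err, eff, eff_err, yield_per_br, n_toys=20000):
    """Simplified CLs upper limit on a branching ratio from a single-bin counting
    experiment, marginalizing nuisance parameters with Gaussian priors.

    n_obs        : observed number of events
    b, b_err     : expected background yield and its total uncertainty
    eff, eff_err : signal efficiency times acceptance and its uncertainty
    yield_per_br : expected signal yield per unit branching ratio at 100% efficiency
    """
    def cls(br):
        s = br * yield_per_br
        b_toys = np.clip(rng.normal(b, b_err, n_toys), 0.0, None)
        e_toys = np.clip(rng.normal(eff, eff_err, n_toys), 0.0, None)
        n_sb = rng.poisson(b_toys + s * e_toys)   # toys under signal + background
        n_b = rng.poisson(b_toys)                 # toys under background only
        # One-sided p-values with the observed count as the test statistic
        p_sb = np.mean(n_sb <= n_obs)
        p_b = np.mean(n_b <= n_obs)
        return p_sb / p_b if p_b > 0 else 1.0

    # Scan the branching ratio upwards until CLs drops below 0.05 (95% CL exclusion)
    for br in np.linspace(0.001, 0.5, 500):
        if cls(br) < 0.05:
            return br
    return None

# Example with placeholder inputs (not the values of this analysis)
limit = cls_upper_limit(n_obs=0, b=1.0, b_err=0.3, eff=0.004, eff_err=0.001, yield_per_br=1000.0)
```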
The limits are strongest for $c\tau \approx$ 10-20 mm, probing values of $\mathcal{B}(H \to aa \to b\bar{b}b\bar{b})$ below ∼4%, and become weaker for shorter or longer lifetimes, primarily due to reduced radial acceptance. The limits depend only weakly on the $a$-boson mass, owing to the relative insensitivity of the reduced-mass observable to $m_a$. For $m_a$ = 16 (55) GeV, a branching ratio greater than 10% is excluded at the 95% CL for $3.7 < c\tau < 37$ mm ($5.4 < c\tau < 102$ mm), probing an important gap in the ATLAS exotic Higgs decay programme. Decays of the Higgs boson into LLPs with shorter lifetimes are probed by conventional searches using standard $b$-tagging algorithms to identify signal events [20,21], while decays into LLPs with longer lifetimes are probed by calorimeter- and muon-spectrometer-based searches utilizing dedicated LLP triggers [25][26][27].

Conclusion
A search for exotic decays of the Higgs boson into pairs of long-lived pseudoscalar bosons ($a$) is presented, using 139 fb$^{-1}$ of $\sqrt{s}$ = 13 TeV proton-proton collision data collected between 2015 and 2018 with the ATLAS experiment at the LHC. The search is performed in the $ZH$ associated production topology. Displaced vertices with high mass and track multiplicity relative to those from SM processes are exploited as a signature of $a$-boson decays and as a means to reject backgrounds. A data-driven method is used to estimate the expected number of background events satisfying the event selection. The number of observed events is consistent with this prediction, and upper limits on the branching ratio $\mathcal{B}(H \to aa \to b\bar{b}b\bar{b})$ are calculated at the 95% CL. For $a$-boson masses below 40 GeV, these are the most stringent limits to date in the probed lifetime regime.