Emerging jet probes of strongly interacting dark sectors

A strongly interacting dark sector can give rise to a class of signatures dubbed dark showers, where, in analogy to the strong sector of the Standard Model, the dark sector undergoes its own showering and hadronization before decaying into Standard Model final states. When the typical decay lengths of the dark sector mesons are larger than a few millimeters (and no larger than a few meters), they give rise to the striking signature of emerging jets, characterized by a large multiplicity of displaced vertices. In this article we consider the general reinterpretation of the CMS search for emerging jets plus prompt jets into arbitrary new physics scenarios giving rise to emerging jets. More concretely, we consider the cases where the SM Higgs mediates between the dark sector and the SM, for several benchmark decay scenarios. Our procedure is validated employing the same model as the CMS emerging jet search. We find that emerging jets can be the leading probe in regions of parameter space, in particular when considering the so-called gluon-portal and dark photon-portal decay benchmarks. With the current 16.1 fb$^{-1}$ of luminosity this search can exclude down to ${\cal O} (20) \% $ exotic branching ratio of the SM Higgs, but a naive extrapolation to the 139 fb$^{-1}$ luminosity employed in the current model-independent, indirect bound of 16% would probe exotic branching ratios into dark quarks down to below 10%. Further extrapolating these results to the HL-LHC, we find that one can probe exotic branching ratio values of 1%, which is below the HL-LHC expectations of 2.5$-$4%. We make our recasting code publicly available, as part of the LLP Recasting Repository.


Introduction
Extensions of the Standard Model with a strongly coupled, non-Abelian dark sector have received considerable attention in recent years. The phenomenological possibilities are varied, due to the large number of parameters, such as the gauge group dimension (number of dark colors), the matter field content (number of dark flavors), the mass hierarchies in the dark sector, and the coupling strengths between the dark sector and the Standard Model, as well as the internal dark sector couplings. Considering a collider operating at a center-of-mass energy $\sqrt{s}$, the subclass of models which confine at a scale $\Lambda_D$, and where the dark sector masses satisfy $m_D \lesssim \Lambda_D \ll \sqrt{s}$, gives rise to a class of signatures generically dubbed dark showers, in analogy to the familiar parton shower in the strong sector of the Standard Model; see [1] for a comprehensive review of the current phenomenological, experimental and theoretical status. Dark showers give rise to uncommon, exotic collider objects, such as trackless jets [2], emerging jets [3], semi-visible jets [4], dark jets [5] and SUEPs [6]. An experimental program aiming at dark shower signatures at the LHC is already underway [7][8][9][10].
It is customary [11] to dissect the collider phenomenology of dark showers into three pieces: i) production, ii) showering (which actually includes hadronization) and iii) decay. The production part consists of the parton-level production of dark quarks (customarily through $2 \to 2$ processes), while the showering phase includes both the emission of dark gluons and dark quarks and the formation of bound states (dark hadrons), akin to the known behaviour of the strong sector of the Standard Model. Indeed, the showering process yields a distinctive signal, as, unlike in normal searches for new phenomena, one is not targeting one or two new particles, but in principle many of them. Hence, the large multiplicity inherent to the dark parton showering is what makes dark showers stand out from other Beyond Standard Model (BSM) searches. The decay modes of these dark hadrons (into SM particles such as leptons, quarks or gluons, or potentially no decay at all, in which case they contribute to the overall dark matter abundance), combined with the lifetime spectra of these dark hadrons, span a large number of phenomenologically distinct scenarios. Decay benchmarks to guide the experimental exploration have been recently put forward in [12,13].
If the dark sector contains some particles that are long-lived (another research direction that has received considerable attention in the recent past, see e.g. [11,13,14] for reviews), with decay lengths ranging from a few millimeters to a few meters, and which decay into SM particles, they give rise to emerging jets (EJ). In this work we present a simple and flexible reinterpretation of the CMS search for emerging jets [7], making use of all publicly available information. We validate our procedure by reproducing the CMS results for their benchmark model, proposed in [3]. The software developed for the reinterpretation procedure has been uploaded to the LLP Recasting Repository [15], which we expect to be useful for those interested in a straightforward reinterpretation of this study. We later apply our procedure to the concrete case of exotic Higgs decays, namely we obtain bounds on the branching ratio of the SM-like 125 GeV Higgs boson into a pair of dark quarks. We also show that the bounds arising from the reinterpretation of the emerging jet search, albeit not designed to target this scenario, can nonetheless provide leading constraints in large portions of the parameter space.
This article is organized as follows. In section 2 we briefly review the phenomenology of emerging jets and the t-channel models from [3] which give rise to them. In section 3, we show our validation of the CMS emerging jet search. In section 4 we apply our recasting procedure to a series of BSM decay benchmarks devised in [12]. We reserve section 5 for conclusions. Technical details about the validation of the CMS emerging jet search are described in Appendix A.

Emerging jet phenomenology
When considering a strongly interacting dark sector, the known behaviour of QCD provides guidance for the relevant phenomenological features of the model. A QCD-like sector with gauge group $SU(N_{C_D})$ and $N_{f_D}$ degenerate Dirac fermions (dark quarks, $q_D$) would then exhibit asymptotic freedom if $N_{f_D} < 4 N_{C_D}$, and confine at a scale $\Lambda_D$, where $m_{q_D} \lesssim \Lambda_D$. When these particles are produced at a collider with a center-of-mass energy $\sqrt{s} \gg \Lambda_D$, the dark quarks hadronize into dark hadrons ($\pi_D$, $\rho_D$, $\omega_D$, ...) which are clustered into collimated dark jets. The resulting signatures will then depend on the dark hadron decay, more specifically on the lifetime spectrum $c\tau_D$, but the main underlying theme is that the shower process can lead to a large multiplicity of objects, while in traditional searches only one or two new particles are targeted.
Following [11], the dark shower can be decomposed into the three parts described above: production, hadronization (in the dark sector) and dark hadron decays. For the production at a collider, it is necessary to connect the Standard Model to the dark sector, which is done through a portal. For emerging jets, the focus of this work, the model proposed in [3,19] employs a bi-fundamental scalar $X$ (namely, charged under both QCD and the dark sector) which interacts with a SM down-type quark and a dark quark via the following Lagrangian
$$\mathcal{L} \supset -\kappa_{ij}\, \bar{q}_{R\,i}\, q_{D\,j}\, X . \qquad (2.1)$$
While in principle $\kappa$ is a $3 \times N_{f_D}$ matrix, we consider here the case where there is one single universal coupling to the right-handed down-type quark, to avoid bounds from flavour physics (FCNCs, neutral meson mixing, rare decays). In this model, $X$ is pair produced with a sizable rate through gauge interactions, and the subsequent $X \to d_R\, q_D$ decay (which happens with 100 % branching ratio) leads us to expect two light jets and two emerging jets. Since the mediator particle $X$ appears in t-channel exchanges, the model of equation 2.1 is colloquially referred to as a t-channel model.
In this paper we will also study the production via an s-channel SM Higgs boson $h$, which falls in the category of exotic Higgs decays [20,21]. Measurements of the Standard Model Higgs properties set ${\rm Br}(h \to {\rm exotic}) < 0.16$ at 95 % C.L. from both ATLAS [22] and CMS [23], with a total integrated luminosity of 139 and 138 fb$^{-1}$ respectively. These bounds are not completely model independent, as they assume that the Higgs boson couples to the electroweak gauge bosons with a strength equal to or less than the SM one, which can be violated in certain BSM scenarios. The combination of ATLAS and CMS with 3000 fb$^{-1}$ is expected to set a limit of 2.5 %, under the assumption that the current systematic uncertainties would be halved [24], or 4 % with the current systematic uncertainties [25]. This leaves ample room for the $h \to q_D \bar{q}_D$ decay to occur with sizable rates. We note that the identification of a resonant, long-lived $q_D \bar{q}_D$ topology using Machine Learning techniques has been recently studied in reference [26] for the SM Higgs and in reference [27] for a TeV-scale $Z'$.
The showering and hadronization in the dark sector are conducted through the Hidden Valley module [28,29] within Pythia8 [30]. The non-perturbative nature of dark QCD-like theories prevents one from consistently connecting UV and IR parameters based on a perturbative approach, and hence it is customary to consider a dark sector consisting of spin-1 dark vector mesons, $\rho_D$, and of spin-0 pseudoscalar dark pions, $\pi_D$. As the $\pi_D$ arise from the breaking of a chiral dark symmetry, they are parametrically lighter than the other mesons in the theory, which decay into dark pions if kinematically allowed. Hence, the phenomenology is dictated by the dark pion properties, in particular their lifetime $c\tau_{\pi_D}$. We distinguish three possible regimes depending on the dark pion lifetime. If the dark pions decay promptly ($c\tau_{\pi_D} \lesssim 1$ mm), they end up giving multi-jet signals. If, on the contrary, the dark pions are stable in the detector ($c\tau_{\pi_D} \gtrsim 1$ m), then they appear as missing energy, and can be targeted by the suite of missing energy signatures that are customarily searched for in the dark matter program at the LHC [31]. For $c\tau_{\pi_D} \in [0.001, 1]$ m, the dark pions decay inside the detector volume with different decay lengths, depending on their boost and on the fact that their actual decay position is sampled from an exponential distribution.
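As a rough illustration of the last point, the lab-frame decay length of a single dark pion can be sketched as an exponential draw whose mean is the boost factor times the proper decay length. This is a minimal sketch and not the simulation chain used in this work: the function names are hypothetical, and the sharp regime boundaries simply reuse the approximate 1 mm and 1 m values quoted above.

```python
import random


def sample_decay_length_mm(ctau_mm, momentum_gev, mass_gev, rng):
    """Draw one lab-frame decay length (in mm) for a particle of given momentum and mass.

    The mean lab-frame decay length is beta*gamma*ctau, with beta*gamma = p/m.
    """
    beta_gamma = momentum_gev / mass_gev
    mean_length_mm = beta_gamma * ctau_mm
    # random.expovariate takes the rate (1/mean) of the exponential distribution.
    return rng.expovariate(1.0 / mean_length_mm)


def lifetime_regime(ctau_mm):
    """Classify a proper decay length into the three regimes described in the text."""
    if ctau_mm < 1.0:
        return "prompt"
    if ctau_mm > 1000.0:
        return "detector-stable"
    return "displaced"


if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative values only: a 5 GeV dark pion with 20 GeV momentum and ctau = 25 mm.
    print(lifetime_regime(25.0), sample_decay_length_mm(25.0, 20.0, 5.0, rng))
```

The same dark pion mass and proper lifetime therefore populate a broad range of decay positions, which is why the tracking efficiency as a function of displacement matters later in the validation.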
The decay patterns of the dark pions can be quite varied. As mentioned before, in reference [3] a 100 % decay rate to right-handed down quarks was assumed, to avoid dealing with non-trivial bounds from flavour processes. Yet the possibilities for the decay are quite numerous, and in that light reference [12] proposed five decay benchmark models, dubbed decay portals, based on a minimal set of theoretical priors. These decay portals describe how the pseudoscalar and vector dark mesons decay into Standard Model particles. If $\pi_D$ decays into gluons (photons) through a dimension-5 operator we have the gluon (photon) portal. If $\pi_D$ instead couples to the Standard Model Higgs via the $H^\dagger H$ operator, then we have the Higgs portal, where the decays to the SM quarks follow the Yukawa hierarchy of a SM-like Higgs with $m_H = m_{\pi_D}$. The other two decay portals have the $\pi_D$ decaying either through its mixing with the photon (akin to the $\gamma$-$\rho$ mixing in the Standard Model), or through the chiral anomaly into a pair of dark photons $A'$, inspired by the $\pi^0 \to \gamma\gamma$ SM process. The corresponding Pythia configuration cards for each of these portals can be generated through the public python script [32].
The number of free parameters in both our scenarios is still quite large, and here we follow some additional choices made in the literature. Regarding the t-channel EJ model, $N_{C_D} = 3$ and $N_{f_D} = 7$ are inspired by the study of [33], and the dark sector mass parameters are chosen in the proportion $\Lambda_D : m_{q_D} : m_{\rho_D} : m_{\pi_D} = 2 : 2 : 4 : 1$. This choice ensures that the vector meson always decays into two dark pions, and was followed by the CMS collaboration in their emerging jets search [7]. The free parameters for the analysis are then $m_X$, $m_{\pi_D}$ and $c\tau_{\pi_D}$. Regarding the s-channel SM Higgs production with its different decay portals, we assume $N_{C_D} = 3$ and $N_{f_D} = 1$, and $\Lambda_D : m_{q_D} : m_{\rho_D} : m_{\pi_D}$ is now $2.5 : 0.4 : 2.5 : 1$. Since the mediator mass is known, the only free parameters of the model are $m_{\pi_D}$, $c\tau_{\pi_D}$ and the exotic branching ratio, $h \to q_D \bar{q}_D$. We note that while in reference [12] a minimum proper lifetime as a function of the mass for each decay portal was estimated from theoretical considerations, we prefer to remove those prejudices and consider the three parameters as fully independent and uncorrelated.
A few details about our simulations are in order. Within the Pythia Hidden Valley module, we set the parameter HiddenValley:pTminFSR to $1.1\,\Lambda_D$, and the probVector flag is set to 0.318, following the considerations discussed in Appendix A of [12]. In section 3 we employ Pythia version 8.212, used in the CMS study, as our aim is to reproduce the published limits. For section 4 we employ instead Pythia 8.307, because (as explained in [1]) this version has corrected a previous flaw in the code that tended to overproduce hidden hadrons at very low $p_T$.
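The parameter choices above can be collected in a small helper. The following is a minimal sketch, assuming that the only inputs are the dark pion mass and the quoted mass ratios; the function names are hypothetical, and only the two Hidden Valley settings explicitly mentioned in the text are written out (a complete Pythia card needs many more settings, which are not reproduced here).

```python
def dark_spectrum(m_pi_gev, ratios=(2.5, 0.4, 2.5, 1.0)):
    """Return the dark sector scales in GeV from the dark pion mass.

    `ratios` is Lambda_D : m_qD : m_rhoD : m_piD. The default is the s-channel
    Higgs choice quoted in the text; use (2, 2, 4, 1) for the t-channel model.
    """
    r_lambda, r_quark, r_rho, r_pi = ratios
    scale = m_pi_gev / r_pi
    return {
        "Lambda_D": r_lambda * scale,
        "m_qD": r_quark * scale,
        "m_rhoD": r_rho * scale,
        "m_piD": m_pi_gev,
    }


def hidden_valley_lines(lambda_d_gev, prob_vector=0.318):
    """Format the two Hidden Valley settings discussed in the text as card lines."""
    return [
        f"HiddenValley:pTminFSR = {1.1 * lambda_d_gev:.4g}",
        f"HiddenValley:probVector = {prob_vector}",
    ]


if __name__ == "__main__":
    spectrum = dark_spectrum(5.0)
    print(spectrum)
    print("\n".join(hidden_valley_lines(spectrum["Lambda_D"])))
```

For a 5 GeV dark pion in the s-channel setup this gives $\Lambda_D = 12.5$ GeV, and the vector meson mass of 12.5 GeV lies above twice the pion mass, consistent with the $\rho_D \to \pi_D \pi_D$ channel being open.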

Validation of the CMS emerging jet search
In this section we discuss in detail the validation of the CMS search for emerging jets using a total integrated luminosity of 16.1 fb$^{-1}$ [7].
The CMS collaboration targets the dark QCD model from equation 2.1, via $pp \to XX$ followed by $X \to q\, q_D$. Hence naively one expects to find two emerging jets and two SM jets. Events are selected by passing the $H_T > 900$ GeV trigger, where $H_T$ is the scalar sum of the transverse momenta of all hadronic jets in the event, clustered with $R = 0.4$ using the anti-$k_T$ algorithm [35] applied to all tracks with $p_T > 1$ GeV. These events are required to have at least four jets within $|\eta| < 2.0$, and they undergo a further selection using kinematical variables to tag these jets as "emerging" and to define signal regions (called sets in the CMS paper). The explicit requirements are collected in Appendix A, together with the 95 % C.L. limit on the number of signal events in each selection set, $S^{i}_{95}$. The total number of signal events in each set can be computed as
$$N^{i}_{S} = L \times \sigma(pp \to XX) \times {\rm BR}(X \to q\, q_D)^{2} \times A_i(m_X, m_{\pi_D}, c\tau_{\pi_D}) , \qquad (3.1)$$
where $L$ is the total integrated luminosity, $A_i$ is the acceptance for the $i$-th selection set, and the production rate $\sigma(pp \to XX \to q q_D\, q q_D)$ has been decomposed into the pair production cross section for $X$ pairs (which proceeds through gauge couplings and hence is independent of $m_{\pi_D}$ and $c\tau_{\pi_D}$) times the branching ratio of $X \to q\, q_D$ for each of the two mediators, which we set to unity throughout this work. To benchmark the search, CMS considered the following parameters:
• $m_X$ [GeV] = {400, 600, 800, 1000, 1250, 1500, 2000}.
For the $m_{\pi_D} = 5$ GeV case, CMS has provided the acceptances $A_i$ in the $c\tau_{\pi_D}$-$m_X$ plane, indicating which of the seven selection sets is the most sensitive one. In other words, for each $c\tau_{\pi_D}$-$m_X$ point scanned, with $m_{\pi_D} = 5$ GeV, only one of the possible seven $A_i$ functions is given.
To validate the search, we proceed in three steps. First, we check that using the provided $A_i$ efficiencies we can reproduce the published 95 % C.L. exclusion limits. In a second step, we check the degree of accuracy that we obtain for the two published kinematic distributions for the emerging jet tagging variables, and finally we show that we can reproduce with reasonable accuracy the said efficiencies and exclusion limits. The last step is crucial for our analysis, since this is what allows for reinterpretation, i.e. deriving limits from an experimental search on a model that has not been targeted by the experimental collaboration. In what follows, we define "exclusion" by requiring that the ratio of our predicted number of events from equation 3.1 over the excluded one is equal to unity, which is a common practice when performing reinterpretations [36].
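The counting behind this exclusion criterion (luminosity times cross section times branching ratio times acceptance, compared with the 95 % C.L. limit on the number of signal events) can be sketched as follows. This is a minimal sketch with hypothetical function names and placeholder numbers; treating the branching ratio as entering once per mediator is an assumption, which is irrelevant when it is set to unity.

```python
def signal_yield(lumi_fb, xsec_fb, acceptance, branching_ratio=1.0):
    """Expected signal events in one selection set for pair-produced mediators.

    The branching ratio enters once per mediator (assumption), hence squared.
    """
    return lumi_fb * xsec_fb * branching_ratio ** 2 * acceptance


def exclusion_ratio(n_signal, s95):
    """Ratio of predicted signal events over the 95 % C.L. excluded number."""
    return n_signal / s95


def is_excluded(n_signal, s95):
    """A parameter point counts as excluded once the ratio reaches unity."""
    return exclusion_ratio(n_signal, s95) >= 1.0


if __name__ == "__main__":
    # Placeholder inputs, not values from the CMS search.
    n_sig = signal_yield(lumi_fb=16.1, xsec_fb=10.0, acceptance=0.1)
    print(n_sig, is_excluded(n_sig, s95=12.0))
```

The exclusion contour in a given plane is then the set of points where the ratio crosses one.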

Exclusion using published efficiencies
We start by comparing the published exclusion limit with those that can be derived from using the $A_i$ map from [7]. We present our results in figure 1, where the published CMS exclusion is shown as solid black. For our results, we need to provide a production cross section for the $pp \to XX$ process. First, we use the cross section reported during the run of Pythia 8, which is a leading-order (LO) result in the strong QCD coupling $\alpha_S$, shown in green. Second, we employ the cross sections used by CMS from [37], corresponding to down-type squark pair production, computed at next-to-leading order (NLO) in perturbation theory and including a next-to-leading logarithmic correction from soft gluon resummation [38], which is displayed in blue. We conclude that the provided $A_i$ values are self-consistent, and we also verify that the limits were derived using NLO cross sections.

Kinematic distributions
As a second step of our validation, we will employ the published kinematic distributions.
To tag the jets passing the selection as emerging, the following track-based variables [39] are considered:
• $\langle IP_{2D} \rangle$: the median of the unsigned transverse impact parameter.
• $PU_{dz}$: the distance between the $z$ position of the primary vertex (PV), $z_{PV}$, and the $z$ position of the track at its closest approach to the PV.
• $D_N$: the 3-D distance between the track and the primary vertex, weighted by the inverse resolution.
• $\alpha_{3D}$: the scalar $p_T$ sum of all tracks with $D_N$ < (certain value), normalized by the scalar $p_T$ sum of all tracks, hence $0 \leq \alpha_{3D} \leq 1$.
Figure 2: A SM jet and a dark pion $\pi_D$ originate from $(r, z) = (0, 0)$; the latter decays at $(r_{tr}, z_{tr})$ into three tracks. We illustrate here how the $\langle IP_{2D} \rangle$ variable is obtained from the transverse impact parameter of the individual tracks. See main text for details.
We illustrate the $\langle IP_{2D} \rangle$ variable in figure 2. From the primary vertex located at $(z, r) = (0, 0)$ we consider that only a SM jet (which tags the vertex) and a long-lived dark pion emerge. The dark pion decays at $(z_{tr}, r_{tr})$ into three tracks 1, 2, 3, which for illustration purposes we consider as giving rise to only one jet. The track trajectories meet at the decay vertex: the tracks are drawn in black, and their prolongations in grey. We indicate with $d_i$ the closest distance between the $i$-th track and the primary vertex, hence its $r$ ($z$) component gives the transverse (longitudinal) impact parameter. In the figure, the median of the $(d_i)_r$ corresponds to $(d_2)_r$, hence the jet originating from the dark pion has $\langle IP_{2D} \rangle = (d_2)_r$.
Of the above variables, CMS presents results for $\langle IP_{2D} \rangle$ and $\alpha_{3D}$, before the selection cuts (which we describe below). Regarding the additional two variables, $D_N$ should be small for tracks originating from prompt particles, and large for displaced tracks, while $PU_{dz}$ is used for pile-up rejection. We note that both $D_N$ and $PU_{dz}$ enter in these distributions through the definition of $\alpha_{3D}$, which depends on $D_N$. Since CMS has not made explicit the $D_N$ threshold employed in their figure 3, we have considered the three values employed to define the signal regions: 4, 10 and 20. Out of them, we have verified that the agreement is maximized for $D_N = 10$, and hence $D_N < 10$ has been employed in the $\alpha_{3D}$ calculation shown below. As these last two variables are defined at the track level instead of at the jet level, it is understandable that their kinematic distributions are not provided, although they would have offered an additional validation check for the proper reinterpretation of the search.
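The two jet-level variables discussed above can be sketched in a few lines. This is a minimal illustration, assuming each track is a dictionary with hypothetical keys `pt` (GeV), `ip2d_mm` (signed transverse impact parameter) and `dn` (the $D_N$ value, taken here as a precomputed input); it is not the code released with this work.

```python
from statistics import median


def median_ip2d(tracks):
    """<IP_2D>: median of the unsigned transverse impact parameters of the jet tracks."""
    return median(abs(track["ip2d_mm"]) for track in tracks)


def alpha_3d(tracks, dn_cut=10.0):
    """alpha_3D: scalar pT sum of tracks with D_N below the cut over the sum for all tracks."""
    total_pt = sum(track["pt"] for track in tracks)
    if total_pt == 0.0:
        return 0.0
    prompt_pt = sum(track["pt"] for track in tracks if track["dn"] < dn_cut)
    return prompt_pt / total_pt


if __name__ == "__main__":
    # Toy jet: one prompt-like track and two displaced ones (illustrative numbers).
    jet = [
        {"pt": 10.0, "ip2d_mm": 0.02, "dn": 2.0},
        {"pt": 5.0, "ip2d_mm": -3.0, "dn": 40.0},
        {"pt": 5.0, "ip2d_mm": 7.0, "dn": 90.0},
    ]
    print(median_ip2d(jet), alpha_3d(jet))
```

For the toy jet the median unsigned impact parameter is that of the second track, mirroring the role of $(d_2)_r$ in the figure, and half of the track $p_T$ is carried by tracks with $D_N < 10$.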
Two important effects ought to be included for a realistic attempt at the reproduction of the CMS results: the tracking reconstruction efficiency $\epsilon_{\rm trk}$ and the smearing of the impact parameters.
CMS reports the tracking efficiency dependence in terms of the $p_T$, $\eta$, and transverse vertex position ($r$) of the track [40]. From figure 8 of this article, one can see that for $p_T > 1$ GeV one can consider the tracking efficiency independent of $p_T$ and, to a lesser extent, of $\eta$. Regarding $r$, the efficiency diminishes with the displacement distance, as can be seen from figure 12 of [40], which is obtained from a $t\bar{t}$ sample at $\sqrt{s} = 7$ TeV. The figure shows the cumulative efficiency for each of the iterations (0-5) of the tracking algorithm. While this effect is less relevant for lifetimes of a few millimeters, it has an impact for the benchmark point with $c\tau_D = 25$ mm and for larger lifetimes.
Nonetheless, since the tracking efficiency is a very complicated function that can only be reliably obtained with access to the full detector simulation (and detector information), we will pursue four different parametrizations for $\epsilon_{\rm trk}(r)$:
• Use the reported value of Iteration 5 from figure 12 of [40]. [It$_5$]
• Use the reported value of Iteration 4 from figure 12 of [40]. [It$_4$]
• Consider that tracks with at least one hit in the inner detector are reconstructed with 100 % efficiency, and with 0 % if not. [R]
• Consider $\epsilon_{\rm trk}(r) = 1$, to illustrate the typical deviation obtained when no efficiency is considered. [A]
Regarding the impact parameter smearing, we note that for jets originating from SM quarks one can expect to have $\langle IP_{2D} \rangle = 0$, if the majority of the tracks of the light jet are prompt. However, an exact value of zero is not realistic once the transverse impact parameter has been smeared to account for reconstruction effects. While the smearing functions have a non-trivial dependence on the $\eta$ and $p_T$ of the corresponding track (through the resolution $\sigma_r$, which we have taken from figures 14a and 15a of [40]), the typical resolution is about 50 µm, and hence the $\langle IP_{2D} \rangle$ variable would peak around this value for SM light-quark jets.
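The two detector effects can be mimicked at truth level along the following lines. This is a minimal sketch under stated assumptions: the step-function radius `r_max_mm` is a placeholder (the value used in this work is not given in this text), the resolution is taken as a constant 50 µm instead of the $\eta$- and $p_T$-dependent curves of [40], and all function names are hypothetical.

```python
import random
import statistics


def efficiency_step(r_mm, r_max_mm=100.0):
    """Step-like efficiency: 1 for production radii below r_max_mm, 0 otherwise."""
    return 1.0 if r_mm < r_max_mm else 0.0


def efficiency_flat(r_mm):
    """Fully efficient tracking, independent of the production radius."""
    return 1.0


def keep_track(r_mm, efficiency, rng):
    """Accept or reject a track according to the efficiency at its production radius."""
    return rng.random() < efficiency(r_mm)


def smear_ip2d(ip2d_mm, sigma_mm, rng):
    """Gaussian smearing of the transverse impact parameter, returned unsigned."""
    return abs(rng.gauss(ip2d_mm, sigma_mm))


def median_smeared_ip2d(true_ip2d_mm, sigma_mm, rng):
    """Median unsigned impact parameter of a jet after smearing every track."""
    return statistics.median(smear_ip2d(ip, sigma_mm, rng) for ip in true_ip2d_mm)


if __name__ == "__main__":
    rng = random.Random(7)
    prompt_jet = [0.0] * 15  # all tracks exactly prompt at truth level
    print(median_smeared_ip2d(prompt_jet, sigma_mm=0.05, rng=rng))
```

Even for a jet whose tracks are all exactly prompt, the smeared median is a few tens of microns rather than zero, which is the qualitative effect described above.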
We show our results for the $\langle IP_{2D} \rangle$ and $\alpha_{3D}$ variables in figure 3, where we present the results for A as dashed red, and for It$_5$, It$_4$ and R in solid blue, orange and green. From the left panel we see that the naive A approach does not describe the distribution as well as any of the other criteria, while the three other curves fit the signal distribution with reasonable accuracy. Moreover, we also see that the proper inclusion of the transverse impact parameter smearing is necessary to explain the distribution of $\langle IP_{2D} \rangle$ for the QCD jets from the signal, which is displayed as dashed purple.
From the right panel we see that the $\alpha_{3D}$ distribution for the signal does not change much with the different criteria It$_5$, It$_4$, R and A. Since on this variable one only applies a $\alpha_{3D} < 0.25$ cut (see Appendix A), our attention is only on the proper reproduction of the first bins, and the mismatch at the tails is not relevant for us. Hence we delay the final judgement of which parametrization of the tracking efficiency to use to the next step of our validation: to reproduce the published $A_i$ efficiencies.
Figure 3 (caption, beginning not recovered): [...] mm, yet a similar level of agreement between the CMS data and our simulation was found for all benchmark points from [7]. We display results for the tracking efficiency parametrizations It$_5$ (solid blue), It$_4$ (solid orange), R (solid green) and A (dashed red). In the left panel, we include as well the results for the QCD jets coming from the $X$ decays in dashed purple.
The analysis defined eight different jet identification criteria on the four relevant variables to consider a jet as emerging. These criteria are supplemented by the requirement to have a minimum of two EJs, or one EJ with large missing transverse energy (MET), and by additional cuts on $H_T$ and on the $p_T$ of the four hardest jets. The combination of the EJ criteria and the additional cuts defines seven selection sets. The explicit requirements are collected in Appendix A, together with the 95 % C.L. limit on the number of signal events in each selection set, $S^{i}_{95}$.

Reproducing efficiencies and exclusion limits
If our interest were to perform a reinterpretation of the emerging jets results in the context of the same model used by the collaboration (or one with a similar topology), then we could employ the reported acceptances $A_i$ to derive the published limits, as we did in Section 3.1. We stress that our goal is to perform a flexible reinterpretation of this search, namely to employ it to derive limits on a model that the search has not considered.
Hence, what we need is to fully validate our pipeline to compute the acceptance of the selection sets for the benchmark model used in the CMS study. We show in figure 4 the ratio of our computed acceptances over the published CMS results in the left panels (the color bar indicates the $A_i$ value from CMS) and the obtained exclusion limits in the right panels, where we have employed the It$_5$ (upper row), It$_4$ (middle row) and R (lower row) parametrizations of the tracking efficiency. We can see that the best agreement is obtained with the R parametrization, while the other two tend to overestimate the efficiencies. With the R parametrization we have agreement up to 20-30 % for large masses, which degrades for lower masses and also for extreme lifetime values, where the overall acceptances are nonetheless at the per-mille level or lower. The R parametrization also gives an acceptable exclusion limit, and hence we decide to adopt it for the rest of the article. We note that with more examples provided by CMS (or simply by providing the efficiencies in all signal regions) one could attempt a more complex parametrization of the efficiency.
We hence consider this search as validated, and will proceed in the next section to derive bounds on the parameter space of exotic Higgs decays. The analysis code that allows us to derive the exclusions has been uploaded to the LLP Recasting Repository [15], making it publicly available to facilitate the reinterpretation of the emerging jets search for arbitrary models. Further instructions and the relevant documentation to run the code can be found in the Repository.

Reinterpretation for Higgs mediated dark showers
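When the SM Higgs $h$ couples to the dark quarks, the expected number of signal events in the $i$-th selection set reads
$$N^{i}_{S} = L \times \sigma(pp \to h) \times {\rm BR}(h \to q_D \bar{q}_D) \times A_i(m_{\pi_D}, c\tau_{\pi_D}) ,$$
where now the only free physical parameters are the dark pion mass and its lifetime, and the exotic Higgs branching ratio into dark quarks.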
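Since the branching ratio multiplies the event count linearly, the smallest excluded branching ratio follows from dividing the 95 % C.L. limit on the number of signal events by the count expected for unit branching ratio. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the inputs are placeholders rather than values from the search, and taking the most sensitive selection set is one possible convention.

```python
def excluded_branching_ratio(s95, lumi_fb, xsec_fb, acceptance):
    """Smallest excluded branching ratio for one selection set.

    Returns infinity when the set has no acceptance, i.e. no sensitivity.
    """
    events_at_unit_br = lumi_fb * xsec_fb * acceptance
    if events_at_unit_br <= 0.0:
        return float("inf")
    return s95 / events_at_unit_br


def best_excluded_branching_ratio(sets, lumi_fb, xsec_fb):
    """Strongest limit over selection sets given as (s95, acceptance) pairs."""
    return min(
        excluded_branching_ratio(s95, lumi_fb, xsec_fb, acc) for s95, acc in sets
    )


if __name__ == "__main__":
    # Placeholder selection sets: (S95, acceptance).
    toy_sets = [(10.0, 1.0e-4), (20.0, 5.0e-4), (15.0, 0.0)]
    print(best_excluded_branching_ratio(toy_sets, lumi_fb=16.1, xsec_fb=5.0e4))
```

A returned value above one means that the search has no sensitivity to that parameter point, since a branching ratio cannot exceed unity.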
It is worth noting here that in resonant Higgs production the events would have a center-of-mass energy of the Higgs mass (approximately 125 GeV), while in the reinterpretation procedure from section 3 the process of pair production of $X$ has a lowest mass of about 400 GeV. In the region of low masses our efficiencies overestimate the CMS result by a factor of a few. While the efficiency in those regions is quite low ($\mathcal{O}(10^{-3})$) and its uncertainty would depend on the Monte Carlo statistics used in those benchmark points, it is also true that without any additional information in that region (e.g. kinematic distributions like those given for the $m_X = 1$ TeV benchmark) we cannot investigate the origin of this discrepancy. Having expressed our reservations about the accuracy of our sensitivity estimates, we proceed with our analysis, taking the results cum grano salis, and knowing that only a full-fledged experimental analysis can derive robust bounds.
To further define our framework, we need to select a decay portal for our dark mesons. We follow here the proposal of reference [12] and we consider the gluon, vector, Higgs and dark-photon portals. We have verified our implementation of these decay portal benchmarks by reproducing the dark meson multiplicities from reference [12]. We start by analyzing the acceptance $A_i$ as a function of the dark pion lifetime and mass, for the gluon decay portal, for all five Higgs production mechanisms considered, which we show in figure 5. It is worth mentioning here that we do expect the EJ search not to be optimal for Higgs decays into dark quarks, as it originally targets two quarks and two dark quarks, while the Higgs decay would only give two dark quarks at parton level. However, since we are keeping the $\rho_D \to \pi_D \pi_D$ channel open, and since there is additional radiation from the initial state gluons and from the decay portals themselves, we do still obtain acceptances in the $10^{-4}$ range, which can suffice to obtain an exclusion given that with 16.1 fb$^{-1}$, $\mathcal{O}(10^{6})$ Higgs bosons would be produced at the 13 TeV LHC via gluon fusion. We illustrate the difference in kinematics between both models in Appendix B, where we also explore in detail the impact of the event selection on the different Higgs production mechanisms. From the figure we can see that the dependence on $c\tau_{\pi_D}$ is non-trivial, with a maximum around 10 mm, while for $m_{\pi_D}$ the dependence is quite flat, except for the heavier masses of 20-30 GeV: those dark pions obtain a reduced boost from the Higgs compared to lighter ones. It is intriguing to see that, owing to the additional radiation, the ttH production has a higher acceptance, about an order of magnitude larger than gluon fusion, and about a factor of five larger than associated production with a vector boson. We note that vector boson fusion has the lowest acceptance, and this is due to the fact that the additional radiation in VBF goes in the forward direction, while the EJ analysis focuses on central jets. We stress that while we only show the gluon portal decay benchmark here, all the other portal decay models show an analogous behaviour.
Figure 4 (caption, beginning not recovered; axis label $m_X$ [GeV]): [...] In the right panel we show several exclusion limits: the CMS published one (solid black), the one obtained using the CMS $A_i$ efficiencies with NLO predictions from figure 1 (dashed green), as well as those derived using our $A_i$ with LO (dashed pink) and NLO (solid pink) cross sections. Both It$_4$ and It$_5$ tend to overestimate the acceptance (hence the exclusion limit), while the R criterion reproduces the exclusion curves better. The disagreement is larger for the regions of lower masses and/or large lifetime, where the $A_i$ are below the per-mille level.
The picture changes slightly once the production cross section for each mechanism is considered, which is shown in figure 6. Here we multiply the maximum acceptance by the production cross section for each mechanism and by the total luminosity of the emerging jet search (16.1 fb$^{-1}$). Hence the y-axis directly displays the expected number of events for each production mode. We have added here the overall number of events obtained by summing over all possible production modes, as a dashed-brown line. We now see that, owing to the larger cross section of the GF mechanism (two orders of magnitude over ttH, factors of 15-25 for the modes involving gauge bosons), its number of events, $A \sigma L$, is larger by an order of magnitude compared to the other modes. We also see that the impact of including all production modes instead of only gluon fusion amounts to about 20 % of the total number of events. In view of our findings we will focus from now on only on the dependence of our results on the lifetime for $m_{\pi_D} = 5$ GeV, and we will also include all Higgs production modes in our study.
Figure caption fragment (beginning not recovered): [...] Boson Fusion (VBF, dashed) and associated production with a $t\bar{t}$ pair (dashed with dots).
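The per-mode counting described for figure 6 can be sketched as follows. This is an illustration only: the cross sections are approximate 13 TeV reference values for a 125 GeV SM Higgs that are assumed here and not taken from this text, the acceptances in the example are placeholders, and the function name is hypothetical.

```python
# Approximate 13 TeV SM Higgs production cross sections in pb (assumed reference values).
XSEC_PB = {"GF": 48.6, "VBF": 3.78, "WH": 1.37, "ZH": 0.88, "ttH": 0.51}


def expected_events(acceptance, xsec_pb=XSEC_PB, lumi_fb=16.1):
    """Expected events per production mode, A * sigma * L, plus their sum.

    `acceptance` maps each production mode to its (maximum) acceptance.
    Cross sections are converted from pb to fb to match the luminosity units.
    """
    events = {
        mode: acceptance[mode] * xsec_pb[mode] * 1000.0 * lumi_fb for mode in xsec_pb
    }
    events["total"] = sum(events.values())
    return events


if __name__ == "__main__":
    # Placeholder acceptances with ttH an order of magnitude above GF, as in the text.
    toy_acceptance = {"GF": 1e-4, "VBF": 5e-5, "WH": 2e-4, "ZH": 2e-4, "ttH": 1e-3}
    for mode, n_events in expected_events(toy_acceptance).items():
        print(f"{mode:>5s}: {n_events:8.2f}")
```

With these toy inputs the larger acceptance of ttH does not compensate for its small cross section, so gluon fusion still dominates the total.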
We now study the sensitivity for the different decay portals considered in ref. [12]. To that end we present in figure 7 the efficiencies as a function of the dark pion lifetime, for $m_{\pi_D} = 5$ GeV. In order to obtain reliable estimates for these acceptances, we have simulated $10^{7}$ Monte Carlo events per parameter space point.
Of the possible decay portals, we then find that the sensitivity is larger (and similar) for the dark photon and gluon portals, and lower (and similar) for the vector and Higgs portals.Further details on the kinematics difference between the different portals can be found in Appendix B. We then will select in what follows the gluon (G) and Higgs (H) decay portals, as they correspond to the extreme values for the efficiencies for the four portal scenarios considered.These two decay portals correspond to the following operators In the gluon portal one expects a showered enriched with SM hadrons produced from the produced gluons, while in the Higgs portal the decays would follow a Yukawa-like structure, and one can expect a shower enriched with heavy flavour quarks.Using the acceptance from figure 7, we show in figure 8 the excluded exotic Higgs branching ratio as a function of the lifetime, for a dark pion mass of 5 GeV.The solid line is using the existing dataset from the EJ search (16.1 fb −1 ).For comparison we show the ATLAS limit of 0.21, which was obtained with a 8 times larger dataset (139 fb −1 ), shown in red dashed.For a fair comparison we rescale our EJ limit to this luminosity (dashed  are shown as a hatched grey (green) region for the Higgs (gluon) portal.The dashed lines correspond to 139 fb −1 , the luminosity used on the current model-independent exclusions on undetected (also called "BSM") Higgs branching ratios from ATLAS and CMS, while the dotted lines correspond to the projection to 3000 fb −1 , which we compare with the HL-LHC reach of the BSM Higgs branching ratios, shown with a red band that encapsulates the different assumptions on the systematic uncertainties (see main text for details).
large lifetimes (cτ ≳ 400 mm). For clarity we have refrained from showing HL-LHC extrapolations from CheckMATE, but they would only be more sensitive than the BSM Higgs study for lifetimes in the 300-700 mm range, with the exact value depending on the final HL-LHC limit. Nonetheless, the phenomenological picture is similar to the one with the current dataset: for low lifetimes the BSM Higgs limit dominates, in an intermediate regime the EJ reinterpretation takes over, and for longer lifetimes the BSM Higgs searches again become more sensitive, with missing energy searches becoming relevant at long lifetimes. As stressed before, exotic Higgs decays are not a target of the EJ analysis, and hence it would be interesting to consider the use of emerging jet taggers in other production modes.
We leave this option for future work.
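The luminosity rescaling used for the dashed and dotted curves can be sketched as follows. This is a minimal illustration that assumes a background-dominated, statistics-only regime in which the excluded branching ratio scales as $1/\sqrt{L}$; the exact prescription used in the analysis may differ (in a background-free regime the scaling would be closer to $1/L$). The function name and the input limit of 0.20 are illustrative assumptions.

```python
import math

def rescale_limit(br_limit, lumi_ref, lumi_new):
    """Rescale a branching-ratio limit from lumi_ref to lumi_new.

    Assumption: statistics-only, background-dominated search,
    so that the limit scales as 1 / sqrt(luminosity).
    """
    if br_limit <= 0 or lumi_ref <= 0 or lumi_new <= 0:
        raise ValueError("all inputs must be positive")
    return br_limit * math.sqrt(lumi_ref / lumi_new)

# Illustrative input: a 20% limit at 16.1 fb^-1 (luminosities in fb^-1)
for lumi in (139.0, 3000.0):
    print(f"L = {lumi:6.0f} fb^-1 -> BR limit ~ {rescale_limit(0.20, 16.1, lumi):.3f}")
```

With these assumptions a limit of about 20% at 16.1 fb$^{-1}$ moves to roughly 7% at 139 fb$^{-1}$ and to the percent level at 3000 fb$^{-1}$.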

Conclusions and Outlook
In this work we have performed detailed studies focused on the reinterpretation of the CMS emerging jet search. This signature belongs to the class of signatures collectively dubbed "dark showers", which stem from a strongly interacting dark (secluded) sector. In this dark sector new matter (and gauge) fields are added, which are assumed to hadronize, as in the SM strong sector. In particular, emerging jets correspond to the case where the dark sector mesons have macroscopically appreciable decay lengths, which makes these final states also fall in the class of exotic phenomena dubbed "long-lived particles" (LLPs).
Our reinterpretation procedure has been validated by carefully following the CMS study. We have obtained good agreement with the published distributions of the $\langle \mathrm{IP}_{2D} \rangle$ and $\alpha_{3D}$ variables, and also reproduced the publicly available efficiencies for the benchmark model employed in the search. We have reproduced the published exclusion limits through two different routes, one by employing directly the CMS published efficiencies and another by computing the efficiencies ourselves through our own Monte Carlo simulation. Here there is a large uncertainty in the exact parametrization of the tracking efficiency. We have attempted a few different parametrizations, and employed the one that, while possibly oversimplified, can reproduce the published efficiencies (and exclusion limits) with reasonable accuracy.
We would like to stress that, while the relevant information of the CMS study was publicly available and clearly explained, getting in contact with the authors of the experimental study was nonetheless needed in order to understand a few crucial details. Their response has been instrumental in understanding details concerning the track efficiency and the impact parameter smearing used in the study. Since it would be desirable that a reinterpretation of an experimental study can be done without this contact (as the main authors of a given analysis might not always remain part of the collaboration), we also took the opportunity to comment in the text on the aspects for which a clarification was needed, and on which additional material would have helped us to carry out the reinterpretation.
Using our validated pipeline, we have focused on the exploration of a SM Higgs boson decaying into two dark quarks (fermions charged solely under the new strong sector, akin to the SM quarks). To that end, we have considered the inclusive production of the Standard Model Higgs from gluon fusion, Higgs-strahlung, vector-boson fusion and associated production with a $t\bar{t}$ pair, and analyzed the four decay benchmark portal models proposed in [12], which are dubbed gluon, dark photon, Higgs and vector portals. We have found that, while the efficiencies for Higgs production lie in the $10^{-5}$ to $10^{-3}$ range, owing to the large production cross section we can obtain meaningful bounds in the relevant parameter space, which are competitive with the current exclusion on the undetected Higgs branching ratio of 16%, set by the ATLAS and CMS collaborations. We have checked, with the help of CheckMATE, that the existing prompt searches can bring meaningful bounds only in the large lifetime regime, $c\tau_{\pi_D} \gtrsim \mathcal{O}(100\,\mathrm{mm})$. We have also considered the existing HL-LHC extrapolations for the undetected Higgs branching ratio, and compared them with a similar naive extrapolation of the emerging jet search sensitivity (relying only on statistical uncertainties being present). Yet, it is expected that the HL-LHC will have a number of improvements to detect long-lived particles, which could render the final projections better than our naive extrapolations.
As a byproduct of our analysis, we have made our Pythia 8 analysis code publicly available in the LLP Recasting Repository [15]. It can be used to compute the experimental acceptance (and the exclusion limits) for arbitrary BSM models, provided they are implemented in Pythia 8.
We would like to stress that the exotic Higgs decay exclusion from [47] is an indirect bound, based on a global fit to the observed Higgs properties. Hence, if a signal is detected, its characterization would require an independent study. In contrast, if the emerging jet search starts seeing an excess, one can already infer that a new long-lived object is being produced from a Higgs boson decay, information that is crucial for the proper characterization of a putative BSM signal. We end by noting that the EJ requirement of four hard jets does not precisely target the exotic decays of a SM Higgs boson. In spite of the analysis not being optimal, we see that we can exclude exotic branching ratios of 30% in the gluon and dark photon decay portals, which can go down to the percent level at the HL-LHC. Therefore, it might be worthwhile to explore EJ searches that focus on dark quark decays from a SM Higgs boson (or from a new scalar), which could have higher sensitivity than the model-independent search for undetected Higgs branching ratios.
Based on these requirements, CMS further defines signal regions (called "sets" in the CMS paper), where a given EMJ criterion is accompanied by a set of cuts on the jets, requiring either two emerging jets, or one emerging jet plus large missing transverse energy. Those definitions are shown in Table 2. The information from these tables has been included in the companion code uploaded to the LLP Recasting Repository [15]. We have also collected there the details on the different tracking efficiency parametrizations employed in this work.
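As an illustration of how a per-region excluded yield $S_{95}$ translates into a limit on the exotic branching ratio, consider the sketch below. It assumes the expected signal yield in a region is $N = \sigma \times \mathrm{BR} \times A \times L$ and that the overall limit is taken from the most sensitive region; both the function names and this combination rule are assumptions made for illustration, and the numbers in the usage example are placeholders rather than values from the CMS tables.

```python
def br_limit(s95, xsec_fb, acceptance, lumi_fb):
    """Branching ratio excluded at 95% C.L. in one signal region.

    s95        : excluded signal yield in the region
    xsec_fb    : production cross section in fb
    acceptance : signal acceptance (times efficiency) in the region
    lumi_fb    : integrated luminosity in fb^-1
    """
    expected_at_br_one = xsec_fb * acceptance * lumi_fb
    if expected_at_br_one <= 0:
        return float("inf")  # no sensitivity in this region
    return s95 / expected_at_br_one

def best_br_limit(regions, xsec_fb, lumi_fb):
    """Strongest limit over an iterable of (s95, acceptance) pairs."""
    return min(br_limit(s95, xsec_fb, acc, lumi_fb) for s95, acc in regions)

# Placeholder inputs, for illustration only
toy_regions = [(5.0, 1e-3), (8.0, 4e-3), (3.0, 0.0)]
print(best_br_limit(toy_regions, xsec_fb=1000.0, lumi_fb=10.0))
```

A returned value above one means that the region cannot exclude any physical branching ratio for that parameter point.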

B Emerging jets kinematics
In this Appendix we provide further details on the kinematic differences between the benchmark model proposed by Schwaller, Stolarski and Weiler [3] (SSW) used by CMS, and the Higgs-mediated dark showers we employed in our study.
We start by showing the angular distance $\Delta R$ between the two hardest emerging jets in figure 9. Here we fix the lifetime of the dark pion at 10 mm, which is the value at which the efficiencies peak, and the emerging jets are reconstructed using the requirements EMJ-1, see Table 1. We show these distributions at four different steps of the event selection. The blue curve shows all events that come out of the Monte Carlo simulation without imposing any cuts, $n_T$, where we request the presence of two emerging jets. The orange curve shows the events that pass the $H_T > 900$ GeV trigger, $H_{T,t}$. The green curve shows those events where the $p_T$ conditions on each jet are applied, using Set #1 from Table 2. Finally, in the red curve we request that the two tagged emerging jets are included in the set of the four hardest jets. For clarity we have normalized all the distributions to unity, but in the legend we show in parentheses the number of expected signal events for the luminosity of the CMS study, 16.1 fb$^{-1}$. We consider here the decay portal with the largest efficiency (gluon portal) and compare gluon fusion production (upper left) with associated Higgs production with a $t\bar{t}$ pair (upper right). For comparison purposes we also show results for the SSW model used by CMS, in their $m_X = 1$ TeV benchmark (lower left), and show the impact of lower masses, taking $m_X = 400$ GeV (lower right).

Figure 9: $\Delta R$ between the hardest two emerging jets after applying different selection cuts (consecutively). We present results for the gluon portal decay mode, for gluon fusion production (upper left) and ttH production (upper right); and for the SSW model for $m_X = 1$ TeV (lower left) and $m_X = 400$ GeV (lower right). We also show the expected number of events, with $L = 16.1$ fb$^{-1}$, in parentheses.
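For reference, the angular distance between two jets follows the standard definition $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, with the azimuthal difference wrapped into $(-\pi, \pi]$. A minimal sketch is given below; the function names are illustrative and are not taken from the companion code.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance between two objects in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

# Two jets on either side of the phi = +-pi boundary are close, not far apart
print(delta_r(0.0, 3.0, 0.0, -3.0))
```

The wrapping step matters for jets near $\phi = \pm\pi$: without it, nearly collinear jets would be assigned a spuriously large separation.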
From the figure we see that the large boost of the Higgs, once the $H_T$ trigger condition is applied, indeed forces the EJ to be highly boosted, as the distribution now falls very fast with $\Delta R$. This suggests that the use of jet-substructure techniques (see e.g. [48]), which have already been applied to semi-visible jets in [49], could be of great help to increase the sensitivity for emerging jets. We also see from the numbers in parentheses that the overall acceptance is driven by the large $H_T$ requirement. For gluon fusion the $H_T$ cut has an efficiency of $3.5 \times 10^{-3}$, compared with the overall efficiency of $1.24 \times 10^{-4}$. For ttH, the $H_T$ cut has an efficiency of $6.7 \times 10^{-2}$, compared with the overall efficiency of $1 \times 10^{-3}$. We note that this behaviour also holds for the other production modes not shown here (VBF, WH, ZH). We can see the impact of the $H_T$ trigger on the distributions shown in figure 10. Indeed, the $H_T$ shape, which clearly distinguishes ttH from the other production mechanisms (and also distinguishes different mediator masses in the SSW model), explains the outcome of our acceptance plots.
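The role of the $H_T$ requirement in a cutflow can be sketched as follows. This minimal example takes $H_T$ as the scalar sum of the jet transverse momenta in an event and reports cumulative efficiencies after each cut; the precise set of jets entering the CMS $H_T$ definition, the toy events and the second cut are assumptions made purely for illustration.

```python
def scalar_ht(jet_pts):
    """H_T as the scalar sum of jet transverse momenta (GeV)."""
    return float(sum(jet_pts))

def cutflow(events, cuts):
    """Cumulative efficiency after each consecutive cut.

    events : list of events, each a list of jet pT values in GeV
    cuts   : list of (name, predicate) pairs applied in order
    """
    n_total = len(events)
    if n_total == 0:
        raise ValueError("need at least one event")
    surviving = list(events)
    table = []
    for name, passes in cuts:
        surviving = [ev for ev in surviving if passes(ev)]
        table.append((name, len(surviving) / n_total))
    return table

# Toy events (jet pT in GeV), not simulation output
toy_events = [[500.0, 300.0, 200.0, 100.0], [400.0, 100.0], [600.0, 400.0]]
toy_cuts = [
    ("HT > 900 GeV", lambda jets: scalar_ht(jets) > 900.0),
    ("at least four jets", lambda jets: len(jets) >= 4),
]
for name, eff in cutflow(toy_events, toy_cuts):
    print(f"{name:20s} {eff:.3f}")
```

Reporting cumulative rather than per-cut efficiencies makes it easy to read off which step dominates the overall acceptance, which is the comparison made in the text for the $H_T$ trigger.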
It is also important to note that the impact of $H_T$ also drives the efficiency for the different decay portals. In table 3 we show the cutflow for the Higgs, gluon, dark photon and vector portals, in gluon fusion production. Once again, we see that the $H_T$ trigger requirement is what drives the overall efficiency. This provides an important motivation to consider other triggers for Higgs-mediated dark showers. Finally, we would like to provide some insight into the kinematics of the different decay portals. We show the dark pion multiplicity, and the track multiplicity within each emerging jet, in figure 11, fixing the production mode to gluon fusion, and considering the gluon (blue), Higgs (orange), dark photon (green) and vector (red) portals. In addition we include for comparison purposes the SSW model with the 400 GeV and 1 TeV mediator masses. From the left panel we see that the overall shower multiplicity is governed mostly by the Higgs mass, irrespective of the decay portal. The dependence on the mass scale can be seen by comparing the two SSW benchmarks, where the multiplicity changes with the mediator mass. The difference between the portals, however, appears when considering the track multiplicity within each emerging jet. Since the track kinematics is used to actively tag the jets, these differences explain why the gluon and dark photon portals have larger acceptance than the Higgs and vector portals.

Figure 11: Dark pion (left) and track multiplicity (right) for Higgs production through gluon fusion in the gluon (blue), Higgs (orange), dark photon (green) and vector (red) portals; and also for the SSW model with $m_X = 400$ GeV (dashed purple) and $m_X = 1$ TeV (dashed brown).

Figure 1: Published signal exclusions from CMS (solid black) and those obtained using the CMS acceptances for pp → XX with i) the leading order cross section in green and ii) the NLO cross section in blue.

Figure 2: Geometry of the considered variables. A SM jet and a dark pion $\pi_D$ originate from $(r, z) = (0, 0)$; the latter decays at $(r_{tr}, z_{tr})$ into three tracks. We illustrate here how the $\langle \mathrm{IP}_{2D} \rangle$ variable is obtained from the transverse impact parameter of the individual tracks. See main text for details.

Figure 3: Comparison between the CMS published simulations and our Monte Carlo setup, for the $\langle \mathrm{IP}_{2D} \rangle$ (left) and $\alpha_{3D}$ (right) variables. For concreteness we only show the results for the benchmark point of $m_X = 1$ TeV, $m_{\pi_D} = 5$ GeV and $c\tau_{\pi_D} = 25$ mm, yet a similar level of agreement between the CMS data and our simulation was found for all benchmark points from [7]. We display results for the tracking efficiency parametrizations It$_5$ (solid blue), It$_4$ (solid orange), R (solid green) and A (dashed red). In the left panel, we also include the results for the QCD jets coming from the X decays in dashed purple.

Figure 4: Ratio of our $A_i$ over those reported by CMS, for $m_{\pi_D} = 5$ GeV (left panels) and 95% exclusion limits (right panels) in the $m_X - c\tau$ plane. The tracking efficiency parametrizations It$_5$, It$_4$ and R have been used for the upper, middle and lower panels, respectively. In the left panels the colored bar indicates the value of $A_i$ and the displayed number indicates the ratio of our $A_i$ normalized to the CMS one. In the right panels we show several exclusion limits: the CMS published one (solid black), the one obtained using the CMS $A_i$ efficiencies with NLO predictions from figure 1 (dashed green), as well as those derived using our $A_i$ with LO (dashed pink) and NLO (solid pink) cross sections. Both It$_4$ and It$_5$ tend to overestimate the acceptance (hence the exclusion limit), while the R parametrization reproduces the exclusion curves better. The disagreement is larger in the regions of lower masses and/or large lifetimes, where the $A_i$ are below the per-mille level.

Figure 5: Maximum acceptance shown as a function of $c\tau_{\pi_D}$ (left) and $m_{\pi_D}$ (right) in the gluon portal decay benchmark for Higgs production through gluon fusion (GF, solid), associated production with a Z or W boson (ZH: dot-dashed, WH: dotted), vector boson fusion (VBF, dashed) and associated production with a $t\bar{t}$ pair (dashed with dots).

Figure 6: Maximum acceptance times production cross section times the total integrated luminosity of the EJ search (16.1 fb$^{-1}$) shown as a function of $c\tau_{\pi_D}$ (left) and $m_{\pi_D}$ (right) in the gluon portal decay benchmark for Higgs production through gluon fusion (GF, solid), associated production with a Z or W boson (ZH: dot-dashed, WH: dotted), vector boson fusion (VBF, dashed) and associated production with a $t\bar{t}$ pair (dashed with dots).

Figure 8: 95% C.L. limits on $\mathrm{BR}(h \to q_D \bar{q}_D)$ obtained by reinterpreting the CMS emerging jet search, for the gluon portal and the Higgs portal, as a function of $c\tau_{\pi_D}$ with $m_{\pi_D} = 5$ GeV (left) and as a function of $m_{\pi_D}$ with $c\tau_{\pi_D} = 25$ mm (right). The solid lines use the existing data, with a luminosity of 16.1 fb$^{-1}$, which corresponds to the dataset of the current emerging jet search. The bounds from existing prompt searches, obtained with CheckMATE2 [41], are shown as a hatched grey (green) region for the Higgs (gluon) portal. The dashed lines correspond to 139 fb$^{-1}$, the luminosity used in the current model-independent exclusions on undetected (also called "BSM") Higgs branching ratios from ATLAS and CMS, while the dotted lines correspond to the projection to 3000 fb$^{-1}$, which we compare with the HL-LHC reach for the BSM Higgs branching ratios, shown with a red band that encapsulates the different assumptions on the systematic uncertainties (see main text for details).

Figure 10: $H_T$ distributions (normalized to unity) before applying selection cuts and the trigger condition. We show, for the gluon portal, the five production mechanisms considered, together with the SSW benchmarks with 400 GeV and 1 TeV mediator masses. The number in parentheses in the legend indicates the efficiency of the $H_T > 900$ GeV trigger selection.

The event yield excluded in each signal region at the 95% C.L. by CMS is shown in the rightmost column of Table 2, $S_{95}$.

Table 2: Signal regions defined in the emerging jet study by CMS. The $S_{95}$ column indicates the 95% C.L. limit on the number of signal events, which we employ for limit setting.