Search for resonant production of strongly coupled dark matter in proton-proton collisions at 13 TeV

The first collider search for dark matter arising from a strongly coupled hidden sector is presented and uses a data sample corresponding to 138 fb−1, collected with the CMS detector at the CERN LHC, at s\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \sqrt{s} $$\end{document} = 13 TeV. The hidden sector is hypothesized to couple to the standard model (SM) via a heavy leptophobic Z′ mediator produced as a resonance in proton-proton collisions. The mediator decay results in two “semivisible” jets, containing both visible matter and invisible dark matter. The final state therefore includes moderate missing energy aligned with one of the jets, a signature ignored by most dark matter searches. No structure in the dijet transverse mass spectra compatible with the signal is observed. Assuming the Z′ boson has a universal coupling of 0.25 to the SM quarks, an inclusive search, relevant to any model that exhibits this kinematic behavior, excludes mediator masses of 1.5–4.0 TeV at 95% confidence level, depending on the other signal model parameters. To enhance the sensitivity of the search for this particular class of hidden sector models, a boosted decision tree (BDT) is trained using jet substructure variables to distinguish between semivisible jets and SM jets from background processes. When the BDT is employed to identify each jet in the dijet system as semivisible, the mediator mass exclusion increases to 5.1 TeV, for wider ranges of the other signal model parameters. These limits exclude a wide range of strongly coupled hidden sector models for the first time.


Introduction
Over the past several decades, precision collider studies of elementary particles, such as the electroweak bosons, the top quark, and the recently discovered Higgs boson, have demonstrated remarkable agreement with theoretical calculations from the standard model (SM) of particle physics. However, many astronomical observations -including galaxy rotation curves [1,2], strong (weak) gravitational lensing observations of galaxy cluster collisions [3] (large-scale structures [4]), and the cosmic microwave background power spectrum [5] -indicate the existence and prevalence of dark matter (DM). The SM is not sufficient to explain these observations, which provides a compelling motivation to search for DM in the context of physics beyond the SM.
Various extensions to the SM that include DM and are compatible with the astronomical data allow proton-DM scattering, DM-DM annihilation, or DM production at colliders. The possible masses of the DM particles and the particles that mediate the proton-DM interactions span at least 36 orders of magnitude [6]. Collider searches have primarily focused on models with weakly interacting massive particles (WIMPs). These WIMPs are typically predicted to fall in a favorable mass range for collider production [7] and present relatively clear signatures with missing momentum recoiling against visible SM particles. The CMS experiment has examined such signatures using different visible objects, including a jet or vector boson [8-10], a Higgs boson [11], a top quark [12], or jets from vector boson -1 -
Hidden valley (HV) theories [20] are alternative scenarios that propose dark sectors with potentially multiple new particles and new forces, decoupled from the SM except for mediator particles. Searches conducted by the LHCb, ATLAS, and CMS experiments have not found evidence for models with dark photons, predicted by the simplest dark sectors with a new U(1) force [21][22][23][24][25][26][27][28][29]. Alternatively, the dark sector may contain a new confining force SU(N ), an analog to quantum chromodynamics (QCD) in the SM. These models are largely unexplored in experimental searches; they generate dark showers, which present a wide range of novel kinematic signatures [30].
Several considerations motivate the search for a strongly coupled dark sector at the LHC. The abundance of visible matter arises from a baryon asymmetry and a similar mechanism may explain the abundance of dark matter [31]. Cosmological measurements of the dark matter density have established it to be of the same order as that of the visible matter, roughly a factor of five larger [32]. This observed similarity with visible matter suggests that dark matter may consist of composite particles. In some cases, the scale of the new confining force, also called dark QCD, may be related to SM QCD [33], favoring scales on the order of 10 GeV and mediator masses at the TeV scale. Generically, these models can produce the correct DM relic density [34].
In this paper, we consider the class of models proposed in refs. [35,36], in particular the resonant production process qq → Z → χχ depicted in figure 1. This process involves a leptophobic Z boson mediator arising from a broken U(1) symmetry, with couplings to the SM quarks g q and dark quarks g χ . The dark sector contains several flavors of dark quarks (χ 1 , χ 2 , . . . ) that form bound states called dark hadrons, which may be either stable or unstable. The unstable dark hadrons decay promptly to SM quarks, while the stable dark hadrons are DM candidates that traverse the detector without interacting. This leads to collimated mixtures of visible and invisible particles, known as "semivisible" jets (SVJ). The heavy Z boson tends to be produced at rest, resulting in two jets that are back-to-back in the transverse plane. The total amount of missing transverse momentum is expected to be moderate because both jets contain invisible particles, so a portion of the transverse component of the overall missing momentum will cancel. Therefore, the signature of this model is a pair of jets along with the missing transverse momentum that is aligned with one of the jets. The jets are expected to be wider than typical SM jets because they arise from a multi-step process: the dark quarks shower and hadronize in the dark sector, the unstable dark hadrons decay to SM quarks, and finally the SM quarks shower and hadronize visibly. The SM quarks are much less massive than the dark hadrons, so they are produced in the intermediate decay step with significantly higher relative momentum, spreading out the SM particle showers.
The fraction of stable, invisible dark hadrons, r inv = N stable /(N stable + N unstable ), can take any value between 0 and 1, as described in ref. [35]. Since events containing jets aligned with missing transverse momentum are explicitly rejected from the signal regions of existing collider DM searches, a significant portion of the parameter space of this model is not covered, particularly intermediate values of r inv . The current DM searches are primarily sensitive to dark sector models with r inv ≈ 1 that yield events with high missing momentum along with some initial-state radiation. The current dijet searches [37,38] are sensitive to dark sector models with r inv ≈ 0 that yield events with two jets and low missing momentum. Therefore, the search presented here complements these efforts. The case where the unstable dark hadrons decay after some finite lifetime results instead in emerging jets [39], for which a search has already been conducted by CMS [40] using 13 TeV data corresponding to an integrated luminosity of 16.1 fb −1 . That search excluded mediator particles with masses up to 1.25 TeV for dark hadrons with masses between 1 and 10 GeV and decay lengths between 5 and 225 mm. We analyze data from proton-proton (pp) collisions delivered by the CERN LHC in 2016-2018 and collected with the CMS detector. The data sample corresponds to an integrated luminosity of 138 fb −1 [41]. The search is conducted following a dual strategy, in which we present two sets of results. First, a generic or inclusive search is sensitive to any model that presents the signature of jets aligned with missing transverse momentum. Second, a dedicated search for the specific class of dark sector models introduced above uses a boosted decision tree (BDT) to distinguish semivisible jets from SM jets using jet substructure variables. The SM background is dominated by events in which a mismeasured jet causes artificial missing momentum. These events are composed only of jets produced through the strong interaction, which we will describe as the QCD multijet process. There are additional background contributions from the tt, W(→ ν)+jets, and Z(→ νν )+jets processes, which produce events with genuine missing momentum associated with neutrinos. Tabulated results are provided in the HEPData record for this analysis [42].

Signal model
The implementation of the Z signal HV model is based on ref. [35]. The mediator mass m Z and couplings g q and g χ are undetermined parameters. The production cross section σ Z is determined by m Z and g q , while the branching fraction B dark = B(Z → χχ) depends on both g q and g χ . The effective cross section is defined as the product σ Z B dark . The dark QCD force is carried by a dark gluon, and binds the dark quarks into dark hadrons, which can be pseudoscalars (π) or vectors (ρ). The unstable dark hadrons are π dark and ρ dark , respectively, while the stable dark hadrons are π DM dark and ρ DM dark . The dark hadrons are assumed to be degenerate in mass, with the mass scale m dark as another parameter. The coupling strength of the dark QCD force, α dark , is also undetermined; its value primarily influences the dark shower dynamics by modifying the multiplicity and the transverse momentum p T of the dark hadrons, as well as the width of the resulting spray of particles. The last parameter of interest is the fraction of invisible dark hadrons, r inv . The variation of this fraction is not considered in other theories, making it the most novel parameter. In summary, the leptophobic Z model has five parameters: the effective cross section, m Z , m dark , α dark , and r inv .
For the search presented here, we restrict the range of the mass scale m dark between 1 and 100 GeV; the lower bound is determined by modeling constraints in the pythia [46] generator, while the upper bound ensures enough hadrons are produced to form a dark shower and the corresponding semivisible jet. The mass of the dark quark species is fixed at m χ = m dark /2, following ref. [35]. The running coupling α dark can also be expressed in terms of the dark coupling scale Λ dark , which has dimensions of energy, as α dark ( Here, we choose the number of dark colors N dark c = 2, the number of dark flavors N dark f = 2, and the scale Q dark = 1 TeV. The effects from changing Λ dark depend strongly on the dark hadron mass, so we choose the scale to maximize the number of dark hadrons in the shower for each m dark value. An analytic solution for this requirement is not known, so an approximate numerical relationship Λ peak dark = 3.2(m dark ) 0.8 GeV, with m dark in GeV, is obtained using pythia. The benchmark value α peak dark is computed from Λ peak dark with the above equation, and we consider variations of ±50% (α low dark , α high dark  [47]. In particular, the branching fraction is B dark = 47% and the mediator width is Γ Z /m Z = 5.6%, for which the narrow width approximation holds [48]. The difference between the g χ value and the LHC DM coupling benchmark value g DM = 1.0 accounts for the multiple flavors and colors of dark quarks.
Several features of the HV model are implemented by modifying pythia parameters for particle branching fractions. The dark hadrons may be stable or unstable. The unstable vector dark hadrons ρ dark can decay to a pair of SM quarks of any flavor with equal probability, as long as m dark ≥ 2m q . In contrast, the unstable pseudoscalar dark hadrons π dark must decay through a mass insertion, and therefore decay preferentially to more massive SM quarks [35]. This preference is reflected in their branching fractions, which are calculated including the effect of quark mass running [49]. To simulate the stable dark hadrons, we reduce the dark hadron branching fraction to SM quarks according to the value of the r inv parameter. The probability of producing a vector dark hadron is set to 0.75. Finally, a Z 2 symmetry is explicitly enforced by rejecting events with an odd number of stable dark hadrons.

Simulation
We use simulated samples to model signal and SM background processes. Because of changes in detector conditions, including detector upgrades and aging, we use separate samples for each year of data taking: 2016, 2017, and 2018. The detector response to simulated particles is modeled using the Geant4 software [50]. Custom simulations of the detector electronics are used to produce readouts similar to those observed in data. Multiple pp interactions (pileup) are also included in the simulation. The simulated samples are corrected to make the pileup distribution match the distribution in data as closely as possible.
The MadGraph5_amc@nlo event generator [51] with the MLM matching procedure [52] is used to simulate the tt, W(→ ν)+jets, and Z(→ νν )+jets processes with leading-order (LO) matrix element calculations including up to three, four, and four additional partons, respectively. The 2016 samples were generated with MadGraph5_amc@nlo 2.2.2 and the 2017 and 2018 samples with MadGraph5_amc@nlo 2.4.2. These samples are normalized to the next-to-next-to-leading-order (NNLO) cross sections [53][54][55][56][57][58][59]. The pythia generator is used to simulate the QCD multijets process at LO as a 2→2 interaction, with versions 8.212, 8.226, and 8.230 employed for the 2016, 2017, and 2018 simulated samples, respectively. The QCD multijet simulation is normalized to the LO cross section provided by pythia, and an empirically estimated correction of 1.2-1.9, taking into account different detector conditions during the three data-taking periods, is applied to account for the difference in the order of the cross section computation relative to the other backgrounds. The same versions of pythia are used to simulate parton showering and hadronization   [63] and the CP5 tune [64]. The simulations of SM background processes are used to compare to signal in order to optimize various kinematic criteria for sensitivity, to check the agreement with data for basic kinematic variables, and to train the BDT. The simulated signal samples are also generated with pythia, using version 8.226 for 2016 and 8.230 for 2017 and 2018. The dedicated HV module is used for the showering and hadronization of the dark hadrons. Compared to the SM background simulation, newer versions of pythia are used for the signal simulation in order to incorporate essential features, such as the running of the dark coupling. The 2016 samples use the NNPDF2.3 LO PDF and the CUETP8M1 tune, and the 2017 and 2018 samples use the NNPDF3.1 LO PDF [63] and the CP2 tune [64]. The samples are normalized using next-to-leading-order cross sections computed with a universal coupling between the Z boson and the SM quarks.
As described in section 3, the signal model has five free parameters, four of which affect the physical properties of the resulting events, while the effective cross section just affects the total yield of events containing semivisible jets. We define the benchmark values for these parameters as m Z = 3.1 TeV, m dark = 20 GeV, r inv = 0.3, and α dark = α peak dark , where α peak dark ≈ 0.313 for m dark = 20 GeV. These values are chosen to have high acceptance for the selection defined in section 6. To keep the number of concrete models to a reasonable level, we generate simulated samples varying only two parameters at a time, producing three two-dimensional "scans" of the model parameter space shown in table 1. These scans will be used to interpret the results of the search, with the search variable distributions for each signal model formed directly from the simulated events.

Event reconstruction and triggering
The events recorded with the CMS detector are reconstructed using the particle-flow (PF) algorithm [65], which aims to identify every particle in each event. The PF algorithm combines information from all subdetector systems in an optimized manner. Each reconstructed particle, also called a PF candidate, is identified as a charged hadron, electron, muon, neutral hadron, or photon. The anti-k T algorithm [66,67] is used to cluster the PF candidates into jets. The missing transverse momentum vector p miss T is computed as -6 -JHEP06(2022)156 the negative vector p T sum of the PF candidates in an event; its magnitude is denoted as p miss T [68]. In order to improve the quality of the PF reconstruction, additional identification criteria are applied to electron [69] and muon candidates [70], which are required to have p T > 10 GeV and |η| < 2.4.
The candidate vertex with the largest value of summed physics-object p 2 T is taken to be the primary pp interaction vertex, where the physics objects used for this determination are the jets and the missing transverse momentum. For the vertex calculation, the tracks assigned to each vertex serve as input to the jet clustering algorithm, and the missing transverse momentum is the negative vector p T sum of the resulting jets. Only chargedparticle tracks associated with the primary vertex are considered when reconstructing the final collection of PF candidates. Rejecting tracks associated with other vertices reduces the impact of pileup.
Two types of jets are used in this search. The primary collection of jets with p T > 200 GeV is reconstructed using a distance parameter of R = 0.8 for the clustering algorithm, because the signal jets are expected to have a broader radiation pattern than the SM background jets. We denote these reconstructed objects with a capital J, and we refer to them simply as "jets" in the rest of the paper. The pileup-per-particle identification (PUPPI) algorithm [71] is used to mitigate the effect of pileup on the jets. The PUPPI algorithm makes use of a local shape variable that reflects the collinear versus soft diffuse structure in the neighborhood of the particle. With this variable, as well as with event pileup properties and tracking information at the reconstructed particle level, the algorithm computes the probability that a given neutral PF candidate results from pileup. The candidate momentum is multiplied by this probability [72]. The jets are further corrected to account for nonlinearities in the detector energy response [73]. In simulated samples, the jet p T is smeared using measured resolutions to match the observed data. A set of quality criteria is applied to reject spurious jets arising from instrumental sources [74].
The jet clustering algorithm is also employed with a distance parameter of R = 0.4 to produce a second collection of jets, which are then required to have p T > 30 GeV. We denote these reconstructed objects with a lowercase j, and we refer to them as "narrow jets" in the rest of the paper. The narrow jets are employed in the trigger and are also used to mitigate instrumental backgrounds, which are unrelated to the expected size and scale of the signal jets. Energy corrections are also applied to the narrow jets, with an area-based pileup subtraction [72,75] used in place of the PUPPI algorithm. The effect of the narrow-jet energy corrections is propagated to the missing transverse momentum.
The data set considered in this paper is collected using triggers that place requirements on the narrow-jet p T or on the H T , which is the scalar p T sum of all narrow jets with p T > 30 GeV and |η| < 3.0. The triggers used in 2016 (2017-2018) require a jet with p T > 450 (500) GeV or H T > 900 (1050) GeV. The thresholds were raised in 2017 to compensate for higher instantaneous luminosity. The efficiency for an event to pass any of these trigger conditions is measured in data using a data set collected with an independent trigger that requires a muon with p T > 50 GeV. The selection requirements that ensure a high trigger efficiency are described in the next section.

Event selection
We select events with at least two jets. Any event in which the two highest p T jets are not both within |η| < 2.4 is rejected. Events with missing transverse momentum that have substantial energy in the endcap region are significantly more likely to originate from beam halo or noise induced by the radiation damage, rather than from signal events, which tend to populate the barrel region. Therefore, this requirement eliminates several sources of instrumental background, while having a negligible impact on the signal efficiency.
The dijet transverse mass m T is used as the search variable. In signal processes, it forms a peak with a kinematic endpoint at m Z and therefore approximately reconstructs the mass of the Z mediator, while in background processes, it is expected to have a smoothly falling spectrum. This can be observed in figures 5-6 in section 8. The dijet transverse mass is computed from the four-vector of the dijet system and the p miss T [35]: Here, m JJ is the invariant mass of the system composed of the two highest p T jets, and p T,JJ is the vector sum of their p T . The quantity E 2 T,JJ is defined as m 2 JJ + | p T,JJ | 2 , while we assume that there is just one massless, invisible particle in the final state, implying the missing transverse energy E miss T = p miss T . This enables the simplification in the second line of eq. (6.1), with φ JJ,miss as the azimuthal angle between the dijet system and the missing transverse momentum. Bins of width 100 GeV are used when measuring the m T distribution; this width provides sufficiently populated bins for the majority of the spectrum.
The transverse ratio is defined as R T = p miss T /m T and used in the selection instead of p miss T to identify events with invisible particles. This is because p miss T is correlated with m T , and therefore placing a requirement directly on p miss T would shift the peak of the background distribution to higher m T values, as well as causing the distribution to have a shallower slope. The requirement R T > 0.15, shown in figure 2 (left), rejects 99% of simulated QCD background events. In particular, this requirement removes the majority of t-channel QCD events, which have higher m T than s-channel events, but still low p miss T , and therefore lower R T values. Accordingly, the proportion of background events from the tt, W+jets, and Z+jets processes, collectively described as the electroweak background, is enhanced.
The remaining t-channel QCD events have low trigger efficiency, because their high m T values arise from the large ∆η(J 1 , J 2 ) = |η J 1 − η J 2 | separation between the two highest p T jets, rather than from the jet p T that is the basis of the trigger. To reject these events, we additionally require ∆η(J 1 , J 2 ) < 1.5. With these requirements applied, the efficiency of the jet-based triggers is measured in the data to have a plateau at 98% for m T > 2.0 TeV. Events with m T > 1.5 TeV are selected to be in the fully efficient region, defined as an efficiency of at least 95% of the plateau value, in order to ensure that the background events will have a falling spectrum. The simulated signal and background events are corrected to match the measured trigger efficiency for each year of data taking. In order to reduce the tt and W(→ ν)+jets backgrounds, we veto events containing electrons or muons passing the kinematic and identification criteria noted in section 5, along with an additional requirement that the lepton candidates are isolated from other particles. In other words, we require the number of electrons N e to equal 0 and the number of muons N µ to equal 0. Isolation is applied to reduce the contributions from jets misidentified as leptons, or leptons arising from hadron decays rather than prompt W boson decays. It is quantified using the relative variable I mini ( ) = ( ∆R max p T,h ± + max[0, ∆R max p T,γ + p T,h 0 − p T,pileup /2])/p T, , where the different particle candidates are denoted by for the charged lepton, h ± for charged hadrons, γ for photons, and h 0 for neutral hadrons. The label "pileup" in the last term denotes charged hadrons that do not originate from the primary vertex and are used to estimate additional energy coming from neutral candidates whose vertex cannot be determined. All the sums consider candidates within a cone of ∆R = (∆η) 2 + (∆φ) 2 < ∆R max , where ∆R max is 0.2 for p T, ≤ 50 GeV, 10 GeV/p T, for 50 < p T, < 200 GeV, and 0.05 for p T, ≥ 200 GeV. The radius of the cone decreases with increasing charged-lepton p T in order to account for the effect of the Lorentz boost of the lepton's parent particle; a larger boost leads to more collimated decay products [76]. We require I mini (µ) < 0.4 or I mini (e) < 0.1. The lepton vetoes reject 40% of the tt and W(→ ν)+jets background events that remain after applying the selection requirements described previously in this section. The events that cannot be rejected contain "lost" leptons that fail some aspect of the kinematic, identification, or isolation criteria.
Events with anomalously high p miss T values can occur due to a variety of reconstruction failures, detector malfunctions, or other noncollision backgrounds. These events are rejected by dedicated filters designed to identify at least 85% of the anomalous events, while having a false positive rate less than 0.1% [68]. The leading source of instrumental

JHEP06(2022)156
background is mismeasurement of jet energy because of nonfunctional ECAL readout channels (c nonfunctional ). We employ a filter, optimized for the kinematic phase space of this search, to identify the nonfunctional channels. This approach relies on the fact that the production of jets in physical SM multijet processes is expected to be isotropic in φ. Because of the R T requirement for nonnegligible missing transverse momentum, multijet background events will be preferentially selected if one of the jets is mismeasured because it overlaps with a nonfunctional channel. Therefore, regions with nonfunctional channels can be identified as enhancements in the η-φ distributions of the two highest p T narrow jets in each event. In this custom filter, an enhancement is defined as a bin having a value approximately five standard deviations larger than the mean value for the distribution. Events are rejected if either of the two highest p T narrow jets overlaps with a nonfunctional channel identified in this way, determined by ∆R(j 1,2 , c nonfunctional ) < 0.1. This custom filter eliminates 40% of the remaining QCD background, while only reducing the signal efficiency by approximately 5%.
The requirements described above constitute the initial selection or "preselection". Preselected events are used to verify basic agreement in kinematic variables between data and simulation, as well as to train the semivisible jet identification algorithm, described in section 7. As the final selection, we apply the three additional criteria described below that further enhance the signal relative to the backgrounds.
First, an unphysical contribution from misreconstructed jets, which are found to produce events with artificially high m T values, is rejected by vetoing events where the leading narrow jet has a photon energy fraction f γ > 0.7 and p T > 1.0 TeV. These events, which appear only in the observed data and were identified using a control region with the requirement 1.5 < ∆η(J 1 , J 2 ) < 2.2, arise from rare failures in certain ECAL reconstruction algorithms. Second, in the later portion of the 2018 data-taking period, corresponding to 38.65 fb −1 , a section of the HCAL was not active. Because this effect was not included in the detector simulation, we reject events in which any narrow jet falls into the inactive section (−3.05 < η < −1.35, −1.62 < φ < −0.82), which reduces the signal efficiency by approximately 6% within the affected data-taking period. Third, the minimum angle between the jets and the p miss T is defined as ∆φ min = min[∆φ( J 1 , p miss T ), ∆φ( J 2 , p miss T )]. Events with ∆φ min < 0.8, shown in figure 2 (right), are selected because the missing momentum aligns with the jets in signal events. The signal efficiency for the benchmark signal model after the final selection is 17%.
We define two signal regions for the inclusive search, low-R T (0.15 < R T ≤ 0.25) and high-R T (R T > 0.25). Requiring R T > 0.25 is found to maximize the signal sensitivity when only one signal region is considered. However, including the low-R T events as a separate signal region still reduces the expected cross section exclusion by an additional 60% on average, and therefore we use both regions. The low-R T signal region also improves the sensitivity to signal models with r inv = 0, which have smaller, but nonnegligible p miss T from decays of heavy flavor hadrons, and therefore populate this region.
The values used in the requirements on ∆η(J 1 , J 2 ), ∆R(j 1,2 , c nonfunctional ), ∆φ min , and R T are optimized to ensure maximal separation between signal and background, using the figure of merit F = 2 [(S + B) ln (1 + S/B) − S] [77]. F is computed by comparing the -10 -

Identification of semivisible jets
The selection requirements described in section 6 rely on event-level quantities or basic kinematic properties of the reconstructed jets, leptons, and missing transverse momentum. Although they reject the vast majority of the SM background events, the remaining background is still more than two orders of magnitude larger than the expected number of signal events for the benchmark signal model. In order to reduce the background even further, we exploit the intrinsic differences between semivisible jets and SM jets. A number of variables have already been derived in other contexts to characterize jets in terms of their substructure. Many of these jet substructure variables may be used individually to provide weak discrimination between semivisible and SM jets. To improve the discriminating power, the useful variables are combined in a BDT, whose output is an optimized discriminator with values close to 1 for semivisible jets and close to 0 for SM jets. Jets with a discriminator value higher than a chosen working point (WP) are considered to be "tagged" as semivisible. Semivisible jet substructure [78,79] and other tagging strategies [80] have also been explored at the phenomenological level.
The 15 BDT input variables, computed for each jet, originate from several categories. From heavy object identification, the N -subjettiness ratios τ 21 and τ 32 [81], the energy 3 [82], and the soft-drop mass m SD [83] are used. From quark-gluon discrimination, the jet girth g jet [84], the major and minor jet axes σ major and σ minor [85], and the p T dispersion D p T [85] are used. The flavor-related variables used are the jet energy fractions for each type of constituent identified by the PF algorithm: f h ± , f e , f µ , f h 0 , and f γ . The angle between the jet and the missing transverse momentum, ∆φ( J, p miss T ), is also included. Variables sensitive to the multiplicity of jet constituents are deliberately excluded, as this property is strongly correlated with the r inv parameter of the signal model.
As noted previously, semivisible jets are expected to have wider and softer radiation patterns compared to typical SM jets. This is reflected in the quark-gluon variables, with the signal jets exhibiting larger values of g jet , σ major , and σ minor , and smaller values of D p T . In addition, the signal jets are very unlikely to have multiple prongs in their substructure, which is reflected in their N -subjettiness and energy correlation function distributions. Unlike boosted heavy objects such as top quarks or W or Z bosons, whose mass is reflected in the m SD of the resulting jet, the mass of the originating dark quark is not reflected in the m SD distribution of semivisible jets. This difference occurs because semivisible jets are formed by strong hadronization processes in both the dark and visible sectors, rather than the two-prong weak decays of boosted heavy SM particles. However, in semivisible jets, the m SD can reflect the dark hadron mass m dark when a single hard fragmentation carries a significant portion of the jet energy. The jet energy fractions provide useful insight into the visible part of the signal jets, and because they are normalized to the total visible jet energy, they induce very little dependence on the signal model parameters. The distributions of m SD and D p T are shown in figure 3 for the two highest p T jets from the simulated SM backgrounds and several signal models, with the jet p T distributions weighted as described below.
The BDT is trained using the scikit-learn machine learning package [86] with the gradient boosting technique, and additional diagnostic information is provided by the rep software [87]. The inputs are the two highest p T jets from simulated signal and background samples, with the variables described above computed for each jet. We observe that the background simulation reproduces the observed distributions of the input variables. Only jets from events that pass the preselection, defined in section 6, are considered. The signal samples include various choices of parameters with relatively high acceptance for the preselection: m Z ≥ 1.5 TeV, m dark ≥ 10 GeV, 0.1 ≤ r inv ≤ 0.8, and all α dark variations. Each signal sample is weighted equally. Jets are classified as signal jets if they originate from one of the signal samples and if they are spatially associated with generator-level dark sector particles. A weight is assigned to each jet so that the p T distributions of the signal and background jets match that of a benchmark signal with m Z = 3.0 TeV. This prevents the BDT from learning differences in the p T distribution that arise from the originating process rather than the intrinsic nature of the jets, and it also prevents the BDT from forcing the background m T distribution to resemble that of the signal. The background jets are taken from an equal mix of the QCD and tt processes. It was found that training on only one of those two background processes caused the BDT to misclassify jets from the other background process as signal jets at a rate of 10-20%. The BDT hyperparameters are optimized to maximize the signal efficiency at a background efficiency of 10%, while requiring low values of variables that measure overtraining and the correlation of the background efficiency with m T .
The BDT exhibits strong and uniform rejection of jets from all of the SM background processes, while preserving a significant amount of the signal. The discriminator distribution and the receiver operator characteristic (ROC) curve are shown in figure 4. The five most important variables in the BDT are found to be the m SD , ∆φ( J, p miss T ), τ 21 , τ 32 , and D p T . Jet tagging algorithms are typically evaluated using three metrics: the accuracy (Acc.  Table 3. Metrics representing the performance of the BDT for the benchmark signal model (m Z = 3.1 TeV, m dark = 20 GeV, r inv = 0.3, α dark = α peak dark ), compared to each of the major SM background processes. and the background rejection (1/ε bkg ) at a signal efficiency (ε sig ) of 30%. These metrics are provided for each background process, compared to the benchmark signal process, in table 3. The signal efficiency of the BDT exhibits some additional dependence on m dark for several reasons. First, the important variable m SD is less useful to distinguish signal jets at very low m dark , where they more closely resemble QCD, and at very high m dark , where they more closely resemble boosted W boson jets from tt events. Second, at very low m dark , unstable dark hadrons cannot decay to b or c quarks, which changes the flavor content of the jets, including the jet energy fractions used in the BDT.
The final WP value is chosen to be 0.55; at this WP, the BDT rejects 84-88% of simulated background jets, while correctly classifying 87% of jets from the benchmark signal model. This choice retains sufficient events to facilitate the background estimation, which will be described in section 8. When the BDT is employed, we select subsets of the high-R T and low-R T inclusive signal regions by requiring that both jets in each event are tagged as semivisible. These signal regions are called high-SVJ2 and low-SVJ2. Events in which only one jet is tagged as semivisible are not found to provide significant additional sensitivity. Because semivisible jets do not occur in the SM, there is no viable approach to assess any difference in the tagger performance between data and simulation in signal events. Therefore, the results from the BDT-based regions assume that any residual differences are covered by the various other systematic uncertainties in the signal, which are described in section 9.

Background estimation
As described in the previous sections, QCD is the dominant SM background process, while the electroweak processes tt, W+jets, and Z+jets provide smaller but nonnegligible contributions. The proportion of each background varies depending on the signal region, with QCD comprising 83-88% in the low-R T and low-SVJ2 regions, but only 46-57% in the high-R T and high-SVJ2 regions. In each region, the m T distribution from each SM background processes is expected to be smoothly falling. The observed m T distributions from each data-taking period are added together in each signal region, producing summed distributions that are used for the final results.
We fit the observed m T distribution in each signal region with a functional form that represents this expected behavior, following the approach from traditional dijet searches, -14 -

JHEP06(2022)156
such as ref. [37]. The primary fit function is: TeV) and p 1 , p 2 , and p 3 are free parameters in the fit. The specific form of this function is chosen to improve numerical stability and reduce parameter correlations, in order to ensure a stable fit. To prevent the minimizer from finding false minima in the parameter space of the fit function, we adopt an approach in which many combinations of initial values, varying both sign and magnitude, are tested for each parameter, and the resulting fit with the lowest χ 2 value is chosen. Versions of the function with different numbers of free parameters are compared to each other, and the optimal number of parameters for the fit in each signal region is determined using the Fisher F -test [88]. The high-R T region requires all three parameters, while the other three signal regions use the two-parameter version of the function, with p 3 set to 0.
An analytic fit could produce a biased background prediction if it were not sufficiently flexible to adapt to observed data that actually followed a different distribution. To quantify this potential bias, we also fit the data in each signal region with two secondary functions, one similar to those used in refs. [35,37] and another based on the family used in refs. [89,90]. We then randomly generate artificial pseudodata sets distributed according to the secondary function fits, with a similar yield to the observed data. For each pseudodata set, a maximum likelihood fit is performed, including both the background prediction from the primary function and a signal model with varying cross section. When generating these data sets, two cases are considered: no signal events injected, or signal events injected at the nominal Z cross section. The method of varying initial values is also applied here, to ensure the best possible fit of the primary function to the secondary function. The result of each likelihood fit is an extracted cross section, which is compared to the injected cross section in the figure of merit b = (σ extracted − σ injected )/ε σ extracted , where ε σ extracted is the estimated uncertainty in the extracted cross section. It is found that |b| ≤ 0.5 for both secondary functions and both signal injection cases. The background prediction is therefore determined to be sufficiently unbiased.
The results of the background-only fits are compared to the observed data in figure 5 for the inclusive signal regions and in figure 6 for the BDT-based regions. Applying the BDT to tag semivisible jets reduces the background by almost two orders of magnitude. A new resonance would appear as a significant excess of events in a range of m T values, and no such excess with respect to the predictions is observed. The few events at high m T values in the high-R T region are consistent with the background prediction, within uncertainties. For 5 < m T < 8 TeV, we observe 6 events, while integrating the background-only fit predicts 8.4 +2.1 −1.4 ; for 6 < m T < 7 TeV, we observe 4 events, while the prediction is 2.0 +0.6 −0.4 . We proceed to set limits on the effective cross sections for different signal models in section 10.

Systematic uncertainties
The lower panel shows the difference between the observation and the prediction divided by the statistical uncertainty in the observation (σ exp ). The distributions from several example signal models, with cross sections corresponding to the observed limits, are superimposed.  ments and detector and reconstruction effects, often occurring when a correction is applied to handle some difference between data and simulation. Second, theory uncertainties reflect possible variations in the generation of signal events. Some uncertainties, called "flat" uncertainties, affect just the signal yield, while other uncertainties, called "shape" uncertainties, affect both the yield and distribution of signal events. Experimental shape uncertainties are considered to be uncorrelated between each year of data taking. Uncertainties that affect just the yield and all theory uncertainties are considered to be correlated between each year of data taking, because their sources are common across all years. Finally, the statistical uncertainty from the limited number of simulated signal events is considered in each bin of the m T distributions. The statistical uncertainty is uncorrelated for each bin of the m T distribution and for each year of data taking. The effect of each uncertainty on the signal yield is summarized in table 4.
The flat experimental uncertainties include a 1.6% uncertainty in the measurement of the integrated luminosity of the data [41,91,92] and a 2.0% uncertainty in the measured trigger efficiency to account for potential kinematic differences between the signal regions and the control region used in the measurement. The remaining experimental uncertainties are shape uncertainties. These include uncertainties in the jet energy corrections and the jet energy resolution, which are evaluated depending on the jet p T and η, and propagated to all jet-related kinematic variables and to the missing transverse momentum. The uncertainty in the pileup reweighting arises from the variation of the total inelastic cross section by 5% [93]. The statistical uncertainty in the muon trigger control region is propagated to the trigger efficiency correction as an additional systematic uncertainty.
The theory uncertainties are generally treated as shape uncertainties. The uncertainty in the PDFs is assessed by reweighting the generated events using the different PDF replicas [61,63]; only the effect on the acceptance is considered. The uncertainty in the renormalization and factorization scales is computed by varying each scale by a factor of two [94][95][96]; as with the PDF uncertainty, only the effect on the acceptance is considered. The uncertainty in the parton shower model is found by similar variations of the renormalization scale used in pythia [97]; the contributions to initial-state radiation (ISR) and final-state radiation (FSR) are considered separately. Because the jet energy corrections are measured for SM jets, whose constituents may be different than semivisible jets, we include an additional uncertainty in the jet energy scale, which is derived by comparing the generatorlevel jet p T to the reconstructed jet p T . The magnitude of the jet energy scale variation is typically 5%, in contrast with the 2% effect found when the same procedure is applied to the simulated electroweak backgrounds. As with the jet uncertainties described above, this effect is propagated to all jet-related variables and to the missing transverse momentum.
The parameters of the background fit function in each region, including the normalization, are freely floating. The uncertainties in these parameters arise from the statistical uncertainty in the data. As mentioned in section 8, the background prediction is found to be sufficiently unbiased that no additional uncertainty is needed. Overall, the largest impact on the sensitivity of the search comes from the background normalization in the high-R T signal region, which can change by up to 10%. This is found to reduce the sensitivity, in terms of the excluded cross section, by a factor of roughly 2-4.  Table 4. The range of effects on the signal yield for each systematic uncertainty and the total. Values less than 0.01% are rounded to 0.0%.

Results
When interpreting the results, we consider the two inclusive signal regions together, and separately consider the two BDT-based signal regions together. The combined likelihood for each pair of signal regions is computed as the product of the likelihood from each bin in the m T distributions. We set limits on the effective cross section σ Z B dark at 95% confidence level (CL) using the modified frequentist CL s approach [98,99]. The systematic uncertainties described above are included in the maximum likelihood fit as nuisance parameters, with the flat uncertainties given log-normal prior distributions and the shape uncertainties given Gaussian prior distributions. This interpretation procedure is tested and validated by repeating the bias testing procedure described in section 8 for each pair of signal regions. The background fit parameters for each signal region are still treated as independent, but the injected and extracted signal cross sections are constrained to be the same in both regions in the pair. As in section 8, we find that the procedure is sufficiently unbiased. The expected limits are derived from a special data set created by replacing the observed data with the central values of the background predictions from the analytic fits when comparing the likelihoods of the background-only and signal-plus-background hypotheses. The likelihood distributions from the special data set and the observed data are computed as a function of the effective cross section and are used to calculate the CL s criterion following the asymptotic approximation [77]. The expected and observed limits are shown in two-dimensional planes of the signal parameters for the inclusive search in figure 7  Γ Z /m Z 10% [48]. To provide a continuous distribution, the limits for simulated signal models are logarithmically interpolated.
The results from the inclusive signal regions exclude observed (expected) values of up to 1.5 < m Z < 4.0 TeV (1.5 < m Z < 4.3 TeV), with the widest exclusion range for models with α dark = α low dark . Depending on the Z mass, 0.07 < r inv < 0.53 (0.06 < r inv < 0.57) and all m dark and α dark variations considered are also observed (expected) to be excluded. These exclusions do not rely strongly on the details of the jet substructure, so they can apply to many signal models that satisfy the final selection via a resonance that produces jets aligned with missing transverse momentum. In particular, because the BDT efficiency depends on Within the excluded range of r inv values, the corresponding mass exclusion is more stringent than the DM mediator interpretation from the inclusive dijet search [37], which targets the Z decay to SM quarks. In the HV model considered here, B dark is similar to the branching fraction for Z → qq , leading the two strategies to have similar sensitivities in regions where they have similar acceptance; for larger B dark values, the relative advantage of the dedicated search strategy would increase. For signal models with very low or very high r inv values, outside of the excluded range, the efficiency of the selection in this search decreases. The dijet resonance search [37] and searches based on missing transverse momentum recoiling against an ISR jet [10] provide better sensitivity for such signal models.

JHEP06(2022)156
The results from the BDT-based signal regions increase the observed (expected) excluded mediator mass range to 1.5 < m Z < 5.1 TeV (1.5 < m Z < 5.1 TeV) for wide ranges of the other signal parameters. The range of observed (expected) excluded r inv values also increases to 0.01 < r inv < 0.77 (0.01 < r inv < 0.78), and the m dark and α dark variations are excluded for a wider range of Z masses. The observed upper limit on the mediator mass decreases compared to the expected limit in some regions of the signal model parameter space because of a small excess with a significance of 1-2 standard deviations around m Z = 3.9 TeV. The improved exclusions from the BDT apply strictly to signal models that produce dark showers with the characteristics described in sections 4 and 7.

Summary
We present the first collider search for resonant production of dark matter from a strongly coupled hidden sector. The search uses proton-proton collision data collected with the CMS detector in 2016-2018, corresponding to an integrated luminosity of 138 fb −1 at a center-of-mass energy of 13 TeV. The signal model introduces a dark sector with multiple flavors of dark quarks that are charged under a dark confining force and form stable and unstable dark hadrons. The stable dark hadrons constitute dark matter candidates, while the unstable dark hadrons decay promptly to standard model (SM) quarks, forming collimated sprays of both invisible and visible particles known as "semivisible" jets. The hidden sector communicates with the SM via a leptophobic Z boson mediator. We consider variations in several parameters of the hidden sector signal model: the Z mass, m Z ; the dark hadron mass, m dark ; the fraction of stable, invisible dark hadrons, r inv ; and the running coupling of the dark force, α dark .
We pursue a dual strategy to maximize both generality and sensitivity. The general version of the search, which uses only event-level kinematic variables, excludes models with 1.5 < m Z < 4.0 TeV and 0.07 < r inv < 0.53 at 95% confidence level (CL), depending on the other signal model parameters. The more sensitive version of the search employs a boosted decision tree (BDT) to discriminate between signal and background jets. With the BDT-based search, the 95% CL exclusion limits extend to 1.5 < m Z < 5.1 TeV and 0.01 < r inv < 0.77, for wider ranges of the other signal model parameters. Depending on the mediator mass, all variations considered for m dark and α dark can be excluded. These improvements indicate the power of machine learning techniques to separate dark sector signals from large SM backgrounds.
These results complement existing searches for dijet resonances and dark matter events with missing momentum and initial-state radiation. Existing strategies did not target strongly coupled hidden sector models, which produce events with jets aligned with moderate missing transverse momentum. Compared to standard dijet searches, the backgrounds are reduced and the resolution is improved by including missing momentum in the event selection and the resonance mass reconstruction, respectively. Events with jets aligned with missing momentum are explicitly rejected from other collider dark matter searches. In addition, the use of jet substructure provides a substantial increase in model-dependent

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the following funding agencies: BMBWF and FWF (Austria); FNRS and FWO Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.