Searches for low-mass dimuon resonances

Searches are performed for low-mass dimuon resonances, $X$, produced in proton-proton collisions at a center-of-mass energy of 13 TeV, using a data sample corresponding to an integrated luminosity of 5.1 fb$^{-1}$ and collected with the LHCb detector. The $X \to \mu^+\mu^-$ decays can be either prompt-like or displaced from the proton-proton collision, where in both cases the requirements placed on the event and the assumptions made about the production mechanisms are kept as minimal as possible. The prompt-like $X$ searches explore the mass range from near the dimuon threshold up to 60 GeV, with nonnegligible $X$ widths considered above 20 GeV. The searches for displaced $X \to \mu^+\mu^-$ decays consider masses up to 3 GeV. None of the searches finds evidence for a signal and 90% confidence-level exclusion limits are placed on the $X \to \mu^+\mu^-$ cross sections, each with minimal model dependence. In addition, these results are used to place world-leading constraints on two-Higgs-doublet and hidden-valley scenarios.


Introduction
Substantial effort has been dedicated [1] to searching for a massive dark photon, $A'$, which obtains a small coupling to the electromagnetic current due to kinetic mixing between the Standard Model (SM) hypercharge and $A'$ field-strength tensors [2-9]. However, this minimal $A'$ model is not the only viable dark-sector scenario. The strongest connection to the dark sector may not arise via kinetic mixing, and the dark sector itself could be populated by additional particles that have phenomenological implications. Searches for dark photons can provide serendipitous discovery potential for other types of particles, especially vector particles that share the same production mechanisms as the minimal dark photon [10], yet many well-motivated models would have avoided detection in all previous experimental searches [11,12]. For example, hidden-valley (HV) scenarios that exhibit confinement produce a high multiplicity of light hidden hadrons from showering processes [13]. These hidden hadrons would typically decay displaced from the proton-proton collision, thus failing the criteria employed in Refs. [14,15] to suppress backgrounds due to heavy-flavor quarks [16,17]. Furthermore, the sensitivity to various model scenarios can be improved by exploiting additional signatures, e.g., the presence of a $b$-quark jet produced in association with the $X$ boson [18]. Therefore, it is desirable to perform searches that are less model dependent, including some that explore additional signatures in the event.
This article presents searches for low-mass dimuon resonances produced in proton-proton collisions at a center-of-mass energy of 13 TeV, using a data sample corresponding to an integrated luminosity of 5.1 fb$^{-1}$ and collected with the LHCb detector in 2016-2018. The $X \to \mu^+\mu^-$ decays can be either prompt-like, i.e. consistent with being prompt, or displaced from the proton-proton collision. In both cases, the requirements placed on the event and the assumptions made about the production mechanisms are kept as minimal as possible. Two variations of the prompt-like $X$ search are performed: an inclusive version, and an $X + b$ search, where the $X$ boson is required to be produced in association with a beauty quark. Two variations of the search for displaced $X \to \mu^+\mu^-$ decays are also considered: an inclusive version, and one where the $X$ boson is required to be produced promptly in the proton-proton collision. The prompt-like $X$ searches explore the mass range from near the dimuon threshold up to 60 GeV (natural units with $c = 1$ are implied throughout this article), with nonnegligible widths, $\Gamma(X)$, considered above 20 GeV. The searches for displaced $X \to \mu^+\mu^-$ decays consider masses up to 3 GeV. This analysis uses the same data sample as the LHCb minimal dark-photon search [15]; however, the searches presented here are less sensitive to the $A'$ model, since the fiducial regions and selection criteria are not optimized for that scenario.
The fiducial regions used for each search, defined in Table 1, ensure that the detector response is sufficiently model independent in the kinematic regions where results are reported. The requirements placed on the momenta, $p$, and transverse momenta, $p_T$, of the muons make them sufficiently energetic to be selected by the trigger, but not so energetic that their charges cannot be determined. Only events with at least one reconstructed proton-proton primary vertex (PV) are used in the analysis, which requires that at least five charged prompt-like particles, including the muons of the $X$ decay if this is prompt-like, are produced in the same collision as the $X$ boson. An upper limit is also placed on the number of charged particles produced in the collision, since the detector response depends on the charged-particle multiplicity. The dimuon opening angle is required to be $\alpha(\mu^+\mu^-) > 1\,(3)$ mrad in the searches for prompt-like (displaced) $X \to \mu^+\mu^-$ decays to ensure that the reconstruction efficiency factorizes into the product of the two individual muon efficiencies, which subsequently leads to an upper limit on $p_T(X)$ to remove regions where the $\alpha(\mu^+\mu^-)$ requirement is rarely satisfied. The $X + b$ analysis is performed using jets clustered with the anti-$k_T$ algorithm [19] using a distance parameter $R = 0.5$. The jets are required to have $20 < p_T(\mathrm{jet}) < 100$ GeV and a pseudorapidity in the range $2.2 < \eta(\mathrm{jet}) < 4.2$ so that the $b$-tagging efficiency is nearly uniform within the fiducial region. Finally, the displaced $X \to \mu^+\mu^-$ secondary vertex (SV) is required to be transversely displaced from the PV in the range $12 < \rho_T < 30$ mm, which results in minimal dependence on the SV location distribution. For example, this requirement leads to the efficiency being nearly independent of the $X$ lifetime, $\tau(X)$; however, the probability that the $X$ boson decays in this region is strongly dependent on $\tau(X)$.

Table 1, requirements applied in all searches:
  $p_T(\mu) > 0.5$ GeV
  $10 < p(\mu) < 1000$ GeV
  $2 < \eta(\mu) < 4.5$
  $\sqrt{p_T(\mu^+)\,p_T(\mu^-)} > 1$ GeV
  $5 \le n_{\rm charged}(2 < \eta < 4.5,\ p > 5\ \mathrm{GeV}) < 100$ (from same PV as $X$)
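The muon-level and multiplicity requirements quoted from Table 1 can be written as a simple predicate. The following is a minimal sketch, not the analysis code: the function name and the dictionary layout are hypothetical, the product requirement is taken as $p_T(\mu^+)p_T(\mu^-) > (1\ \mathrm{GeV})^2$ (the form used for the 2017-2018 trigger in Sec. 2), and the $p_T(X)$, jet, and $\rho_T$ requirements are deliberately left out.

```python
def in_common_fiducial(mu_plus, mu_minus, n_charged, opening_angle_mrad,
                       displaced=False):
    """Check the requirements common to all searches (hypothetical helper).

    mu_plus / mu_minus are dicts with keys 'pt' and 'p' (GeV) and 'eta'.
    """
    for mu in (mu_plus, mu_minus):
        if mu["pt"] <= 0.5:                 # pT(mu) > 0.5 GeV
            return False
        if not 10.0 < mu["p"] < 1000.0:     # 10 < p(mu) < 1000 GeV
            return False
        if not 2.0 < mu["eta"] < 4.5:       # 2 < eta(mu) < 4.5
            return False
    # Assumption: product requirement expressed as pT(mu+) pT(mu-) > (1 GeV)^2.
    if mu_plus["pt"] * mu_minus["pt"] <= 1.0:
        return False
    # 5 <= n_charged < 100, counted from the same PV as the X candidate.
    if not 5 <= n_charged < 100:
        return False
    # Opening angle above 1 mrad (prompt-like) or 3 mrad (displaced).
    min_angle = 3.0 if displaced else 1.0
    return opening_angle_mrad > min_angle
```

A candidate with two 1.2 GeV muons and a 5 mrad opening angle passes either version, while the same pair at 2 mrad passes only the prompt-like requirement.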
This article is structured as follows. The LHCb detector, trigger, and simulation are described in Sec. 2, while the offline selections used in each of the searches are discussed in Sec. 3. Section 4 presents the searches for both prompt-like and displaced $X \to \mu^+\mu^-$ decays. Section 5 discusses the efficiencies and luminosity. The model-independent cross-section results, along with their interpretations within the context of specific models, are described in Sec. 6. Section 7 provides a summary and discussion of all results.

Detector and simulation
The LHCb detector [20,21] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the proton-proton interaction region (VELO), a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of the momentum of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV. The minimum distance of a track to a PV, the impact parameter, is measured with a resolution of $(15 + 29/p_T)$ µm, where $p_T$ is in GeV. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.
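As a quick numerical illustration of the impact-parameter resolution quoted above, the parametrization $(15 + 29/p_T)$ µm can be evaluated directly. The function name is hypothetical and the snippet only restates the formula from the text.

```python
def ip_resolution_um(pt_gev):
    """Impact-parameter resolution in micrometres for a track with pT in GeV."""
    return 15.0 + 29.0 / pt_gev


# The resolution degrades quickly toward the lowest muon pT used in the searches.
for pt in (0.5, 1.0, 5.0):
    print(f"pT = {pt:4.1f} GeV -> sigma_IP = {ip_resolution_um(pt):5.1f} um")
```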
The online event selection is performed by a trigger, which consists of a hardware stage followed by a two-level software stage. In between the two software stages, an alignment and calibration of the detector is performed in near real-time and the results are used in the trigger [22]. The same alignment and calibration information is propagated to the offline reconstruction, ensuring consistent and high-quality particle identification information between the trigger and offline software. The identical performance of the online and offline reconstruction offers the opportunity to perform physics analyses directly using candidates reconstructed in the trigger [23,24], which the prompt-like $X \to \mu^+\mu^-$ searches exploit.
At the hardware trigger stage, events are required to have a dimuon pair with $p_T(\mu^+)p_T(\mu^-) \gtrsim (1.5\ \mathrm{GeV})^2$ and at most 900 hits in the scintillating-pad detector, which prevents high-occupancy events from dominating the processing time in the software trigger stages. The latter requirement is the main motivation for defining the maximum charged-particle multiplicity in Table 1. In the software stage, where the $p_T$ resolution is substantially improved compared to the hardware stage, $X \to \mu^+\mu^-$ candidates are built from two oppositely charged tracks that form a good-quality vertex and satisfy stringent muon-identification criteria. All searches require $p_T(X) > 1$ GeV and $2 < \eta(\mu) < 4.5$. The prompt-like $X \to \mu^+\mu^-$ searches use muons that are consistent with originating from the PV, with $p_T(\mu) > 1.0$ GeV and momentum $p(\mu) > 20$ GeV in the 2016 data sample, and $p_T(\mu) > 0.5$ GeV, $p(\mu) > 10$ GeV, and $p_T(\mu^+)p_T(\mu^-) > (1.0\ \mathrm{GeV})^2$ in 2017-2018. The searches for displaced $X \to \mu^+\mu^-$ decays use muons with $p_T(\mu) > 0.5$ GeV and $p(\mu) > 10$ GeV that are inconsistent with originating from any PV, and require $2 < \eta(X) < 4.5$. In addition, the search for a long-lived promptly produced $X$ boson requires a decay topology consistent with a dimuon resonance originating from a PV.
Simulation is required to model the effects of the detector acceptance and its response to $X \to \mu^+\mu^-$ decays. In the simulation, $pp$ collisions are generated using Pythia [25] with a specific LHCb configuration [26]. Decays of unstable particles are described by EvtGen [27], in which final-state radiation is generated using Photos [28]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [29] as described in Ref. [30]. Simulation is also used to place constraints on specific models. Prompt limits for light-pseudoscalar models are set with next-to-next-to-leading order cross-sections from Higlu [31] using the Nnpdf3.0 PDF set [32], branching fractions from Hdecay [33], and fiducial acceptances from Pythia [34]. Displaced limits for HV models are set with Pythia [34] using a running $\alpha_{HV}$ scheme [35], and couplings from Darkcast [10].

Selection
The selection criteria are largely applied online in the trigger and most are the same as those used in the LHCb minimal dark-photon search [15]. The prompt-like dimuon sample selected by the trigger described in Sec. 2 predominantly consists of genuine prompt dimuon pairs. The only selection criteria applied offline for the inclusive prompt-like $X \to \mu^+\mu^-$ search, $p_T(X) < 50$ GeV and $\alpha(\mu^+\mu^-) > 1$ mrad, are included in the definition of the fiducial region. In addition to these, the search for a prompt-like $X$ boson produced in association with a beauty quark requires at least one $b$-tagged jet with $p_T(\mathrm{jet}) > 20$ GeV and $2.2 < \eta(\mathrm{jet}) < 4.2$. The jets are formed by clustering charged and neutral particle-flow candidates [36] using the anti-$k_T$ clustering algorithm as implemented in FastJet [37]. The $b$-tagging requires an SV in the jet that satisfies the criteria given in Ref. [38]. Figure 1 shows the $m(\mu^+\mu^-)$ distributions of both prompt-like data samples in bins of width $\sigma[m(\mu^+\mu^-)]/2$, where $\sigma[m(\mu^+\mu^-)]$ denotes the dimuon invariant-mass resolution, which varies from 0.6 MeV near threshold to 0.6 GeV at $m(\mu^+\mu^-) = 60$ GeV. In the searches for displaced $X \to \mu^+\mu^-$ decays, contamination from prompt particles is negligible due to a stringent criterion applied in the trigger that requires muons to be inconsistent with originating from any PV. Furthermore, the fiducial region requires a transverse displacement from the PV of $12 < \rho_T < 30$ mm, which is applied offline in both searches for displaced $X \to \mu^+\mu^-$ decays and highly suppresses the background from $b$-hadron decay chains that produce two muons. Therefore, the dominant background contributions are due to material interactions in the VELO, e.g. photons that convert into $\mu^+\mu^-$ pairs, and from $K^0_S \to \pi^+\pi^-$ decays, where both pions are misidentified as muons, which is the dominant background in the search for $K^0_S \to \mu^+\mu^-$ decays [39].
A p-value is assigned to the material-interaction hypothesis for each displaced $X \to \mu^+\mu^-$ candidate using properties of the SV and muon tracks, along with a high-precision three-dimensional material map produced from a data sample of secondary hadronic interactions [40]. The same mass-dependent requirement used in Ref. [15] is applied to the p-values in this analysis, which highly suppresses the material-interaction background. Figure 2 shows the $m(\mu^+\mu^-)$ distributions of both displaced-dimuon data samples.

Figure 2: Displaced dimuon mass spectra showing the (black) inclusive and (red) promptly produced samples. The grey box shows the region vetoed due to the large doubly misidentified $K^0_S$ background, whose low-mass tail extends into the search region.

Signal searches
The signal-search strategies and methods employed are similar to those used in Ref. [15]. The dimuon mass spectra are scanned in around 6000 steps of about $\sigma[m(\mu^+\mu^-)]/2$ searching for $X \to \mu^+\mu^-$ contributions. For $m(X) < 20$ GeV, the data are binned in $p_T(X)$ and each $p_T$ bin is searched independently for each $m(X)$ hypothesis, whereas at higher masses $p_T$ bins are not necessary since both the resolution and efficiency are nearly independent of $p_T(X)$. All searches use the profile likelihood method to determine the local p-values and the confidence intervals on the signal yields. The trial factors are obtained using pseudoexperiments in each search. The confidence intervals are defined using the bounded likelihood approach [41], which involves taking $\Delta \log \mathcal{L}$ relative to zero signal, rather than the best-fit value, if the best-fit signal value is negative. This approach enforces that only physical (nonnegative) upper limits are placed on the signal yields, and prevents defining exclusion regions that are much better than the experimental sensitivity in cases where a large deficit in the background yield is observed. The signal $m(\mu^+\mu^-)$ distributions are well modeled by a Gaussian function, whose resolution is determined with 10% precision using a combination of simulated $X \to \mu^+\mu^-$ decays and the observed $p_T$-dependent widths of the large known resonance peaks present in the data. The mass-resolution uncertainty is included in the profile likelihood. The fit strategy used in the prompt-like $X \to \mu^+\mu^-$ searches below 20 GeV, which is the same as in Refs. [14,15], was first introduced in Ref. [42]. At each $m(X)$ hypothesis, a binned extended maximum-likelihood fit is performed in a $\pm 12.5\,\sigma[m(\mu^+\mu^-)]$ window around the $m(X)$ value. Near the dimuon threshold, the energy released in the decay, $Q$, is used instead of the mass because it is easier to model.
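The bounded likelihood construction described above can be illustrated with a toy case. The sketch below assumes a purely Gaussian likelihood for the signal yield (best-fit value `s_hat`, uncertainty `sigma`) and a threshold of 2.706 on $-2\Delta\log\mathcal{L}$; both are illustrative assumptions, not the profile likelihood or the threshold used in the analysis, and the function name is hypothetical.

```python
import math


def bounded_upper_limit(s_hat, sigma, delta_2nll=2.706):
    """Upper limit on a signal yield from a toy Gaussian likelihood.

    -2 log L(s) = ((s - s_hat) / sigma)^2 up to a constant. The reference
    point for Delta log L is the best fit if it is nonnegative, and zero
    signal otherwise (the bounded-likelihood prescription).
    """
    s_ref = max(s_hat, 0.0)
    # Offset of the reference point from the unconstrained minimum.
    offset = (s_ref - s_hat) ** 2
    # Solve ((s - s_hat)^2 - offset) / sigma^2 = delta_2nll for s > s_ref.
    return s_hat + math.sqrt(offset + delta_2nll * sigma ** 2)


# A deficit (s_hat < 0) still yields a positive limit, whereas measuring the
# interval from the negative best fit would give a negative limit here.
print(bounded_upper_limit(-3.0, 1.0))
print(-3.0 + math.sqrt(2.706))
```

In this toy, the limit for a negative best-fit yield shrinks only slowly as the deficit grows, which is the behaviour the text attributes to the bounded approach.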
The background model for each fit window takes as input a large set of potential components, then the data-driven model-selection process of Ref. [42] is performed, whose uncertainty is included in the profile likelihood following Ref. [43]. Specifically, the method labeled aic-o in Ref. [42] is used, where the log-likelihood of each background model is penalized for its complexity (number of parameters). The confidence intervals are obtained from the profile likelihoods, including the penalty terms, where the model index is treated as a discrete nuisance parameter, as originally proposed in Ref. [43]. In the $X + b$ search there are not many candidates near the dimuon threshold. Therefore, just in this region, the counting-experiment-based method of Ref. [44] is used, which is also used in the searches for displaced $X \to \mu^+\mu^-$ decays and described in detail below.
In this analysis, the set of possible background components is the same as in Ref. [15] and includes all Legendre modes up to tenth order at every $m(X)$. Additionally, dedicated background components are included for sizable narrow SM resonance contributions. The use of 11 Legendre modes adequately describes every doubly misidentified peaking background that contributes at a significant level; therefore, these do not require dedicated background components. In mass regions where such complexity is not required, the data-driven model-selection procedure reduces the complexity, which increases the sensitivity to a potential signal contribution. Therefore, the impact of the background-model uncertainty on the size of the confidence intervals is mass dependent, though on average it is about 30%. As in Ref. [42], all fit regions are transformed onto the interval $[-1, 1]$, where the $m(X)$ value is mapped to zero. After such a transformation, the signal model is (approximately) an even function; therefore, odd Legendre modes are orthogonal to the signal component, which means that the presence of odd modes has minimal impact on the variance of the observed signal yield. In the prompt-like fits, all odd Legendre modes up to ninth order are included in every background model, while even modes must be selected for inclusion in each fit by the data-driven method of Ref. [42].
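The orthogonality argument above can be checked numerically. This sketch assumes an idealized Gaussian signal centred at zero on $[-1, 1]$ with width $1/12.5$ (the fit window is $\pm 12.5\,\sigma$), and uses a plain Riemann sum for the overlap integrals; the function and variable names are illustrative only.

```python
import numpy as np
from numpy.polynomial import legendre

# Fit window mapped onto [-1, 1]; m(X) sits at zero and the window spans
# +-12.5 sigma, so the Gaussian width in these units is 1 / 12.5.
x = np.linspace(-1.0, 1.0, 20001)
dx = x[1] - x[0]
signal = np.exp(-0.5 * (x * 12.5) ** 2)


def overlap(order):
    """Overlap integral of the Legendre mode of given order with the signal."""
    coeffs = np.zeros(order + 1)
    coeffs[order] = 1.0
    mode = legendre.legval(x, coeffs)
    return float(np.sum(mode * signal) * dx)


for order in range(5):
    print(order, overlap(order))
```

The odd orders give overlaps consistent with zero, while the even orders do not, which is why only the even modes compete with the signal and need to be selected by the data.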
Regions in the mass spectrum with large SM resonance contributions are vetoed in the searches for prompt-like $X \to \mu^+\mu^-$ decays. Furthermore, the region near the $\eta'$ meson is treated uniquely. Since it is not possible to distinguish between $X \to \mu^+\mu^-$ and possible $\eta' \to \mu^+\mu^-$ contributions at $m(\eta')$, the p-values near this mass are ignored. The small observed excess at $m(\eta')$ is simply absorbed into the signal yield when setting the limits, which is conservative in that the $\eta' \to \mu^+\mu^-$ contribution weakens the constraints on $X \to \mu^+\mu^-$ decays. Figure 3 shows the signed local significances for all $m(X)$ below 20 GeV for both prompt-like $X \to \mu^+\mu^-$ searches. The largest local excess in the inclusive search in this mass region is $3.7\sigma$ at 349 MeV in the $3 < p_T(X) < 5$ GeV bin; however, its neighboring $p_T$ bin at this mass has a small deficit and the global significance is only $\approx 1\sigma$. Similarly, the largest local excess in the $X + b$ search below 20 GeV is $3.1\sigma$ at 2424 MeV in the $10 < p_T(X) < 20$ GeV bin, though again, the neighboring $p_T$ bins both have deficits at the same mass, and the global significance is below $1\sigma$. Therefore, no significant excess is found in either prompt-like spectrum for $m(X) < 20$ GeV.
In the $20 < m(X) < 60$ GeV region, the background is nearly monotonic, which permits the use of a simplified fit strategy. The entire $12 < m(\mu^+\mu^-) < 80$ GeV region is fitted when considering all $m(X)$ values above 20 GeV. The background model comprises three falling power-law terms and an eighth-order polynomial that collectively describe the Drell-Yan, heavy-flavor, and misidentified-background contributions, along with a rising power-law term to describe the low-mass tail of the $Z$ boson, where all parameters are free to vary. This background model is validated by studying simulated Drell-Yan dimuon production, same-sign dimuon data, which predominantly consist of heavy-flavor and misidentification backgrounds, and candidates in the data sample itself above the search region. Unlike at lower masses, nonnegligible widths are considered. At each $m(X)$, a scan is performed covering the range $0 \le \Gamma(X) \le 3$ GeV. The signals are modeled by a Gaussian resolution function convolved with the modulus of a Breit-Wigner function. Figure 4 shows the signed local significances for the $m(X) > 20$ GeV region for both prompt-like $X \to \mu^+\mu^-$ searches. The largest local excess in the inclusive search in this mass region is $3.2\sigma$ at $m(X) = 36$ GeV for $\Gamma(X) = 1.5$ GeV, which corresponds to a global p-value of about 11% (considering only the $m(X) > 20$ GeV mass region). In the $X + b$ search, no local significance exceeds $\approx 2\sigma$ in this mass region. Therefore, no significant excess is found in either prompt-like spectrum for $m(X) > 20$ GeV.

[...] associated beauty prompt-like $X \to \mu^+\mu^-$ searches. If the best-fit signal-yield estimator is negative, the signed significance is negative and vice versa. The grey regions are excluded either due to a nearby large QCD resonance contribution, or because the overlap of the bin with the fiducial region in Table 1 is small.
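The finite-width signal model described above, a Gaussian resolution function convolved with the modulus of a Breit-Wigner function, can be sketched numerically. The text does not specify the Breit-Wigner form, so the relativistic amplitude $1/(m^2 - m_X^2 + i\,m_X\Gamma)$ used here is an assumption, as are the function name, the grid, and the 0.36 GeV resolution; the shape is normalized over the fit range only.

```python
import numpy as np


def signal_shape(mass_grid, m_x, gamma_x, sigma):
    """Gaussian resolution convolved with |Breit-Wigner| on a uniform grid."""
    step = mass_grid[1] - mass_grid[0]
    if gamma_x == 0.0:
        # Zero width: the signal is just the Gaussian resolution function.
        shape = np.exp(-0.5 * ((mass_grid - m_x) / sigma) ** 2)
    else:
        # Assumed form: modulus of a relativistic Breit-Wigner amplitude.
        bw = 1.0 / np.abs(mass_grid ** 2 - m_x ** 2 + 1j * m_x * gamma_x)
        # Gaussian kernel truncated at +-5 sigma, normalized to unit sum.
        half = int(np.ceil(5.0 * sigma / step))
        offsets = np.arange(-half, half + 1) * step
        kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
        kernel /= kernel.sum()
        shape = np.convolve(bw, kernel, mode="same")
    # Normalize to unit area over the fitted mass range.
    return shape / (shape.sum() * step)


grid = np.linspace(12.0, 80.0, 6801)   # 12 < m < 80 GeV fit range, 10 MeV steps
wide = signal_shape(grid, 36.0, 1.5, 0.36)
narrow = signal_shape(grid, 36.0, 0.0, 0.36)
print(grid[np.argmax(wide)], wide.max(), narrow.max())
```

Scanning `gamma_x` from 0 to 3 GeV at fixed `m_x` reproduces the kind of width scan described in the text, with the peak broadening and lowering as the width grows.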
Motivated by the possible excess seen by CMS [45] in $X + b\bar{b}$ events, a dedicated search for a resonance with $27 < m(X) < 30$ GeV and $0.5 < \Gamma(X) < 3.0$ GeV is performed in the subset of the $X + b$ data sample that contains at least two $b$-tagged jets. The mass spectrum in the range 20-40 GeV is fitted using a model consisting of a second-order polynomial background and a signal whose mass and width are free to vary within the $m(X)$ and $\Gamma(X)$ ranges specified above. Figure 5 shows the result of this fit. The best-fit signal yield is negative in the region considered; therefore, no evidence for a signal is observed. Using the efficiency and luminosity from Sec. 5, and their associated uncertainties, the upper limits on the $X(\mu^+\mu^-) + b\bar{b}$ cross section in the $m(X)$ and $\Gamma(X)$ regions considered are no larger than $15\ \mathrm{fb} \times \Gamma(X)/\mathrm{GeV}$.
The fit strategy used in the searches for displaced $X \to \mu^+\mu^-$ decays below the $K^0_S$ mass is also the same as in Refs. [14,15]. Binned extended maximum-likelihood fits are performed to the $Q$ spectrum in each $p_T$ bin. The region near the $K^0_S$ mass is vetoed to avoid the sizable background from doubly misidentified $K^0_S \to \pi^+\pi^-$ decays. The expected photon-conversion contribution is derived from a sample of candidates that are consistent with a photon originating from a PV. Two large control samples are used to develop and validate the modeling of the $K^0_S$ and remaining material-interaction contributions: dimuon candidates that fail, but nearly satisfy, the stringent muon-identification criteria; and a sample of dimuon candidates that is rejected by the material-interaction criterion. Both contributions are well modeled by second-order polynomials in $Q$ below the $K^0_S$ veto region. The material-interaction contribution, apart from the dedicated photon-conversion component, is not needed in the search that requires a decay topology consistent with an $X$ boson originating from a PV.
The fit strategy used in the searches for displaced $X \to \mu^+\mu^-$ decays above the $K^0_S$ veto region, specifically, in the $0.5 < m(X) < 3.0$ GeV mass range, is the same as used in the LHCb search for hidden-sector bosons produced in $B^0 \to K^{(*)}X(\mu^+\mu^-)$ decays [46,47]. This strategy was first introduced in Ref. [44]. Since no sharp features are expected in the background in this region, and due to the small bin occupancies, the background is estimated by interpolating the yields in the sidebands starting at $\pm 3\sigma[m(\mu^+\mu^-)]$ from $m(X)$. The statistical test at each mass is based on the profile likelihood ratio of Poisson-process hypotheses with and without a signal contribution. The uncertainty on the background interpolation is modeled by a Gaussian term in the likelihood. Figure 6 shows the signed local significances for both searches for displaced $X \to \mu^+\mu^-$ decays. The largest local excess in the search for a promptly produced long-lived $X$ boson is $2.8\sigma$, which occurs at 280 MeV in the $2 < p_T(X) < 3$ GeV bin. The largest local excess in the inclusive search for displaced $X \to \mu^+\mu^-$ decays is $3.1\sigma$ at 604 MeV in the $3 < p_T(X) < 5$ GeV bin. Both of these correspond to global excesses below $1\sigma$; therefore, no significant excess is found in either search for displaced $X \to \mu^+\mu^-$ decays.
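The counting-experiment test described above can be sketched for a single mass window. This toy assumes one Poisson count `n_obs` in the signal window, a background estimate `b_est` from the sideband interpolation, and a Gaussian constraint of width `sigma_b` on that estimate; the function names are hypothetical and the real analysis may differ in detail. The background is profiled in closed form.

```python
import math


def _poisson_log(n_obs, mu):
    """Poisson log-likelihood up to a constant, safe for n_obs = 0."""
    if n_obs == 0:
        return -mu
    return n_obs * math.log(mu) - mu


def profiled_background(n_obs, signal, b_est, sigma_b):
    """Background that maximizes the likelihood for a fixed signal yield.

    Solves n/(s+b) - 1 - (b - b_est)/sigma_b^2 = 0, a quadratic in mu = s + b.
    """
    var = sigma_b ** 2
    half_lin = 0.5 * (signal + b_est - var)
    mu = half_lin + math.sqrt(half_lin ** 2 + n_obs * var)
    return mu - signal


def signed_significance(n_obs, b_est, sigma_b):
    """Signed sqrt of the profile-likelihood-ratio test statistic for s = 0."""
    # Unconstrained best fit: s_hat = n_obs - b_est and b_hat = b_est.
    log_l_free = _poisson_log(n_obs, float(n_obs))
    b_null = profiled_background(n_obs, 0.0, b_est, sigma_b)
    log_l_null = (_poisson_log(n_obs, b_null)
                  - 0.5 * ((b_null - b_est) / sigma_b) ** 2)
    q0 = max(2.0 * (log_l_free - log_l_null), 0.0)
    sign = 1.0 if n_obs >= b_est else -1.0
    return sign * math.sqrt(q0)


print(signed_significance(10, 5.0, 1.0))   # excess -> positive
print(signed_significance(2, 5.0, 1.0))    # deficit -> negative
```

A larger `sigma_b` lowers the significance of a given excess, which is how the interpolation uncertainty enters the result in this toy.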

Efficiency and luminosity
The $X \to \mu^+\mu^-$ yields are corrected for detection efficiency, which is determined as the product of the trigger, reconstruction, and selection efficiencies. The trigger efficiency is measured as a function of $p_T(\mu^+)p_T(\mu^-)$ using a displaced $J/\psi$ calibration sample. Events selected by the hardware trigger independently of the $J/\psi$ candidate, e.g. due to the presence of a high-$p_T$ hadron, are used to determine the trigger efficiency directly from the data. The muon reconstruction efficiency is obtained from simulation in bins of $[p(\mu), \eta(\mu)]$. Scale factors that correct for discrepancies between the data and simulation are determined using a data-driven tag-and-probe approach on an independent sample of $J/\psi \to \mu^+\mu^-$ decays [48]. The contribution to the selection efficiency from the muon-identification performance is measured in bins of $[p_T(\mu), \eta(\mu)]$ using a highly pure calibration sample of $J/\psi \to \mu^+\mu^-$ decays. Finally, the contributions from the vertex-quality and prompt-like muon criteria are determined from simulation, and validated using a calibration sample of prompt QCD resonance decays to the $\mu^+\mu^-$ final state.
The uncertainty due to the methods used to determine each of these components of the total efficiency is assessed by repeating the data-based efficiency studies on simulated events, where the difference between the true and efficiency-corrected yields in kinematic bins is used to determine the systematic uncertainty. These uncertainties are in the 2-5% range, depending on $X$-boson kinematics. Additional uncertainties arise due to the unknown production mechanisms of the $X$ bosons. The muon reconstruction and identification efficiencies depend on the charged-particle multiplicity. The corresponding systematic uncertainty is determined to be 5%, which covers both minimal and maximal charged-particle multiplicities defined in Table 1 at the $2\sigma$ level. The unknown kinematic distributions in both $p_T$ and $\eta$ within the wide $p_T$ bins used in the analysis lead to sizable uncertainties. The variation in the efficiencies across the kinematic regions allowed in each bin is used to determine bin-dependent uncertainties that vary from 10 to 30%.
The $X + b$ analysis uses the SV-based $b$-tagging method described in detail in Ref. [38], though without placing any criteria on the boosted decision tree algorithms. The $b$-tagging efficiency is estimated to be $(65 \pm 7)$%, where the uncertainty covers both the variation of the $b$-tagging efficiency across the $b$-jet fiducial region and possible data-simulation discrepancies. An additional uncertainty arises since the efficiency for a $b$-tagged jet in the fiducial region to be reconstructed with $p_T > 20$ GeV depends on the unknown underlying jet $p_T$ spectrum. The detector response to jets is studied using the $p_T$-balance distribution of $p_T(\mathrm{jet})/p_T(Z)$ in nearly back-to-back $Z$-boson+jet events using the same data-driven technique as in Ref. [36]. Based on this study, and considering jet $p_T$ spectra as soft as QCD di-$b$-jet production and hard enough to result in negligible inefficiency, this efficiency is estimated to be $(90 \pm 5)$%.
The searches for displaced $X \to \mu^+\mu^-$ decays must also account for effects that arise due to the displacement of the SV from the PV. The relative efficiency of displaced compared to prompt-like dimuon production is obtained as a function of $m(X)$ and $p_T(X)$ by resampling prompt $X \to \mu^+\mu^-$ candidates as displaced $X \to \mu^+\mu^-$ decays, where all displacement-dependent properties are recalculated based on the resampled SV locations. The high-precision material map produced in Ref. [40] forms the basis of the material-interaction criterion applied in the selection. This map is used to determine where each muon would hit active sensors, and thus, have recorded hits in the VELO. The resolution on the vertex location and other displacement-dependent properties varies strongly with the location of the first VELO hit on each muon track, though this dependence is largely geometric, making rescaling the resolution of prompt tracks straightforward. This approach is validated using simulation, where prompt $X \to \mu^+\mu^-$ decays are used to predict the properties of long-lived $X \to \mu^+\mu^-$ decays; these predictions are found to agree within 2% with the actual values. The efficiencies at both short and long distances, which are driven by the muon displacement criterion and the minimum number of VELO hits required to form a track, respectively, are well described. The dominant uncertainty, which arises due to limited knowledge of how radiation damage has affected the VELO performance, is estimated to be 5% by rerunning the resampling method under different radiation-damage hypotheses.
The efficiency of the material-interaction criterion is validated separately using two control samples. The predicted efficiency for an $X$ boson with the same mass and lifetime as the $K^0_S$ meson is compared to the efficiency observed in a control sample of $K^0_S$ decays. The predicted and observed efficiencies agree to 1%. Additionally, in Ref. [40] the expected performance of the material-interaction criterion was shown to agree with the performance observed in a control sample of photon conversions to the $O(10^{-4})$ level. Finally, the distribution of the SV locations is unknown, which leads to a 10% uncertainty in the efficiency determined by comparing the efficiency of an $X$ boson that rarely survives long enough to enter the decay fiducial region to an extremely long-lived $X$ boson.
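The two lifetime extremes compared above can be made concrete with the probability that an $X$ boson decays inside the $12 < \rho_T < 30$ mm fiducial region. The sketch assumes a fixed $p_T(X)$, an exponential proper-time distribution, and a mean transverse decay length of $(p_T/m)\,c\tau$; the function name and the example values are illustrative and not taken from the analysis.

```python
import math


def decay_probability(mass_gev, pt_gev, ctau_mm,
                      rho_min_mm=12.0, rho_max_mm=30.0):
    """Probability that the transverse decay distance lies in the fiducial range."""
    # Mean transverse flight distance for an exponential decay law.
    mean_rho_mm = (pt_gev / mass_gev) * ctau_mm
    return (math.exp(-rho_min_mm / mean_rho_mm)
            - math.exp(-rho_max_mm / mean_rho_mm))


# Short-lived, intermediate, and very long-lived hypotheses at m = 1 GeV, pT = 5 GeV.
for ctau in (0.01, 4.0, 1.0e4):
    print(ctau, decay_probability(1.0, 5.0, ctau))
```

The probability is strongly suppressed for short lifetimes and falls as roughly $(\rho_{\max} - \rho_{\min})$ divided by the mean decay length for very long lifetimes, consistent with the strong $\tau(X)$ dependence noted in the Introduction.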
Most of the data used in this analysis are from data-taking periods that do not yet have fully calibrated luminosities. Therefore, the efficiency-corrected yield of $Z/\gamma^* \to \mu^+\mu^-$ decays observed in the data sample, together with the corresponding high-precision LHCb cross-section measurement made using 2015 data [49], is used to infer the luminosity. A small correction factor is obtained from Pythia 8 to account for the different fiducial regions. This luminosity determination is validated by also determining the $\Upsilon(1S)$ differential cross section from this data sample and comparing the results to those published by LHCb using the 2015 data sample [50]. The different fiducial region is again corrected for using a scale factor obtained from Pythia 8. The results are found to agree to $\approx 5$% in each $p_T$ bin, which is assigned as a systematic uncertainty and combined with the 4% luminosity uncertainty from Ref. [49] to obtain the total uncertainty on the luminosity of this data sample. Based on both of these studies, the luminosity is determined to be $5.1 \pm 0.3$ fb$^{-1}$. The minimal dark-photon search [15], which used the same data sample but did not require knowledge of the luminosity, quotes an uncalibrated luminosity value that is 7% larger. The efficiency corrections used to infer the luminosity are highly correlated to those used to correct the observed $X \to \mu^+\mu^-$ yields, which is accounted for when determining the total normalization uncertainties.
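The quoted luminosity uncertainty can be cross-checked with simple arithmetic. The text says the 5% and 4% uncertainties are combined but not how, so the sketch below assumes they are independent and added in quadrature.

```python
import math

luminosity_fb = 5.1           # central value from the text, in fb^-1
validation_unc = 0.05         # agreement of the Upsilon(1S) cross-check
reference_unc = 0.04          # luminosity uncertainty quoted from Ref. [49]

# Assumption: independent uncertainties combined in quadrature.
relative_unc = math.hypot(validation_unc, reference_unc)
absolute_unc_fb = luminosity_fb * relative_unc

print(f"relative: {relative_unc:.3f}, absolute: {absolute_unc_fb:.2f} fb^-1")
```

Under this assumption the relative uncertainty is about 6.4%, which rounds to the quoted 0.3 fb$^{-1}$.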

Cross-section results
The upper limits on the signal yields obtained in Sec. 4 are normalized using the efficiencies and luminosity described in Sec. 5. The systematic uncertainties on the signal yield, efficiency, and luminosity are included in the profile likelihood when determining the cross-section upper limits. These uncertainties are described in detail in Secs. 4 and 5, and summarized in Table 2. The resulting upper limits at 90% confidence level on $\sigma(X \to \mu^+\mu^-)$ for all searches are shown in Figs. 7-9, and provided numerically in Ref. [51].
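The normalization step described above amounts to dividing a yield limit by the product of efficiency and luminosity. The sketch below ignores the systematic uncertainties, which the analysis includes through the profile likelihood; the function name, the yield limit of 51 candidates, and the dimuon efficiency of 0.5 are hypothetical placeholders, while the 65% and 90% jet-related efficiencies and the luminosity are the values quoted in Sec. 5.

```python
def cross_section_limit_fb(yield_limit, efficiency, luminosity_fb=5.1):
    """Convert an upper limit on a signal yield into a cross-section limit in fb."""
    return yield_limit / (efficiency * luminosity_fb)


# Inclusive-style example with a placeholder dimuon efficiency.
dimuon_eff = 0.5                       # hypothetical value, for illustration only
print(cross_section_limit_fb(51.0, dimuon_eff))

# X + b example: fold in the b-tagging and jet-reconstruction efficiencies.
xb_eff = dimuon_eff * 0.65 * 0.90
print(cross_section_limit_fb(51.0, xb_eff))
```

The additional efficiency factors make the X + b cross-section limit weaker than the inclusive one for the same yield limit.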
The model-independent limits in Figs. 7-8 can be used to place constraints on any model that would produce a prompt-like low-mass dimuon resonance within the fiducial region of Table 1. For example, models where a complex scalar singlet is added to the two-Higgs-doublet model (2HDM) potential often feature a light pseudoscalar boson that can decay into the dimuon final state; see, e.g., Ref. [18]. References [52,53] considered the scenario where the pseudoscalar boson acquires all of its couplings to SM fermions through its mixing with the Higgs doublets; the corresponding $X$-$H$ mixing angle is denoted as $\theta_H$. Figure 10 shows that world-leading constraints are placed on $\theta_H$ by the prompt-like $\sigma(X \to \mu^+\mu^-)$ limits shown in Figs. 7-8. Furthermore, assuming the $X + b\bar{b}$ topology produced by this type of model permits direct comparison with the excess seen by CMS in this final state [45]. For this scenario, the $X + b$ limits from Fig. 8 are about 20 times lower than the excess observed by CMS.
The limits on displaced $X \to \mu^+\mu^-$ decays in Fig. 9 can also be used to place constraints on specific models. One example is HV scenarios that exhibit confinement, which result in a large multiplicity of light hidden hadrons from showering processes [13]. These hidden hadrons typically have low $p_T$ and decay displaced from the proton-proton collision. Figure 11 shows the limits placed on this type of HV scenario by the search for displaced $X \to \mu^+\mu^-$ decays. These are the most stringent constraints to date. Specifically, constraints are placed on the kinetic-mixing strength between the photon and a heavy HV boson, $Z_{HV}$, with photon-like couplings. The kinematics of the hidden hadrons depend upon the average HV hadron multiplicity, $N_{HV}$, and are largely independent of the model parameter space. In Fig. 11, $N_{HV}$ is fixed at $\approx 10$ for all hidden hadron masses. These are the first results that constrain the kinetic-mixing strength to be less than unity in this mass region.

Figure 10: Upper limits at 90% confidence level on the $X$-$H$ mixing angle, $\theta_H$, for the 2HDM scenario discussed in the text (blue) from this analysis compared with existing limits from (red) BaBar [54], (green) CMS Run 1 [55], (magenta) CMS Run 2 [56] and (yellow) LHCb Run 1 [57].

Summary
In summary, searches are performed for low-mass dimuon resonances produced in proton-proton collisions at a center-of-mass energy of 13 TeV using a data sample corresponding to an integrated luminosity of 5.1 fb$^{-1}$ collected with the LHCb detector. The $X \to \mu^+\mu^-$ decays can be either prompt-like or displaced from the proton-proton collision, where in both cases the requirements placed on the event and the assumptions made about the production mechanisms are kept as minimal as possible. Two variations of the prompt-like $X$ search are performed: an inclusive version, and one where the $X$ boson is required to be produced in association with a beauty quark. Two variations of the search for displaced $X \to \mu^+\mu^-$ decays are also considered: an inclusive version, and one where the $X$ boson is required to be produced promptly in the proton-proton collision. The prompt-like $X$ searches explore the mass range from near the dimuon threshold up to 60 GeV, with nonnegligible $X$ widths considered above 20 GeV. The searches for displaced $X \to \mu^+\mu^-$ decays consider masses up to 3 GeV. None of the searches finds evidence for a signal, and 90% confidence-level exclusion limits are placed on the $X \to \mu^+\mu^-$ cross sections, each with minimal model dependence.