Phenomenology of a very light scalar (100 MeV $< m_h <$ 10 GeV) mixing with the SM Higgs

In this paper we investigate the phenomenology of a very light scalar, $h$, with mass 100 MeV $<m_h<$ 10 GeV, mixing with the SM Higgs. As a benchmark model we take the real singlet scalar extension of the SM. We point out apparently unresolved uncertainties in the branching ratios and lifetime of $h$ in a crucial region of parameter space for LHC phenomenology. Bounds from LEP, meson decays and fixed target experiments are reviewed. We also examine prospects at the LHC. For $m_h \lesssim m_B$ the dominant production mechanism is via meson decay; our main result is the calculation of the differential $p_T$ spectrum of $h$ scalars originating from B mesons and the subsequent prediction of up to thousands of moderate (triggerable) $p_T$ displaced dimuons possibly hiding in the existing dataset at ATLAS/CMS or at LHCb. We also demonstrate that the subdominant $Vh$ production channel has the best sensitivity for $m_h \gtrsim m_B$ and that future bounds in this region could conceivably compete with those of LEP.

where φ ′ 0 is a pure doublet component and ρ is a mixing angle.
In scale invariant models, the Cosmological Constant (CC) is a finite and calculable parameter. Setting it to be small, consistent with observations, leads to non-trivial constraints on the parameters of the theory [6]. Applied to electroweak scale invariant models with a real singlet scalar, the CC constraint implies [4] that the effective couplings $Hhh$ and $Hhhh$ are very small and that the mass of $h$ and the angle $\rho$ are correlated. We refer to this correlation throughout as the Foot & Kobakhidze prediction.

There are also other, quite different motivations for being interested in light scalars. Bezrukov & Gorbunov [7,8] have considered a class of inflationary models which feature a light scalar; constraints from primordial density perturbations imply a relation between $m_h$ and the mixing angle. More generally, some hidden sector (which may or may not contain dark matter) might exist which couples to the singlet scalar. In this case the so-called Higgs portal quartic interaction term facilitates interactions involving the two sectors. Depending on the mass of the hidden states, invisible decays of $h$ and/or $H$ could be allowed. This occurs, for example, in the recent model of Weinberg motivated by hints of a fractional cosmic neutrino excess [9] (see also Refs. [10] for some phenomenological analyses). Another possibility is that hidden states decay back into SM particles on collider length scales after production, possibly resulting in distinctive signatures such as displaced vertices and/or high multiplicity cascade decays like those seen in hidden valley models [11].
The purpose of this paper is to survey the phenomenological consequences of a very light scalar, with mass 100 MeV $< m_h <$ 10 GeV, mixing with the SM Higgs. As a benchmark model we take the real singlet scalar extension of the SM. In this case, $h$ decays only to SM particles with a vertex factor $\sin\rho$ compared to the SM Higgs. The production cross section in all channels we consider is proportional to $\sin^2\rho$, the branching fractions are independent of $\sin^2\rho$, and the lifetime is inversely proportional to $\sin^2\rho$:

$$c\tau = \frac{c\tau_{SM}}{\sin^2\rho}, \qquad (4)$$

where $c\tau_{SM}$ is the mean decay length of a scalar of mass $m_h$ with exactly SM Higgs couplings, i.e. $h$ when $\sin^2\rho = 1$. Our approach is to explore $(m_h, \sin^2\rho)$ parameter space, which allows us to test the models of Foot & Kobakhidze and Bezrukov & Gorbunov concurrently.

In models where $h$ decays also into invisible exotic states, one may repeat our analysis in the following way: the production cross section is unaffected, the branching fraction to SM final states is altered by a (generally) mass-dependent quantity $B_{SM} \equiv {\rm Br}(h \to X_{SM})$, and the lifetime becomes shorter by a factor $B_{SM}$. One would also need to take into account the branching to invisible states for the invisible searches considered. We take $B_{SM} = 1$ in our benchmark model and comment on the $B_{SM} < 1$ case when appropriate.
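The scaling rules above can be sketched in a few lines of Python. This is only an illustration: the function name `rescale` and its arguments are hypothetical, and the inputs are whatever cross section and decay length one has for a scalar with exactly SM couplings.

```python
def rescale(sigma_sm, ctau_sm, sin2_rho, b_sm=1.0):
    """Rescale SM-coupling quantities to a mixing sin^2(rho).

    sigma_sm : production cross section for sin^2(rho) = 1 (any unit)
    ctau_sm  : mean decay length for sin^2(rho) = 1 (any unit)
    b_sm     : Br(h -> SM states); 1 in the benchmark model
    """
    if not 0.0 < sin2_rho <= 1.0:
        raise ValueError("sin2_rho must lie in (0, 1]")
    if not 0.0 < b_sm <= 1.0:
        raise ValueError("b_sm must lie in (0, 1]")
    sigma = sin2_rho * sigma_sm          # production is proportional to sin^2(rho)
    ctau = b_sm * ctau_sm / sin2_rho     # lifetime scales as 1/sin^2(rho), shortened by B_SM
    return sigma, ctau
```

For example, `rescale(100.0, 1e-5, 1e-5)` lowers the cross section by five orders of magnitude while stretching the decay length by the same factor.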
The paper is organised as follows. In Sec. II we review the properties of our benchmark scalar, discussing some large, apparently unresolved uncertainties in branching fractions and lifetime. In Sec. III we determine existing bounds from LEP, meson decays, and fixed target experiments. In Sec. IV we explore phenomenology and prospects at the LHC. Our main result is the prediction of many inclusive displaced dimuon events for $m_h \lesssim m_B$ and the observation that the subdominant $Vh$ channel has the best sensitivity for $m_h \gtrsim m_B$. We conclude in Sec. V.

II. PROPERTIES
Of interest is the value of Br(h → µ + µ − ) and the mean decay length of h. For sin 2 ρ = 1, h is a hypothetical SM Higgs boson of mass m h . We may therefore appeal to the literature on the SM Higgs before it was ruled out below 2m b [12].
The width to leptons is given by

$$\Gamma(h \to l^+ l^-) = \sin^2\rho \, \frac{m_l^2 \, m_h}{8\pi v^2} \, \beta_l^3,$$

where $\beta_l = \sqrt{1 - 4m_l^2/m_h^2}$ and $v \approx 246$ GeV. For $m_h < 2m_\mu \approx 210$ MeV, $h$ decays almost entirely to $e^+e^-$. Above $2m_\mu$ the decay to $\mu^+\mu^-$ takes over until the $2m_\pi \approx 280$ MeV threshold, where the ratio $R_{\pi\mu} = \Gamma(h \to \pi\pi)/\Gamma(h \to \mu\mu)$ was historically the subject of much debate [12][13][14][15][16][17][18][19]. In Fig. 1 we reproduce a selection of results to illustrate the large uncertainty in this mass range attributable to resonant $\pi\pi$ enhancements. We note that Ref. [19] is the most recent paper we are aware of that is dedicated to the subject. Above the $2m_K \approx 1$ GeV threshold the decay to $KK$ must be taken into account, and has been by a selection of these authors [17][18][19]. Above the $2m_\eta \approx 1.1$ GeV threshold we know of no reliable prediction. Somewhere above 2 GeV, where the energy involved in the decay is much larger than the typical quark binding energy, the perturbative spectator approach may be utilised [12] (Eq. 6). In Fig. 1 we plot this result alongside that of Ref. [12].$^1$

FIG. 1. Branching fraction for a light scalar $h$ decaying into muons and its mean decay length for $\sin^2\rho = 1$ (see Eq. 4) as predicted by a number of models (see text) [12,13,16,18,19]. The Duchovni et al. prediction is an application of the Raby & West result [15].

The large uncertainties in the range $2m_\pi < m_h < 4$ GeV are apparently unresolved. It would be interesting to know whether a more sophisticated approach is now possible which would provide new insight. A new result would be useful since, in this region, the mean decay length plays an important role in LHC phenomenology.
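As a numerical illustration of the leptonic width, the sketch below evaluates the standard tree-level expression $\Gamma = \sin^2\rho \, m_l^2 m_h \beta_l^3 / (8\pi v^2)$ with $v = 246$ GeV. The function name and the lepton mass values are our own choices for illustration; only the leptonic channels are covered, so the numbers say nothing about the hadronic uncertainties discussed above.

```python
import math

V_GEV = 246.0        # electroweak vacuum expectation value, GeV
M_MU = 0.10566       # muon mass, GeV
M_E = 0.000511       # electron mass, GeV


def width_to_leptons(m_h, m_l, sin2_rho=1.0):
    """Tree-level partial width (GeV) for h -> l+ l-; zero below threshold."""
    if m_h <= 2.0 * m_l:
        return 0.0
    beta = math.sqrt(1.0 - 4.0 * m_l**2 / m_h**2)
    return sin2_rho * m_l**2 * m_h * beta**3 / (8.0 * math.pi * V_GEV**2)


# Between the muon and pion thresholds the muon channel dominates over electrons.
m_h = 0.25  # GeV
ratio = width_to_leptons(m_h, M_MU) / width_to_leptons(m_h, M_E)
print(f"Gamma(mumu)/Gamma(ee) at m_h = {m_h} GeV: {ratio:.0f}")
```

The $m_l^2$ factor is what makes the muon channel take over from electrons as soon as it opens, despite the threshold suppression from $\beta_l^3$.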

A. LEP
Constraints from the LEP collider experiment arise from the Bjorken process e + e − → Z → Z * h.
Somewhere not far above m h = 2m µ , prompt searches become relevant. 2 The best constraints are from the LEP1 searches of ALEPH and L3 [22,23]. The 95% C.L. bounds are reproduced in Fig. 2. For reference we also show the bound from the decay-mode independent search of OPAL using the full LEP1+2 dataset [26], which applies to any light scalar regardless of lifetime, branching fractions or exotic decays. These bounds are the best available for m h > (m B − m K ) ≈ 4.8 GeV. They rule out the Foot & Kobakhidze model for m h > 2 GeV.
A short note on these results. L3 considered only hadronic $h$ decays in the $hZ^* \to h\nu\bar{\nu}$, $he^+e^-$, $h\mu^+\mu^-$ channels for $m_h > 4$ GeV. ALEPH used $hZ^* \to h\nu\bar{\nu}$, with $h \to$ hadronic or $h \to$ two/four charged prongs in the region $2m_\mu < m_h < 2m_b$. Figure 5.5 in Ref. [27] shows that, for $m_h > 5$ GeV, the efficiency of the charged prong search falls and the hadronic search dominates. Therefore, in this region, the LEP limits are unique in that the limit is set by the hadronic decay of $h$. This is attributable to the comparatively low hadronic background at an $e^+e^-$ collider and the fact that the hadrons appear as a monojet due to the boost of $h$ when $m_h \lesssim 15$ GeV. We note that LEP limits for $m_h < 2m_b$ could have been significantly improved beyond LEP1. The L3 search analysed 114 pb$^{-1}$ of data; the full LEP dataset is $\sim 3000$ pb$^{-1}$. With $\sqrt{s} > (m_Z + m_h)$, production of a real $Zh$ pair becomes significant and background falls away [28]. Instead, analyses focused on the search for the SM Higgs above the $b\bar{b}$ threshold [29]. We can only surmise that, without motivation, this area of parameter space was overlooked.

B. Meson decays
The effective $\bar{s}dh$ ($\bar{b}sh$) vertex contributing to kaon ($B$ meson) decay is obtained by integrating out the top-$W$ loop from the diagram shown in Fig. 3. This effective vertex leads to the decays $K \to \pi h \to \pi\mu^+\mu^-$ and $B \to Kh \to K\mu^+\mu^-$, with branchings given in Refs. [30,31]. There, $|\vec{p}_h|$ is found using two-body kinematics and the form factor is $F_K^2(m_h) = (1 - m_h^2/38\ {\rm GeV}^2)^{-1}$ [32]. In applying experimental constraints from these decays one must properly take into account the lifetime of $h$; either $h$ decays "promptly enough" so that the muons are reconstructed with the associated meson, or it does not and the experiment sees missing momentum. In the following, we take into account lifetime by requiring $h$ to decay within a certain (experiment-dependent) distance of the meson decay. For simplicity, and because we only expect a small correction, we do not impose any angular constraints. We stress that, where lifetime has an effect, these can only be considered order of magnitude estimates.
As discussed in Sec. II, there is large uncertainty in the lifetime of $h$ above the $\pi\pi$ threshold. We find that the dependence of the following bounds on $h$ lifetime above this threshold is small, and certainly negligible for $m_h > 400$ MeV with the existing experimental reach. We therefore present results as bounds on $\sin^2\rho$ assuming the model of Ref. [19] below 400 MeV, and unambiguously on $\sin^2\rho \times {\rm Br}(h \to l^+l^-)$ above, where $l$ corresponds to either $\mu$ or $\tau$, depending on the channel.

Kaon decays
The NA48/2 collaboration has measured ${\rm Br}(K^\pm \to \pi^\pm\mu^+\mu^-) = (9.62 \pm 0.25) \times 10^{-8}$ [33], in good agreement with the theoretical predictions $(8.7 \pm 2.8) \times 10^{-8}$ and $(12 \pm 3) \times 10^{-8}$ [34]. To derive limits on $\sin^2\rho$ we assume that a $\pi\mu\mu$ vertex is reconstructed if the $h \to \mu^+\mu^-$ decay occurs within the longitudinal vertex resolution, $\sigma_z \approx 100$ cm [35], of the kaon decay, and not reconstructed otherwise. A conservative limit on additive new physics is obtained by taking the difference between the low end of SM theoretical predictions, ${\rm Br}(K^\pm \to \pi^\pm\mu^+\mu^-)_{\rm theory} \gtrsim 6 \times 10^{-8}$, and the experimental measurement. The signal entering this limit is weighted by the factor $[1 - \exp(-\sigma_z/\gamma\beta c\tau)]$, the probability that a particle with lifetime $\tau$, speed $\beta c$ and boost $\gamma$ decays within a distance $\sigma_z$; here $\gamma\beta \approx 120$ is inherited from the kaon with momentum 60 GeV. Note that both ${\rm Br}(K \to \pi h)$ and $c\tau$ depend on $\sin^2\rho$, so that this inequality may be used to constrain $\sin^2\rho$. The obtained constraint is given by the solid blue curve in Fig. 4.

FIG. 5 (partial caption). [...] and $pp \to h \to \mu^+\mu^-$ via gluon fusion at CMS (magenta). Also shown is the level that dimuon (blue dashed) or ditau (orange dashed) bounds must reach to compete with L3 assuming branching ratios given by the perturbative approach in Sec. II.

The E949 collaboration has published a 90% C.L. upper limit on the two-body decay ${\rm Br}(K^\pm \to \pi^\pm X) \times {\rm Br}(X \to {\rm invisible})$ that is better than $10^{-9}$ between 170 MeV and 240 MeV [36]. The limit was derived assuming the decay of $X$ was detected and vetoed with 100% efficiency if $X$ decayed within the outer radius of the barrel veto, $l_{BV} \approx 145$ cm [37]. We therefore apply this limit to ${\rm Br}(K \to \pi h)$ multiplied by the probability that $h$ travels farther than $l_{BV}$ before decaying, where $\gamma\beta \sim 1$ is determined using two-body kinematics assuming a stationary kaon. This bound applies where $h$ escapes the detector; it also applies to invisibly decaying scalars if $B_{SM} < 1$. It is shown as the blue dot-dashed line in Fig. 4. Notice that, for $m_h > 2m_\mu$, this constraint results in a non-trivial excluded region in $(m_h, \sin^2\rho)$ parameter space. This is because the invisible yield can fall either by decreasing $\sin^2\rho$, thereby making the production rate smaller, or by increasing $\sin^2\rho$, thereby making the decay more prompt.
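The interplay between production rate and lifetime in the invisible kaon bound can be illustrated with a short Python sketch. It assumes the standard exponential decay law for the escape probability; the function names and the value of `ctau_sm_cm` used in the example are placeholders chosen for illustration, not values from the analysis.

```python
import math


def prob_decay_within(distance_cm, gamma_beta, ctau_cm):
    """Probability that a particle decays within `distance_cm` of its production point."""
    return 1.0 - math.exp(-distance_cm / (gamma_beta * ctau_cm))


def relative_invisible_yield(sin2_rho, ctau_sm_cm, distance_cm=145.0, gamma_beta=1.0):
    """Yield of escaping scalars, up to a sin2_rho-independent normalisation.

    Production scales as sin2_rho; the escape probability is exp(-d / (gamma*beta*c*tau))
    with c*tau = ctau_sm_cm / sin2_rho.
    """
    ctau_cm = ctau_sm_cm / sin2_rho
    escape = 1.0 - prob_decay_within(distance_cm, gamma_beta, ctau_cm)
    return sin2_rho * escape


# Placeholder SM-coupling decay length (hypothetical, for illustration only).
ctau_sm_cm = 1e-3
peak = ctau_sm_cm / 145.0  # the yield is maximal where gamma*beta*c*tau equals the veto radius
for s2 in (0.1 * peak, peak, 10.0 * peak):
    print(f"sin2_rho = {s2:.2e}: relative yield = {relative_invisible_yield(s2, ctau_sm_cm):.2e}")
```

The yield falls on both sides of the peak, which is why an upper limit on the invisible branching fraction excludes a closed band in $\sin^2\rho$ rather than everything above a threshold.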

B meson decays
The LHCb collaboration has measured ${\rm Br}(B^+ \to K^+\mu^+\mu^-) = (4.36 \pm 0.15 \pm 0.18) \times 10^{-7}$ [38], the most accurate measurement to date and in good agreement with the theoretical prediction of $(3.5 \pm 1.2) \times 10^{-7}$ [39]. However, we will use the results from B-factories [40,41], since the nature of an $e^+e^-$ collider makes it easier to predict the boost factor, and it is convenient to use the same experiment to constrain both the prompt and long-lived case. The combined visible decay bound is obtained by first adding statistical and systematic uncertainties for each measurement in quadrature and then combining the measurements in the usual way assuming they are independent unbiased estimators of ${\rm Br}(B^+ \to K^+\mu^+\mu^-)$. A conservative limit on additive new physics is obtained by taking the difference between the low end of SM theoretical predictions, ${\rm Br}(B^+ \to K^+\mu^+\mu^-)_{\rm theory} \gtrsim 2.3 \times 10^{-7}$, and the experimental measurement. In the resulting limit we follow Ref. [31] in taking $l_{xy} \approx 25$ cm as the maximum reconstructed transverse decay distance from the beampipe, and $\gamma\beta \approx m_B/(2m_h)$ is dominated by the energy inherited from the $B$ decay in the region $m_h < 400$ MeV. The resulting bounds are shown in red in Figs. 4 and 5. We do not set limits in the invariant mass regions surrounding $J/\psi$ and $\psi'$ since the experiments vetoed such muons to remove $B \to J/\psi X, \psi' X \to \mu^+\mu^- X$ background. The visible $B$ meson decay bound is stronger than the kaon bound since $K \to \pi h$ is CKM-suppressed compared to $B \to Kh$. In the invisible case this suppression is overcome by the $O(10^{-4})$ stronger bound resulting from a dedicated two-body kaon decay search. These bounds are enough to exclude the Foot & Kobakhidze model for 100 [...]. Visible decay bounds could be stronger if dedicated searches in the dimuon invariant mass spectrum were performed. Such a search was carried out in the $B^0 \to K^{*0}X$ ($K^{*0} \to K^+\pi^-$, $X \to \mu^+\mu^-$) channel at Belle in the region 212 MeV $< m_X <$ 300 MeV [42].
No excess was found and an upper limit on the branching ratio of O(10 −8 ) was set. Using this upper limit and the expression for Br(B → K * h) in Ref. [31] we derive a limit similarly to Eq. 15. This limit is given by the magenta line in Fig. 4.
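The combination of independent branching fraction measurements described above can be sketched as an inverse-variance weighted mean, after adding statistical and systematic errors in quadrature for each measurement. This is one common reading of combining "in the usual way" and is an assumption on our part; the function names and the input numbers in the example are hypothetical.

```python
import math


def total_error(stat, syst):
    """Statistical and systematic uncertainties added in quadrature."""
    return math.hypot(stat, syst)


def combine(measurements):
    """Inverse-variance weighted mean of independent measurements.

    `measurements` is a list of (value, stat_error, syst_error) tuples.
    Returns (combined value, combined uncertainty).
    """
    weights = [1.0 / total_error(stat, syst) ** 2 for _, stat, syst in measurements]
    weight_sum = sum(weights)
    mean = sum(w * value for w, (value, _, _) in zip(weights, measurements)) / weight_sum
    return mean, 1.0 / math.sqrt(weight_sum)


# Hypothetical inputs in units of 1e-7, for illustration only.
value, error = combine([(4.0, 0.3, 0.4), (6.0, 0.3, 0.4)])
print(f"combined: {value:.2f} +/- {error:.2f}")
```

With equal errors the combination reduces to a plain average, and the uncertainty shrinks by $1/\sqrt{2}$, which is a quick sanity check on any implementation.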
LHCb could conceivably improve on the visible $B$ decay bound, though we note that, as bounds reach below the $\sin^2\rho < 10^{-5}$ level, special attention needs to be paid to $h$ lifetime. From Fig. 1 the mean decay length for $h$ in the region above the $\pi\pi$ threshold ranges between $10^{-9}$ cm$/\sin^2\rho$ and $10^{-5}$ cm$/\sin^2\rho$. These mean decay lengths are to be compared with those for $B$ mesons, $c\tau_B \approx 5 \times 10^{-2}$ cm, for which LHCb measures displaced vertices. The lifetime of $h$ is of greater concern at LHCb, where much larger boost factors are expected than at B-factories. We therefore encourage, as did Refs. [8,31,43], a dedicated search for prompt decays covering the whole of the $\mu^+\mu^-$ invariant mass range, but we also recommend a displaced search. We will discuss this possibility further in Sec. IV.
We note in passing that our benchmark particle cannot explain the Σ + → pµ + µ − HyperCP anomaly at m µµ ≈ 214 MeV [44]. Using the Σ → ph width in Ref. [45] we find that to match the measured branching fraction we require sin 2 ρ ≈ 10 −5 -10 −4 . This region is disfavoured by visible B decays and invisible kaon decays. Even so, the lifetime of such a particle along with the expected boost factor of 100-200 suggested by the hyperon momentum gives γβcτ > 13 m, much larger than the longitudinal vertex resolution of about 0.2 m [46]. Additionally, it appears that no value of B SM could resolve the anomaly.

Upsilon decays
Limits also arise from the radiative Υ(nS) → γh decay shown in Fig. 3. The BaBar collaboration has searched in this channel for light bosons decaying to µ + µ − , τ + τ − , hadrons or escaping invisibly [47][48][49][50]. We reproduce the limits from dimuon and ditau decays [47,48] in Fig. 5 in solid blue and solid orange respectively, assuming the QCD correction factor F QCD discussed therein is equal to unity. Ref. [51] discusses limits in light of CLEO data; for masses m h < 2m τ , scalar decays to pions and kaons can be more constraining than decays to muons (see Figure  14 of [51]), though one must keep in mind the significant uncertainties in branching fractions.
$B$ meson decays are easily more constraining for $m_h \lesssim (m_B - m_K) \approx 4.8$ GeV. For $m_h \gtrsim 4.8$ GeV, ditau limits give the best bound on $\sin^2\rho$ since the branching fraction to taus is much larger than that to muons. Even so, as can be seen from the dashed blue and dashed orange lines in Fig. 5, these bounds do not yet challenge the L3 limit of $\sin^2\rho \lesssim 10^{-2}$.

C. Fixed Target
Our scalar h can be produced either directly (through gluon fusion) or indirectly (via meson decays) in fixed target experiments. The dominant process depends on √ s and m h . Meson decays dominate in the experiment we will consider below.
Two important regions of parameter space may be identified for indirect production: below the kaon threshold, $m_h < (m_K - m_\pi) \approx 360$ MeV, where kaon decays dominate, and below the $B$ meson threshold, 360 MeV $\lesssim m_h \lesssim m_B$, where $B$ meson decays dominate. We note that there is a small region where $\eta$ decays can be important, but $D$ meson decays are sufficiently CKM-suppressed to ignore. Some discussion and analysis may be found in Ref. [7].
As an example, following Ref. [7], we look at the bounds set by the CHARM Collaboration [52]. In this experiment, a 400 GeV proton beam was dumped into a thick copper target ($\sqrt{s} \approx \sqrt{2E_p m_p} \approx 27.4$ GeV) and the decay of a long-lived axion to photons, electrons or muons was searched for in a 35 m long decay region placed 480 m from the target. Zero decays were observed.
The total number of scalars intersecting the solid angle covered by the detector, $N_h$, is related to the number of decays in the decay region, $N_{dec}$, through the probability that $h$ decays to a detected final state between distances $L_1$ and $L_2$ from the target, where $\gamma\beta m_h \sim 10$ GeV, $L_1 = L_2 - 35$ m $= 480$ m, and $N_h \approx 2.9 \times 10^{17} \times \sigma_h/\sigma_{\pi^0}$ is normalised to the neutral pion yield [52]. We adopt $\sigma_{\pi^0} \approx \sigma_{pp} M_{pp}/3$, where $M_{pp}$ is the average hadron multiplicity and $\sigma_{pp}$ is the proton-proton cross section [7]. The $h$ production cross section is dominated by kaon decays (see Ref. [14]).
Since the CHARM experiment observed zero decays, we may constrain $N_{dec}$ at 90% C.L. to be less than 2.3 (the solution of $0.1 = \lambda^k e^{-\lambda}/k!\,|_{k=0}$, i.e. $\lambda = \ln 10$). Our result is shown in Fig. 4 by the green curve, with the enclosed region being excluded. Observe that scalar masses 100 MeV $< m_h <$ 280 MeV are ruled out for the Bezrukov & Gorbunov model by this analysis; the $K \to \pi +$ invisible and CHARM bounds also extend this exclusion substantially below 100 MeV, although it is not shown in Fig. 4.
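The zero-event Poisson limit and the geometry of the decay region can be sketched as follows. The upper limit follows directly from $e^{-\lambda} = 1 - {\rm C.L.}$. The expression used for the number of decays in the region is an assumption: it is the standard exponential decay law applied between $L_1$ and $L_2$, and the function names and arguments are illustrative rather than taken from the analysis.

```python
import math


def poisson_upper_limit_zero_observed(confidence_level=0.90):
    """Upper limit on the Poisson mean when zero events are observed."""
    return -math.log(1.0 - confidence_level)


def decays_in_region(n_h, br_visible, gamma_beta, ctau_m, l1_m=480.0, l2_m=515.0):
    """Expected decays between l1_m and l2_m from the target (assumed exponential law).

    n_h        : number of scalars pointing at the detector
    br_visible : branching fraction to the detected final states
    gamma_beta : boost factor p/m of the scalar
    ctau_m     : proper mean decay length in metres
    """
    decay_length_m = gamma_beta * ctau_m
    p_decay = math.exp(-l1_m / decay_length_m) - math.exp(-l2_m / decay_length_m)
    return n_h * br_visible * p_decay


limit = poisson_upper_limit_zero_observed(0.90)
print(f"90% C.L. upper limit on N_dec for zero observed events: {limit:.2f}")
```

A point in parameter space would then be excluded when `decays_in_region(...)` exceeds this limit, which closes the region from above (decays before the detector) and from below (too few scalars produced or decaying).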
The reach of the CHARM experiment is testament to the enormous production cross section of mesons in hadron collisions, as well as the exploitation of the long $h$ lifetime to remove all background. These two points, as we will see, are important for LHC phenomenology when $m_h \lesssim m_B$.
Other beam dump experiments exist which may complement the CHARM bound due to, in particular, differing beam energy and detector position (for a partial [...]). It is beyond the scope of this paper to analyse these experiments in detail. However, we note that it does not appear that any of these experiments has probed the area above the eta meson threshold for $h$, because of insufficient direct or indirect production at given $\sqrt{s}$ (see Figure 30 of Ref. [55] for $B$ meson production rates) and/or the distance to the detector being too great. Ideally, high luminosity (and acceptance) fixed target experiments with energy $\sqrt{s} \gtrsim 20$ GeV and a detector placed at a distance $O(1$-$10$ m$)$ would be needed to probe parameter space below the $B$ decay bound for $m_h \gtrsim 360$ MeV.

IV. LHC
The $H \to hh$ channel may be phenomenologically relevant in models with a very light scalar. If allowed, this channel could produce back-to-back pairs of (possibly displaced) dimuons, for which searches have been carried out by ATLAS/CMS [56], or contribute to the Higgs invisible width if $B_{SM} < 1$. However, the effective couplings $Hhh$ and $Hhhh$ are independent of the parameters $m_h$ and $\sin^2\rho$, i.e. $H \to hh$ decay is not necessarily related to the scenario of a very light scalar mixing with the SM Higgs. For example, in the Foot & Kobakhidze model, $Hhh$ and $Hhhh$ effective couplings are suppressed [4]. Since we are focused on mixing-induced effects parametrised by $(m_h, \sin^2\rho)$ we do not consider this channel further.
For $m_h \lesssim m_B$, the dominant $h$ production mechanism at the LHC is via the production and decay of mesons. The $B\bar{B}$ cross section in 7 TeV (8/13 TeV) $pp$ collisions has been calculated as $\approx 2.5 \times 10^{11}$ fb ($\approx 3/10 \times 10^{11}$ fb) [55]. Then, for example, using Eq. 19, at $\sin^2\rho = 10^{-6}$ and $\sqrt{s} = 7/8$ TeV the $h$ production cross section is $\sim 10^6$ fb, to be compared with $\sim 1$ fb through gluon fusion. This is also an area of parameter space where $h$ lifetime becomes non-negligible. In the following subsection we determine the differential $p_T$ spectrum for scalars originating from $B$ mesons at ATLAS/CMS and LHCb.
We then show that this will result in up to thousands of moderate (triggerable) p T displaced decays in unexplored parameter space using the existing dataset. We note that for m h < m K we also expect production via kaon decays. We ignore this area since, in our benchmark model, it has been explored by CHARM (see Sec. III C). Below the CHARM limit the lifetime becomes long enough so that the majority of moderate p T scalars would escape the detector. The situation may be different in models with B SM < 1, since the lifetime becomes shorter, though one must take into account non-negligible kaon lifetime.
For $m_h \gtrsim m_B$, $h$ is dominantly produced in the ways made familiar by the SM Higgs: gluon fusion, vector boson fusion, $Vh$, and $t\bar{t}h$. Table I shows the production cross sections for an example scalar of mass 5 GeV and $\sin^2\rho = 1$. Cross sections were obtained using the HiggsEffective model in MadGraph/MadEvent5 v1.5.9 [57] equipped with CTEQ6L1 parton distribution functions [58], except in the case of gluon fusion, where we used the expression for $d\sigma/dy$ from Ref. [12] (Eq. 20), in which $g_p(x, Q^2)$ is the gluon distribution function in the proton evaluated at momentum fraction $x$ and scale $Q^2$, and we integrated over all possible rapidities $y$ using CTEQ5M parton distribution functions [59].$^3$ Gluon fusion is dominant, but $Vh$ production is comparable. Such associated production is important from an experimental point of view; trigger limitations and backgrounds affect the gluon fusion channel much more than $Vh$ or $t\bar{t}h$. In the final subsection we demonstrate that the $Vh$ channel is in fact the most sensitive search at the LHC for $m_h \gtrsim m_B$.
A. $m_h \lesssim m_B$

Production via B decays
We developed an in-house simulation to calculate the differential cross section $d\sigma_h/dp_T$ for scalars from $B$ decays, given $d\sigma_B/dp_T$ and $d\sigma_B/dy$ for $B$ mesons in $\sqrt{s} = 7$ TeV $pp$ collisions at ATLAS/CMS and at LHCb. It works in the following way. Within loops over $p_T^B$ and $y^B$, there is a loop simulating $N_{dec}$ isotropic $B$ decays to $h$, which are then boosted from the $B$ frame to the lab frame given $p_T^B$ and $y^B$. Scalars that fall outside the angular acceptance are added to a rejection bin of the histogram; otherwise $p_T^h$ is measured and we add $f(p_T^B) f(y^B)/N_{dec}$ to the appropriate $p_T$ bin, where $f(p_T^B)$ and $f(y)$ define the discrete probability distributions for the transverse momentum and rapidity of the $B$ meson. The histogram (which should now have unit area) is then normalised to ${\rm Br}(B \to h + X) \times \int (d\sigma_B/dp_T)\, dp_T$. We infer $f(p_T^B)$ from the fixed-order-next-to-leading-logarithm (FONLL) predictions in Refs. [60][61][62]. This amounts to creating a probability density function by normalising $d\sigma_B/dp_T$ to unity over a chosen $p_T$ range and then discretising to allow for numerical integration. The $d\sigma_B/dp_T$ distributions used are reproduced in Fig. 6. We interpolate $f(y)$ for ATLAS/CMS from the FONLL prediction in Figure 6 of Ref. [61], and for LHCb from the experimental measurements in Figure 4 of Ref. [62]. We make the approximations that $f(y)$ is independent of $p_T$, that $|\vec{p}_h|$ in the $B$ frame is equal to that from $B \to Kh$ decay, and that the decay of the $B$ meson is prompt.

$^3$ MadGraph/MadEvent5 returns a value for gluon fusion of 670 pb in the $\sqrt{s} = 7$ TeV case, but breaks at $\sqrt{s} = 13$ TeV.
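The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the simulation used for the results: the function names, the discrete inputs `f_pt` and `f_y`, the bin edges, and the default pseudorapidity cut are all hypothetical, and overflow beyond the last bin is simply counted with the rejected scalars.

```python
import bisect
import math
import random

M_B = 5.279  # GeV, B+ mass
M_K = 0.494  # GeV, K+ mass


def two_body_momentum(m_parent, m1, m2):
    """Momentum of either daughter in the parent rest frame (GeV)."""
    term = (m_parent**2 - (m1 + m2) ** 2) * (m_parent**2 - (m1 - m2) ** 2)
    return math.sqrt(term) / (2.0 * m_parent)


def sample_h(pt_b, y_b, m_h, rng):
    """Draw one isotropic B -> K h decay; return (pT, eta) of h in the lab frame."""
    p_star = two_body_momentum(M_B, M_K, m_h)
    e_star = math.hypot(p_star, m_h)
    cos_t = rng.uniform(-1.0, 1.0)
    sin_t = math.sqrt(1.0 - cos_t**2)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    px = p_star * sin_t * math.cos(phi)
    py = p_star * sin_t * math.sin(phi)
    pz = p_star * cos_t
    # B four-momentum, taking the B transverse direction along x.
    m_t = math.hypot(M_B, pt_b)
    bx, bz, e_b = pt_b, m_t * math.sinh(y_b), m_t * math.cosh(y_b)
    # Lorentz boost of the h three-momentum from the B rest frame to the lab frame.
    k = (px * bx + pz * bz) / (M_B * (e_b + M_B)) + e_star / M_B
    px, pz = px + k * bx, pz + k * bz
    pt = math.hypot(px, py)
    eta = math.asinh(pz / pt) if pt > 0.0 else math.copysign(float("inf"), pz)
    return pt, eta


def pt_spectrum(f_pt, f_y, m_h, bin_edges, eta_max=2.4, n_dec=1000, seed=1):
    """Unit-normalised histogram of h pT plus the rejected fraction.

    f_pt, f_y : lists of (value, probability) for the B meson pT and rapidity,
                each assumed to sum to one.
    """
    rng = random.Random(seed)
    hist = [0.0] * (len(bin_edges) - 1)
    rejected = 0.0
    for pt_b, w_pt in f_pt:
        for y_b, w_y in f_y:
            weight = w_pt * w_y / n_dec
            for _ in range(n_dec):
                pt, eta = sample_h(pt_b, y_b, m_h, rng)
                idx = bisect.bisect_right(bin_edges, pt) - 1
                if abs(eta) > eta_max or not 0 <= idx < len(hist):
                    rejected += weight  # outside acceptance or histogram range
                else:
                    hist[idx] += weight
    return hist, rejected
```

Multiplying the returned histogram by ${\rm Br}(B \to h + X)$ times the integrated $B$ cross section would then give $d\sigma_h/dp_T$ per bin, in the spirit of the normalisation step in the text.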
Our results are shown in Fig. 6. Note that we have only considered $B^+$ decays; results for $B^-$ would be identical, and for $B^0/\bar{B}^0$ would be very similar, so that the total $h$ cross section from $B$ meson decay gains a factor $\approx 4$. For larger $m_h$ the $p_T$ tail falls more slowly because $h$ is produced at lower velocity in the $B$ frame and therefore tends to follow the direction of the $B$ meson. The overall cross section also falls due to kinematic suppression in ${\rm Br}(B \to h + X)$.

With the information that is available to us, we are limited to using $B$ mesons within a certain $p_T$ range and within rapidities that would be accepted at ATLAS/CMS or at LHCb. These limitations are written in Fig. 6 for clarity. Consequently, values of $d\sigma/dp_T$ in the LHCb case for $p_T \lesssim m_B$ are an underestimate, since smaller rapidity $B$ mesons will contribute. Otherwise we believe our results are a very good approximation. Ideally, one would loop over the entire range of allowed $B$ momentum and rapidity using a complete $d^2\sigma_B/dp_T^B dy^B$ prediction.

The point to be gleaned from the distributions in Fig. 6 [...] GeV, $c\tau$ ranges between $10^{-9}$ cm$/\sin^2\rho$ and $10^{-5}$ cm$/\sin^2\rho$, to be compared with $\approx 5 \times 10^{-2}$ cm for $B$ mesons, which produce measurably displaced vertices at the LHC. Therefore, we expect a substantial region of parameter space with $\sin^2\rho < 10^{-5}$ for which this model predicts many low-background displaced decays. It is this possibility that we pursue presently.

Displaced decays
The precedent for searches for displaced decays of light particles has already been set. ATLAS has performed a search for approximately back-to-back collimated dimuons originating from a 400 MeV particle decaying outside the inner detector but within the muon spectrometer, i.e. with transverse distance from the beamline 1 m $\lesssim L_{xy} \lesssim$ 7 m [56]. Prompt muon background is heavily suppressed (there is almost zero background) by requiring a lack of tracks in the inner detector within a cone surrounding the direction of the muon jet. No events are observed in 1.9 fb$^{-1}$ of data at $\sqrt{s} = 7$ TeV.
Such a search might be applied to $h$ to probe $H \to hh$ decays. However, motivated by the above analysis, we will consider the signature of an inclusive displaced muon pair. If we require the decay of $h$ to occur within transverse distance 1 m $< L_{xy} <$ 7 m then we expect a very low background. Making the approximation $p \approx E$ ($\beta \approx 1$), the probability $P_{dec}$ that a particle of mass $m$ will decay with transverse distance $L_1 < L_{xy} < L_2$ from the beamline is given by the difference of the exponential survival probabilities at $L_1$ and $L_2$. Note here that $c\tau$ is inversely proportional to $\sin^2\rho$ as in Eq. 4. As discussed in Sec. II, the lifetime and branching fractions in the $2m_\pi < m_h < m_B$ region have a large uncertainty. With this in mind, in the following we evaluate $c\tau_{SM}$ using Ref. [19] for $m_h < 1.4$ GeV and the perturbative approach of Eq. 6 otherwise.
To estimate the reach of ATLAS/CMS we require the $h$ (dimuon) transverse momentum to satisfy $p_T^h > 8$ GeV. The cross section of displaced $h$ decays with $|\eta_h| < 2.4$ can then be obtained from Fig. 6 by

$$\sigma_{\rm displaced} = \int_{p_T > 8\ {\rm GeV}} P_{dec}(p_T)\, \frac{d\sigma_h}{dp_T}\, dp_T .$$
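The window in $\sin^2\rho$ that results from folding the decay probability with the $p_T$ spectrum can be illustrated with a short Python sketch. It assumes the exponential decay law with mean transverse decay length $p_T c\tau/m$; the function names and the toy spectrum used in the example are hypothetical and stand in for the distributions of Fig. 6.

```python
import math


def prob_decay_between(pt_gev, m_gev, ctau_cm, l1_cm=100.0, l2_cm=700.0):
    """Probability of decaying at transverse distance between l1_cm and l2_cm."""
    mean_lxy_cm = pt_gev * ctau_cm / m_gev  # transverse boost p_T/m times c*tau
    return math.exp(-l1_cm / mean_lxy_cm) - math.exp(-l2_cm / mean_lxy_cm)


def displaced_cross_section(pt_centres, dsigma_dpt_sm, bin_width, m_gev,
                            ctau_sm_cm, sin2_rho, pt_min=8.0):
    """Sum P_dec * dsigma/dpT over bins above pt_min.

    dsigma_dpt_sm is the spectrum for sin2_rho = 1; production scales as sin2_rho
    and the decay length as 1/sin2_rho.
    """
    ctau_cm = ctau_sm_cm / sin2_rho
    total = 0.0
    for pt, dsigma in zip(pt_centres, dsigma_dpt_sm):
        if pt >= pt_min:
            total += prob_decay_between(pt, m_gev, ctau_cm) * sin2_rho * dsigma * bin_width
    return total


# Toy falling spectrum (arbitrary units), for illustration only.
pt_centres = [9.0, 11.0, 13.0]
toy_spectrum = [3.0, 2.0, 1.0]
for s2 in (1e-10, 1e-7, 1e-5):
    sigma = displaced_cross_section(pt_centres, toy_spectrum, 2.0, 0.5, 1e-6, s2)
    print(f"sin2_rho = {s2:.0e}: displaced cross section = {sigma:.3e}")
```

With these toy inputs the intermediate mixing gives by far the largest displaced yield: at larger mixing the scalars decay before reaching 1 m, while at smaller mixing both the production rate and the decay probability inside the detector fall.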
This will give a slight overestimate (by no more than a factor of 2) for the number of possibly observable displaced decays, since the requirements $|\eta_h| < 2.4$ and 1 m $\lesssim L_{xy} \lesssim$ 7 m are not enough to ensure that the decay occurs within the detector volume. We therefore err on the conservative side by restricting $h$ to the central region $|\eta_h| < 1.3$. This also ensures that the muons are created before the Level-1 muon trigger at ATLAS. Consequently, our results are an underestimate of the number of decays occurring within the detector volume. The cross sections for three example masses are shown in Fig. 7. As $\sin^2\rho$ gets smaller, the tuning of the mean decay length to $c\tau \sim 1$-$100$ cm to maximise $P_{dec}$ plays off against the falling cross section to create a window in unexplored parameter space where the number of displaced decays can be significant.
If one wishes to search for displaced dimuons then the cross section in Fig. 7 must be scaled by $4 \times {\rm Br}(h \to \mu^+\mu^-)$. In Fig. 8 we show contours of the number of expected displaced dimuon events in 20 fb$^{-1}$ of data at $\sqrt{s} = 7$ TeV. This plot serves to indicate the reach of the ATLAS/CMS 8 TeV dataset, for which we cannot generate the corresponding figure in the absence of a $d\sigma_B/dp_T$ distribution for $\sqrt{s} = 8$ TeV $pp$ collisions.
For example, at m h ≈ 500 MeV and sin 2 ρ ≈ 10 −7 , we predict (before efficiency factors) greater than 4 × 10 3 displaced collimated dimuons with p µµ T > 8 GeV. This scenario is consistent with the prediction of Bezrukov & Gorbunov, shown in Fig. 8 as a dashed line. Notice that the area of parameter space which ATLAS/CMS is most sensitive to coincides with this line, meaning that the model may be extensively probed for scalar masses above the existing m h > 280 MeV limit.
In principle, a similar search could be performed at LHCb, although there exists no precedent. In fact, sensitivity is likely to be even better for a few reasons: smaller dimuon transverse momenta ($p_T^{\mu\mu} \approx 1$ GeV) may be probed, the muon detection system extends to 19 m beyond the interaction point, and the vertex locator, with excellent reconstruction capabilities, might allow for probing of decays closer than 1 m. One drawback, however, is less integrated luminosity.
A dedicated study incorporating proper acceptance, trigger/reconstruction efficiency and backgrounds is desirable to say more about the reach of the LHC. We have required $h$ to fall within the central region, but this does not guarantee that each muon will have $|\eta_\mu| < 2.4$. At least for lower $h$ masses, where the decay products will be collimated, this is a good assumption. Lower $m_h$ is also where we expect the efficiency to be highest, since efficiency falls with muon impact parameter and here the collimated muons will point back along the $h$ direction to the $B$ decay point. SM backgrounds can only arise from neutral particles with lifetimes in the range $c\tau \sim 1$-$100$ cm. Of note are $K_S^0$ mesons ($c\tau_{K_S^0} \approx 2.7$ cm) decaying to pions which may fake muons with $m_{\mu\mu} \approx 500$ MeV either through decays-in-flight or punching through the calorimeters; such background appears to be well modelled by Monte-Carlo [63]. Neutral strange baryons $\Xi^0$ and $\Lambda^0$ with masses 1.3 GeV and 1.1 GeV respectively are the only other neutral SM particles with lifetimes in this range; it is not obvious how their decays could fake a $\mu^+\mu^-$ vertex. Therefore, at least for $m_h \gtrsim 500$ MeV, the background is expected to be very low so that even a few events, particularly since they will occur at the same dimuon invariant mass, may be significant. If necessary, SM background can always be suppressed by requiring $h$ to decay outside the hadronic calorimeter, 3 m $\lesssim L_{xy} \lesssim$ 7 m. In this regime one could also consider complementary signatures of decays to charged objects such as hadrons or $\tau^+\tau^-$ that might be picked up by the muon spectrometer. Further analysis is beyond the scope of this paper.
We have shown that with the existing dataset the LHC can (modulo efficiency factors) explore new parameter space by searching for displaced dimuons. Ultimately, knowledge of the exact excluded parameter space region is limited by the uncertainties in lifetime and branching fractions described in Sec. II; we therefore encourage theorists to revisit that problem.
B. $m_h \gtrsim m_B$

Inclusive dimuon search
Both ATLAS and CMS have performed a search for a light pseudoscalar, a, produced via gluon fusion and decaying to two muons [64,65]. The CMS search analysed the mass range between 5.5 and 8.8 GeV and between 11.5 and 14 GeV, avoiding the Υ resonances. They provide a 95% C.L. upper limit on σ(pp → a) × Br(a → µ + µ − ).
The production cross section of h through the gluon fusion mechanism is given by Eq. 20. To constrain h we assume that the acceptance in the CMS analysis is the same for our scalar as for the pseudoscalar, and consider only the dominant production of h by gluon fusion. We then apply the σ(pp → a) × Br(a → µ + µ − ) limit, evaluating Eq. 20 by integrating over all possible rapidities using CTEQ5M parton distribution functions [59]. The result is the magenta line shown in Fig. 5. The limit on sin 2 ρ × Br(h → µ + µ − ) competes well with that from upsilon decays, but is far from that of LEP. The 1.3 fb −1 of data analysed by CMS was collected with the opposite-sign dimuon trigger, requiring p µµ T > 6 GeV and m µµ > 5.5 GeV with a prescale factor of 2. These low p T , low invariant mass dimuons are evidently plentiful at the LHC. Thus, as the luminosity and centre-of-mass energy are increased the trigger thresholds and/or the prescale factor must increase. In short, we are background-restricted and trigger-restricted in the region that maximises signal.
So what happens if we demand high dimuon p T , so as to minimise background and avoid trigger-dependence? CMS have performed a search for light resonances in the dimuon spectrum with 35 pb −1 of data collected at √ s = 7 TeV [56]. At m h = 5 GeV, they set a 95% C.L. limit of α ideal × σ(pp → h + X) × Br(h → µ + µ − ) < 0.1 pb, where α ideal is an acceptance factor calculated in an event generator of one's choice by requiring […]. Using MadGraph/MadEvent5 we found α ideal ≈ 1.1 × 10 −3 for m h = 5 GeV; it is broken down by channel in Table I. For the gluon fusion channel we simulated gg → gh at parton level, the hard gluon being needed to give h the necessary p T . Interestingly, every channel contributes comparable amounts to the result of […]. Assuming that the bound will scale as ∼ 1/ √ N , with 100 times more data (comparable in size to the CMS pseudoscalar search) we expect a bound of O(10 −2 ). Therefore we have not gained anything on the pseudoscalar search bound by requiring high dimuon p T . This is not surprising, since both the background and the dominant gluon fusion production mechanism have muons recoiling only against initial-state radiation, so that acceptance falls quickly with p µµ T ; this is reflected by the small value of α ideal for gluon fusion in Table I.
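The luminosity extrapolation used above can be written out explicitly. This is only a sketch of the stated ∼ 1/√N assumption (a background-dominated search with unchanged cuts and efficiencies); the symbol b(N) for the upper limit obtained with N units of integrated luminosity is introduced here for illustration and is not notation from the text.

```latex
\[
  \frac{b(N')}{b(N)} \sim \sqrt{\frac{N}{N'}} ,
  \qquad
  \frac{N'}{N} = 100
  \;\Longrightarrow\;
  b(N') \sim \frac{b(N)}{10} .
\]
```

Under this assumption, 100 times the 35 pb −1 sample (3.5 fb −1 ) improves the limit by one order of magnitude, so the quoted expectation of O(10 −2 ) would correspond to a present bound of roughly O(10 −1 ).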
This leads us to consider instead triggering on associated activity so that some background is removed and we may probe lower p T muons from the h decay. In the next section, we demonstrate that bounds using the W h channel, triggering on a high p T lepton from the W decay, are potentially stronger than the bounds obtained from an inclusive dimuon search.

Associated search
There are three associated search possibilities: W h, Zh, and tth. In this section we consider the W h → W µ + µ − channel. Because it is in general difficult (and not just for us) to model the combinatoric background, we appeal to the results of experiment. ATLAS has performed a search in 4.6 fb −1 of √ s = 7 TeV data for J/ψ mesons produced in association with a W boson, where both decay muonically [66]. The search amounts to a measurement of the "bump size" in the dimuon invariant mass spectrum around the J/ψ mass of 3.1 GeV; they search in the region 2.5 GeV < m µµ < 3.5 GeV. If h exists in this region we would expect to see a bump above the combinatoric background. We aim to estimate the reach of a W h → µνµ + µ − search using the background distribution therein.
We generate W h (W → µν, h → µ + µ − ) parton-level events in √ s = 7 TeV pp collisions for a scalar of mass 2.7 GeV with SM Higgs couplings using the HiggsEffective model in MadGraph/MadEvent5. We applied the following cuts to match those in Ref. [66]: […], where the muons are ordered by p T . We subsequently applied the following intermediate state cuts (which made little difference): […]. The results allow us to estimate the number of signal events in 4.6 fb −1 of data as ≈ 1 × 10 4 × sin 2 ρ × Br(h → µ + µ − ). We take the combinatoric background and the number of observed events from Figure 2 of Ref. [66], restricting ourselves to the regions 2.50 GeV < m µµ < 2.94 GeV and 3.28 GeV < m µµ < 3.50 GeV to avoid the J/ψ peak, since the peak is fitted to the data in this region. The signal is modelled as a Gaussian with width 50 MeV and mean m h .
FIG. 9. The obtained and expected 90% C.L. upper limit on sin 2 ρ × Br(h → µ + µ − ) from the W h channel using 4.6 fb −1 of √ s = 7 TeV data from ATLAS. Variance of the expected limit is statistical only. Also shown is an approximation of the expected limit using the 8 TeV dataset (see text) and the limit from Υ → γh → γµ + µ − decays.
Let µ b and µ s be the vectors representing the expected number of background events and the expected number of signal events in k bins. Let y be the data vector. If we normalise µ s to one event, then λµ s represents a signal bump with λ total events. The likelihood of the data is […]. Bayes' theorem relates this likelihood to our degree of belief in λ: […], where π is the prior distribution for λ. If we take a flat prior, then the 90% C.L. upper limit on λ, λ UL , is found by solving […], where the likelihood L has been normalised so that ∫_0^∞ L(y|λ) dλ = 1. The 90% C.L. upper limit on sin 2 ρ × Br(h → µ + µ − ) is then simply 10 −4 λ UL . We have performed this analysis for a signal centred on each of 36 m h values spread 20 MeV apart.
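The procedure described above can be sketched numerically. The following is a minimal illustration, not the code used for the paper. It assumes that the per-bin likelihood is Poisson with mean µ b + λµ s (the standard choice for binned counts), that the posterior is integrated on a uniform grid, and that the grid ceiling `lam_max` is large enough for the posterior to be negligible beyond it. The function names and the toy inputs are hypothetical.

```python
import numpy as np

def log_likelihood(lam, y, mu_b, mu_s):
    """Binned Poisson log-likelihood for each value in `lam`.

    ASSUMPTION: independent Poisson bins with mean mu_b + lam * mu_s.
    Terms that do not depend on lam (the y! factors) are dropped.
    """
    lam = np.atleast_1d(lam)
    mu = mu_b[None, :] + lam[:, None] * mu_s[None, :]  # shape (n_lam, k)
    return (y[None, :] * np.log(mu) - mu).sum(axis=1)

def upper_limit(y, mu_b, mu_s, cl=0.90, lam_max=200.0, n_grid=20001):
    """Flat-prior Bayesian upper limit on the signal yield lambda.

    With a flat prior on lam >= 0 the posterior is proportional to the
    likelihood, so the limit is the lam at which the normalised cumulative
    likelihood first reaches `cl`.
    """
    lam = np.linspace(0.0, lam_max, n_grid)
    log_l = log_likelihood(lam, y, mu_b, mu_s)
    post = np.exp(log_l - log_l.max())  # unnormalised posterior, overflow-safe
    cdf = np.cumsum(post)
    cdf /= cdf[-1]                      # normalise to unit total probability
    return lam[np.searchsorted(cdf, cl)]

if __name__ == "__main__":
    # Toy inputs only (hypothetical): three bins, flat background,
    # signal template normalised to one event.
    mu_b = np.array([10.0, 10.0, 10.0])
    mu_s = np.array([0.2, 0.6, 0.2])
    y = np.array([9.0, 12.0, 10.0])
    lam_ul = upper_limit(y, mu_b, mu_s)
    # Conversion quoted in the text: limit on sin^2(rho) x Br = 1e-4 * lambda_UL
    print("lambda_UL =", lam_ul, " limit =", 1e-4 * lam_ul)
```

For bins with large background counts, `lam_max` would need to be raised so that the grid covers the full posterior. The background expectation must be strictly positive in every bin for the logarithm to be defined.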
The obtained upper limit is given by the red line in Fig. 9. An expected (±1σ/ ± 2σ stat.) limit was derived by performing the above analysis on 10 3 pseudodatasets generated assuming the background-only hypothesis, ordering them by the obtained λ UL , and taking entry 500 (entries 841 and 159 / 977 and 23 for the bands), shown by the dashed line and bands in Fig. 9. We also show the expected limit for the case with five times the data, which serves as an approximation for the reach of the 8 TeV dataset. One can see that the limit of O(10 −3 ) is better than that set by radiative upsilon decays. A similar limit would be expected for m h > m B , potentially setting the best LHC limit on sin 2 ρ × Br(h → µ + µ − ) in that region. However, as is evident from Fig. 5, it would still be two orders of magnitude weaker than the L3 limit.
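The entry numbers quoted above are consistent with reading the median and the ±1σ/±2σ bands off the sorted list of pseudodataset limits at Gaussian quantiles. A short sketch of that interpretation follows; it is an assumption about how the entries were chosen, and the function names are hypothetical.

```python
import math

def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def band_ranks(n_toys=1000):
    """Position in the sorted list of limits for z = -2, -1, 0, +1, +2 sigma.

    ASSUMPTION: the band edges are the entries nearest to n_toys * Phi(z).
    """
    return {z: round(n_toys * gaussian_cdf(z)) for z in (-2, -1, 0, 1, 2)}

if __name__ == "__main__":
    print(band_ranks())
```

With 1000 pseudodatasets this gives entries 23, 159, 500, 841 and 977, matching the numbers in the text.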
We note that the sensitivity of a Zh search, where both the Z and h decay muonically, is expected to be higher because the extra lepton would help to remove combinatoric background. In the future, a search for the production of prompt J/ψ mesons in association with a Z boson may allow the above analysis to be repeated. The reach of the 13 TeV run is not clear because we do not know the combinatoric background, but one could speculate that more data and higher sensitivity in the Zh channel may be enough to compete with LEP bounds of O(10 −5 ).

V. CONCLUSION
Motivated by scale invariant and inflationary models, we investigated the phenomenology of a very light scalar, h, with mass 100 MeV < m h < 10 GeV, mixing with the SM Higgs. As a benchmark model we took the real singlet scalar extension of the SM and explored (m h , sin 2 ρ) parameter space, where ρ is the mixing angle.
At the LHC we identified two phenomenologically distinct regions of parameter space. For m h ≲ m B , h is dominantly produced via the decay of B mesons, with a rate ∼ 10 6 times larger than gluon fusion. In regions of parameter space where h decays promptly, sin 2 ρ ≳ 10 −5 , LHCb could set the best limits by searching for resonances in the B → Kµ + µ − dimuon invariant mass spectrum. In the region sin 2 ρ ≲ 10 −5 , the h lifetime is non-negligible. We investigated the possibility of searching for displaced dimuons at ATLAS/CMS, showing that, in unexplored parameter space coinciding with the model of Bezrukov & Gorbunov [7], more than 10 3 signal events (before efficiency factors) could be in the existing 8 TeV dataset (see Fig. 8). By requiring the muons to exhibit no track in the inner detector we expect this search to be almost background-free. This motivates a search for inclusive displaced dimuons at ATLAS/CMS and/or LHCb.
For m h ≳ m B we demonstrated that the subdominant V h production channel has the best sensitivity at ATLAS/CMS. Bounds from the W h channel using 4.6 fb −1 of √ s = 7 TeV data were found to be sin 2 ρ × Br(h → µ + µ − ) ≲ 10 −3 in the region 2.5 GeV < m h < 3.5 GeV (see Fig. 9). This limit is stronger than that from upsilon decays, and is expected to extend into the m h > m B region if the analysis were performed there. Such a bound would still be about two orders of magnitude weaker than that of LEP1. We expect that the Zh channel would provide better sensitivity and it is conceivable that future LHC bounds could compete with those of LEP1, with the main uncertainty being knowledge of the combinatoric background at √ s = 13 TeV.
In Sec. II we highlighted apparently unresolved uncertainties in the branching ratios and lifetime of h in the region 280 MeV < m h ≲ 4 GeV. This is the region that exhibits interesting displaced LHC phenomenology. The most recent paper dedicated to this subject that we are aware of is over twenty years old; we therefore recommend that the theory community revisit the problem.
Lastly, we note that a similar analysis could be performed for a very light scalar mixing with the SM Higgs and also decaying to hidden states, by appropriately scaling parameters as described in the Introduction.
Note added: After completion of this paper, Ref. [67] appeared on the arXiv, which has some overlap with Sec. III B.

ACKNOWLEDGMENTS
This work was supported in part by the Australian Research Council. JDC would like to thank Tony Limosani for experimental input on the tth channel (that unfortunately did not end up appearing in this paper) and the suggestion of using Ref. [66] to estimate background in the V h channel, as well as Evgueni Goudzovski for input on the NA48/2 kaon decay bound.