1 Introduction

The decay \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) receives contributions from short-distance \({{b}} \!\rightarrow {{s}} {\ell ^+} {\ell ^-} \) flavour-changing neutral-current (FCNC) transitions and long-distance contributions from intermediate hadronic resonances. In the Standard Model (SM), FCNC transitions are forbidden at tree level and must occur via a loop-level process. In many extensions of the SM, new particles can contribute to the amplitude of the \({{b}} \!\rightarrow {{s}} {\ell ^+} {\ell ^-} \) process changing the rate of the decay or the distribution of the final-state particles. Decays like \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) are therefore sensitive probes of physics beyond the SM.

Recent global analyses of measurements involving \({{b}} \!\rightarrow {{s}} {\ell ^+} {\ell ^-} \) processes report deviations from SM predictions at the level of four standard deviations [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]. These differences could be explained by new short-distance contributions from non-SM particles [1,2,3,4,5, 12, 16] or could indicate a problem with existing SM predictions [13, 15, 17]. To explain the observed tensions, long-distance effects would need to be sizeable in dimuon mass regions far from the pole masses of the resonances. Therefore, it is important to understand how well these long-distance effects are modelled in the SM and how they interfere with the short-distance contributions. Previous measurements of \({{b}} \!\rightarrow {{s}} {\ell ^+} {\ell ^-} \) processes [18,19,20,21,22,23] excluded regions of dimuon mass around the \(\phi \), \({{J}/\psi }\) and \(\psi {(2S)}\) resonances. The amplitude in these mass regions is dominated by the narrow vector resonances and has a large theoretical uncertainty. These dimuon regions are therefore considered insensitive to new physics effects.

In this paper, a first measurement of the phase difference between the contributions to the short-distance and the narrow-resonance amplitudes in the \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decay is presented. For the first time, the branching fraction of the short-distance component is determined without interpolation across the \({{J}/\psi }\) and \(\psi {(2S)}\) regions. The measurement is performed through a fit to the full dimuon mass spectrum, \(m_{\mu \mu }\), using a model describing the vector resonances as a sum of relativistic Breit–Wigner amplitudes. This approach is similar to that of Refs. [13, 24], with the difference that the magnitudes and phases of the resonant amplitudes are determined using the LHCb data rather than using the external information on the cross-section for \({{e} ^+} {{e} ^-} \!\rightarrow \mathrm{hadrons}\) from the BES collaboration [25]. The model includes the \(\rho \), \(\omega \), \(\phi \), \({{J}/\psi }\) and \(\psi {(2S)}\) resonances, as well as broad charmonium states (\(\psi (3770)\), \(\psi (4040)\), \(\psi (4160)\) and \(\psi (4415)\)) above the open charm threshold. Evidence for the \(\psi (4160)\) resonance in the dimuon spectrum of \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decays has been previously reported by LHCb in Ref. [26]. The continuum of broad states with pole masses above the maximum \(m_{\mu \mu }\) value allowed in the decay is neglected.

The measurement presented in this paper is performed using a data set corresponding to 3\(\mathrm{\,fb}^{-1}\) of integrated luminosity collected by the LHCb experiment in pp collisions during 2011 and 2012 at \(\sqrt{s}\) = 7 TeV and 8 TeV. The paper is organised as follows: Section 2 describes the LHCb detector and the procedure used to generate simulated events; the reconstruction and selection of \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decays are described in Sect. 3; Section 4 describes the \(m_{\mu \mu }\) distribution of \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decays, including the model for the various resonances appearing in the dimuon mass spectrum; the fit procedure to the dimuon mass spectrum, including the methods to correct for the detection and selection biases, is discussed in Sect. 5. The results and associated systematic uncertainties are discussed in Sects. 6 and 7. Finally, conclusions are presented in Sect. 8.

2 Detector and simulation

The LHCb detector [27, 28] is a single-arm forward spectrometer, covering the pseudorapidity range \(2<\eta <5\), designed to study the production and decay of particles containing \({b} \) or \({c} \) quarks. The detector includes a high-precision tracking system divided into three subsystems: a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector that is located upstream of a dipole magnet with a bending power of about \(4{\mathrm {\,Tm}}\), and three stations of silicon-strip detectors and straw drift tubes situated downstream of the magnet. The tracking system provides a measurement of the momentum, \(p\), of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200\({\mathrm {\,GeV/}c}\). The momentum scale of tracks in the data is calibrated using the \({{B} ^+} \) and \({{J}/\psi }\) masses measured in \({{{B} ^+}} \!\rightarrow {{{J}/\psi }} {{{K}} ^+} \) decays [29]. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of \((15+29/p_{\mathrm { T}}){\,\upmu \mathrm {m}} \), where \(p_{\mathrm { T}}\) is the component of the momentum transverse to the beam, in \({\mathrm {\,GeV/}c}\). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors (RICH). Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The online event selection is performed by a trigger [30], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.

A large sample of simulated events is used to determine the effect of the detector geometry, trigger, and selection criteria on the dimuon mass distribution of the \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decay. In the simulation, pp collisions are generated using Pythia 8 [31, 32] with a specific LHCb configuration [33]. The decay of the \({{B} ^+} \) meson is described by EvtGen [34], which generates final-state radiation using Photos [35]. As described in Ref. [36], the Geant4 toolkit [37, 38] is used to implement the interaction of the generated particles with the detector and its response. Data-driven corrections are applied to the simulation following the procedure of Ref. [23]. These corrections account for the small level of mismodelling of the detector occupancy, the \({{{B} ^+}} \) momentum and vertex quality, and the particle identification (PID) performance. The momentum of every reconstructed track in the simulation is also smeared by a small amount in order to better match the mass resolution of the data.

3 Selection of signal candidates

In the trigger for the 7 TeV (8 TeV) data, at least one of the muons is required to have \(p_{\mathrm { T}} >1.48{\mathrm {\,GeV/}c} \) (\(p_{\mathrm { T}} > 1.76{\mathrm {\,GeV/}c} \)) and one of the final-state particles is required to have both \(p_{\mathrm { T}} >1.4{\mathrm {\,GeV/}c} \) (\(p_{\mathrm { T}} >1.6{\mathrm {\,GeV/}c} \)) and an \(\mathrm{IP} > 100{\,\upmu \mathrm {m}} \) with respect to all PVs in the event; if this final-state particle is identified as a muon, \(p_{\mathrm { T}} > 1.0{\mathrm {\,GeV/}c} \) is required instead. Finally, the tracks of two or more of the final-state particles are required to form a vertex that is significantly displaced from all PVs.

In the offline selection, signal candidates are built from a pair of oppositely charged tracks that are identified as muons. The muon pair is then combined with a charged track that is identified as a kaon by the RICH detectors. The signal candidates are required to pass a set of loose preselection requirements that are identical to those described in Ref. [26]. These requirements exploit the decay topology of \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) transitions and restrict the data sample to candidates with good-quality vertex and track fits. Candidates are required to have a reconstructed \({{{K}} ^+} {\mu ^+\mu ^-} \) mass, \(m_{K\mu \mu } \), in the range \(5100<m_{K\mu \mu }<6500{\mathrm {\,MeV\!/}c^2} \).

Combinatorial background, where particles from different decays are mistakenly combined, is further suppressed with the use of a Boosted Decision Tree (BDT) [39, 40] using kinematic and geometric information. The BDT is identical to that described in Ref. [26] and uses the same working point. The efficiency of the BDT for signal is uniform with respect to \(m_{K\mu \mu }\).

Specific background processes can mimic the signal if their final states are misidentified or partially reconstructed. The requirements described in Ref. [26] reduce the overall contribution of the background from such decay processes to a level of less than 1% of the expected signal yield in the full mass region. The largest remaining specific background contribution comes from \({{{B} ^+}} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays (including \({{{B} ^+}} \!\rightarrow {{{J}/\psi }} {{\pi } ^+} \) and \({{{B} ^+}} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \)), where the pion is mistakenly identified as a kaon.

The \({{{K}} ^+} {\mu ^+\mu ^-} \) mass of the selected candidates is shown in Fig. 1. The signal is modelled by the sum of two Gaussian functions and a Gaussian function with power-law tails on both sides of the peak; these all share a common peak position. A Gaussian function is used to describe a small contribution from \({B} _{{c}} ^+\) decays around the known \({B} _{{c}} ^+\) mass [41]. Combinatorial background is described by an exponential function with a negative gradient. At low \(m_{K\mu \mu }\), the background is dominated by partially reconstructed b-hadron decays, e.g. from \(B^{\{+,0\}}\!\rightarrow K^{*\{+,0\}}{\mu ^+\mu ^-} \) decays in which the pion from the \(K^{*\{+,0\}}\) is not reconstructed. This background component is modelled using the upper tail of a Gaussian function. The shape of the background from \({{{B} ^+}} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays is taken from a sample of simulated events. Integrating the signal component in a \(\pm 40\) \({\mathrm {\,MeV\!/}c^2}\) window about the known \({{B} ^+} \) mass [41] yields 980 000 \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+} {\mu ^-} \) decays.

When computing \(m_{\mu \mu }\), a kinematic fit is performed to the selected candidates. In the fit, the \(m_{K\mu \mu }\) mass is constrained to the known \({{B} ^+} \) mass and the candidate is required to originate from one of the PVs in the event. For simulated \({{{B} ^+}} \!\rightarrow {{{J}/\psi }} {{{K}} ^+} \) decays, this improves the resolution in \(m_{\mu \mu }\) by about a factor of two.

Fig. 1 Reconstructed \({{{K}} ^+} {\mu ^+\mu ^-} \) mass of the selected \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) candidates. The fit to the data is described in the text

4 Differential decay rate

Following the notation of Ref. [42], the \(C\!P\)-averaged differential decay rate of \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+} {\mu ^-} \) decays as a function of the dimuon mass squared, \({q^2} \equiv m_{\mu \mu }^{2}\), is given by

$$\begin{aligned} \frac{\mathrm {d}\Gamma }{\mathrm {d}q^2}= & {} \frac{G_F^2\alpha ^2|V_{tb}V_{ts}^*|^2}{128\pi ^5} |{\varvec{k}}|\beta \left\{ \phantom {\left| 2\mathcal {C}_7\frac{m_b+m_s}{m_B+m_K} f_T(q^2)\right| ^2}\frac{2}{3}|{\varvec{k}}|^2\beta ^2 \left| \mathcal {C}_{10} f_+(q^2)\right| ^2 \right. \nonumber \\&+\, \frac{4m_{\mu }^2 (m_B^2-m_K^2)^2 }{q^2 m_B^2} \left| \mathcal {C}_{10} f_0(q^2)\right| ^2 \nonumber \\&{ } + \left. |{\varvec{k}}|^2 \left[ 1 - \frac{1}{3}\beta ^2 \right] \left| \mathcal {C}_9 f_+(q^2) + 2\mathcal {C}_7\frac{m_b+m_s}{m_B+m_K} f_T(q^2) \right| ^2 \right\} ,\nonumber \\ \end{aligned}$$
(1)

where \(|{\varvec{k}}|\) is the kaon momentum in the \({{B} ^+} \) meson rest frame. Here \(m_K\) and \(m_B\) are the masses of the \({{K}} ^+\) and \({{B} ^+} \) mesons while \(m_s\) and \(m_b\) refer to the s and b quark masses as defined in Ref. [42], \(m_{\mu }\) is the muon mass and \(\beta ^2=1-4m_\mu ^2/q^2\). The constants \(G_F\), \(\alpha \), and \(V_{tq}\) are the Fermi constant, the QED fine structure constant, and CKM matrix elements, respectively. The parameters \(f_{0,+,T}\) denote the scalar, vector and tensor \(B\rightarrow K\) form factors. The \(\mathcal {C}_i\) are the Wilson coefficients in an effective field theory description of the decay. The coefficient \(\mathcal {C}_9\) corresponds to the coupling strength of the vector current operator, \(\mathcal {C}_{10}\) to the axial-vector current operator and \(\mathcal {C}_7\) to the electromagnetic dipole operator. The operator definitions and the numerical values of the Wilson coefficients in the SM can be found in Ref. [43]. Right-handed Wilson coefficients, conventionally denoted \(\mathcal {C}'_i\), are suppressed in the SM and are ignored in this analysis. The Wilson coefficients \(\mathcal {C}_9\) and \(\mathcal {C}_{10}\) are assumed to be real. This implicitly assumes that there is no weak phase associated with the short-distance contribution. In general, \(C\!P\)-violating effects are expected to be small across the \(m_{\mu \mu }\) range with the exception of the region around the \(\rho \) and \(\omega \) resonances, which enter with different strong and weak phases [44]. The small size of the \(C\!P\) asymmetry between \({{B} ^-} \) and \({{B} ^+} \) decays is confirmed in Ref. [45]. In the present analysis, there is no sensitivity to \(C\!P\)-violating effects at low masses and therefore the phases of the resonances are taken to be the same for \({{B} ^+} \) and \({{B} ^-} \) decays throughout.

Vector resonances, which produce dimuon pairs via a virtual photon, mimic a contribution to \(\mathcal {C}_9\). These long-distance hadronic contributions to the \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decay are taken into account by introducing an effective Wilson coefficient in place of \(\mathcal {C}_{9}\) in Eq. 1,

$$\begin{aligned} \mathcal {C}_{9}^\mathrm{eff} = \mathcal {C}_{9}+Y({q^2}), \end{aligned}$$
(2)

where the term \(Y(q^2)\) describes the sum of resonant and continuum hadronic states appearing in the dimuon mass spectrum. In this analysis \(Y({q^2})\) is replaced by the sum of vector meson resonances j such that

$$\begin{aligned} \mathcal {C}_{9}^\mathrm{eff} = \mathcal {C}_{9} + \sum \limits _{j} \eta _{j}e^{i\delta _{j}} A_{j}^\mathrm{res}({q^2}), \end{aligned}$$
(3)

where \(\eta _{j}\) is the magnitude of the resonance amplitude and \(\delta _j\) its phase relative to \(\mathcal {C}_9\). These phase differences are among the main results of this paper. The \(q^2\) dependence of the magnitude and phase of the resonance is parameterised by \(A_{j}^\mathrm{res}({q^2})\). The resonances included are the \(\omega \), \(\rho ^0\), \(\phi \), \({{{J}/\psi }} \), \({\psi {(2S)}} \), \(\psi (3770)\), \(\psi (4040)\), \(\psi (4160)\) and \(\psi (4415)\). Contributions from other broad resonances and hadronic continuum states are ignored, as are contributions from weak annihilation [46,47,48]. No systematic uncertainties are attributed to these assumptions, which are part of the model that defines the analysis framework of this paper.
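As an illustration of how Eq. 1 can be evaluated once \(\mathcal {C}_{9}^\mathrm{eff}\) is known, the following Python sketch computes the \(q^2\)-dependent part of the rate, omitting the constant prefactor. The meson and muon masses are approximate, the quark masses are placeholders rather than the values of Ref. [42], and the function names are hypothetical. The kaon momentum is obtained from standard two-body kinematics, which is an assumption about how \(|{\varvec{k}}|\) is computed in practice.

```python
import math

# Approximate masses in GeV/c^2 (illustrative inputs, not the analysis values).
M_B, M_K, M_MU = 5.279, 0.494, 0.1057


def kaon_momentum(q2, m_b_meson=M_B, m_k=M_K):
    """|k| in the B+ rest frame from two-body kinematics (Kallen function)."""
    lam = (m_b_meson**2 - m_k**2 - q2) ** 2 - 4.0 * m_k**2 * q2
    return math.sqrt(max(lam, 0.0)) / (2.0 * m_b_meson)


def decay_rate_shape(q2, c9_eff, c10, c7, f_plus, f_zero, f_tensor,
                     m_b_quark=4.2, m_s_quark=0.1):
    """q2-dependent part of Eq. 1; the constant prefactor is omitted.

    The quark masses are placeholder values. c9_eff may be complex.
    """
    k = kaon_momentum(q2)
    beta2 = 1.0 - 4.0 * M_MU**2 / q2
    beta = math.sqrt(beta2)
    # First term: axial-vector current with the vector form factor.
    axial = (2.0 / 3.0) * k**2 * beta2 * abs(c10 * f_plus) ** 2
    # Second term: helicity-suppressed contribution with the scalar form factor.
    scalar = (4.0 * M_MU**2 * (M_B**2 - M_K**2) ** 2 / (q2 * M_B**2)
              * abs(c10 * f_zero) ** 2)
    # Third term: vector current plus electromagnetic dipole contribution.
    vector_amp = (c9_eff * f_plus
                  + 2.0 * c7 * (m_b_quark + m_s_quark) / (M_B + M_K) * f_tensor)
    vector = k**2 * (1.0 - beta2 / 3.0) * abs(vector_amp) ** 2
    return k * beta * (axial + scalar + vector)
```

Because only \(|\mathcal {C}_{10}|^2\) enters, the sketch returns the same value for \(\pm \mathcal {C}_{10}\), whereas the sign of \(\mathcal {C}_{9}\) matters through its interference with the \(\mathcal {C}_7\) term and with the resonances.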

The function \(A_{j}^\mathrm{res}({q^2})\) is taken to have the form of a relativistic Breit–Wigner function for the \(\omega \), \(\rho ^0\), \(\phi \), \({{{J}/\psi }} \), \({\psi {(2S)}} \), \(\psi (4040)\), \(\psi (4160)\) and \(\psi (4415)\) resonances,

$$\begin{aligned} A_{j}^\mathrm{res}({q^2}) = \frac{m_{0\,j}\Gamma _{0\,j}}{(m_{0\,j}^{2}-{q^2}) - i m_{0\,j}\Gamma _{j}({q^2})}, \end{aligned}$$
(4)

where \(m_{0\,j}\) is the pole mass of the jth resonance and \(\Gamma _{0\,j}\) its natural width. The running width \(\Gamma _{j}(q^{2})\) is given by

$$\begin{aligned} \Gamma _{j}({q^2}) = \frac{p}{p_{0\,j}}\frac{m_{0\,j}}{\sqrt{{q^2}}}\Gamma _{0\,j}, \end{aligned}$$
(5)

where p is the momentum of the muons in the rest frame of the dimuon system evaluated at q, and \(p_{0\,j}\) is the momentum evaluated at the mass of the resonance. To account for the open charm threshold, the lineshape of the \(\psi (3770)\) resonance is described by a Flatté function [49] with a width defined as

$$\begin{aligned} \Gamma _{\psi (3770)}({q^2}) = \frac{p}{p_{0\,j}}\frac{m_{0\,j}}{\sqrt{{q^2}}}\left[ \Gamma _{1} + \Gamma _{2}\sqrt{\frac{1-(4m_{D}^{2}/{q^2})}{1-(4m_{D}^{2}/q^2_{0})}} \right] \,, \end{aligned}$$
(6)

where \(m_D\) is the mass of the \({D} ^0\) meson and \(q^2_{0}\) is the \(q^2\) value at the pole mass of the \(\psi (3770)\). The coefficients \(\Gamma _{1}=0.3{\mathrm {\,MeV\!/}c^2} \) and \(\Gamma _{2} = 27{\mathrm {\,MeV\!/}c^2} \) are taken from Ref. [41] and correspond to the sum of the partial widths of the \(\psi (3770)\) to states below and above the open charm threshold. For \({q^2} < 4 m_D^2\), the phase-space factor accompanying \(\Gamma _2\) in Eq. 6 becomes complex.
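The lineshapes of Eqs. 3 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the analysis: the function names are hypothetical, the muon mass is approximate, and below the open charm threshold the principal branch of the complex square root is used, a sign convention that the text does not specify.

```python
import cmath
import math

M_MU = 0.1057  # muon mass in GeV/c^2 (approximate)


def muon_momentum(q2):
    """Muon momentum in the dimuon rest frame for dimuon mass squared q2."""
    return math.sqrt(max(q2 / 4.0 - M_MU**2, 0.0))


def running_width(q2, m0, gamma0):
    """Eq. 5: mass-dependent width."""
    return muon_momentum(q2) / muon_momentum(m0**2) * m0 / math.sqrt(q2) * gamma0


def breit_wigner(q2, m0, gamma0):
    """Eq. 4: relativistic Breit-Wigner amplitude."""
    return m0 * gamma0 / ((m0**2 - q2) - 1j * m0 * running_width(q2, m0, gamma0))


def flatte_width(q2, m0, gamma1, gamma2, m_d):
    """Eq. 6: width of the psi(3770); complex for q2 < 4 m_D^2.

    The principal branch of the square root is an assumption of this sketch.
    """
    ratio = cmath.sqrt((1.0 - 4.0 * m_d**2 / q2) / (1.0 - 4.0 * m_d**2 / m0**2))
    kinematic = muon_momentum(q2) / muon_momentum(m0**2) * m0 / math.sqrt(q2)
    return kinematic * (gamma1 + gamma2 * ratio)


def c9_effective(q2, c9, resonances):
    """Eq. 3: resonances is a list of (eta, delta, m0, gamma0) tuples."""
    total = complex(c9)
    for eta, delta, m0, gamma0 in resonances:
        total += eta * cmath.exp(1j * delta) * breit_wigner(q2, m0, gamma0)
    return total
```

At the pole mass the running width reduces to \(\Gamma _{0\,j}\), so that \(A_{j}^\mathrm{res}\) is purely imaginary there and the resonant term in Eq. 3 equals \(i\,\eta _j e^{i\delta _j}\).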

The form factors are parameterised according to Ref. [50] as

$$\begin{aligned} f_0({q^2})&= \frac{1}{1 - {q^2}/m_{B_{s0}^*}^2} \sum \limits _{i=0}^{N-1} b^{0}_i z^i \,, \end{aligned}$$
(7)
$$\begin{aligned} f_{+,T}({q^2})&= \frac{1}{1 - {q^2}/m_{B_s^*}^2} \sum \limits _{i=0}^{N-1} b^{+,T}_i \left[ z^i - (-1)^{i - N} \left( \frac{i}{N}\right) z^{N} \right] \, , \end{aligned}$$
(8)

with, for this analysis, \(N=3\). Here \(m_{B_s^*} (m_{B_{s0}^*})\) is the mass of the lowest-lying excited \(B_s\) meson with \(J^P=1^- (0^+)\). The coefficients \(b^{+}_{i}\) are allowed to vary in the fit to the data subject to constraints from Ref. [42], whereas the coefficients \(b^{0}_{i}\) and \(b^{T}_{i}\) are fixed to their central values. The function z is defined by the mapping

$$\begin{aligned} z({q^2}) \equiv \frac{\sqrt{t_+ - {q^2}} - \sqrt{t_+ - t_0}}{\sqrt{t_+ - {q^2}} + \sqrt{t_+ - t_0}} \end{aligned}$$
(9)

with

$$\begin{aligned} t_+ \equiv (m_B + m_K)^2 \end{aligned}$$
(10)

and

$$\begin{aligned} t_0 \equiv (m_B + m_K)(\sqrt{m_B} - \sqrt{m_K})^2~. \end{aligned}$$
(11)
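The form-factor parameterisation of Eqs. 7 to 11 can be sketched as follows. The sketch takes \(t_+\) to be the \(BK\) pair-production threshold, \((m_B + m_K)^2\), which is the standard choice for this conformal mapping. The masses are approximate, and the coefficients and pole masses are supplied by the caller; no values from Ref. [42] are reproduced here.

```python
import math

M_B, M_K = 5.279, 0.494  # GeV/c^2, approximate


def z_map(q2, m_b=M_B, m_k=M_K):
    """Eq. 9, with t_+ = (m_B + m_K)^2 and t_0 from Eq. 11."""
    t_plus = (m_b + m_k) ** 2
    t_zero = (m_b + m_k) * (math.sqrt(m_b) - math.sqrt(m_k)) ** 2
    a = math.sqrt(t_plus - q2)
    b = math.sqrt(t_plus - t_zero)
    return (a - b) / (a + b)


def f_zero(q2, coeffs, m_pole):
    """Eq. 7: scalar form factor with N = len(coeffs)."""
    z = z_map(q2)
    series = sum(b_i * z**i for i, b_i in enumerate(coeffs))
    return series / (1.0 - q2 / m_pole**2)


def f_plus_or_tensor(q2, coeffs, m_pole):
    """Eq. 8: vector or tensor form factor with N = len(coeffs)."""
    n = len(coeffs)
    z = z_map(q2)
    series = sum(
        b_i * (z**i - (-1) ** (i - n) * (i / n) * z**n)
        for i, b_i in enumerate(coeffs)
    )
    return series / (1.0 - q2 / m_pole**2)
```

With this mapping \(z\) vanishes at \(q^2 = t_0\) and stays small over the whole kinematic range, which is why a truncation at \(N=3\) can be adequate.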

5 Fit to the \(m_{\mu \mu }\) distribution

In order to determine the magnitudes and phases of the different resonant contributions, a maximum likelihood fit in 538 bins is performed to the distribution of the reconstructed dimuon mass, \(m_{\mu \mu }^\mathrm{rec}\), of candidates with \(m_{K\mu \mu }\) in a \(\pm 40\) \({\mathrm {\,MeV\!/}c^2}\) window about the known \({{B} ^+} \) mass. The \(m^\mathrm{rec}_{\mu \mu }\) distribution of the \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decay is described by

$$\begin{aligned} R( m_{\mu \mu }^\mathrm{rec}, m_{\mu \mu } ) \otimes \left( \varepsilon (m_{\mu \mu }) \frac{\mathrm {d}\Gamma }{\mathrm {d}{q^2}} \frac{\mathrm {d}{q^2}}{\mathrm {d}m_{\mu \mu }} \right) \;, \end{aligned}$$
(12)

i.e. by Eq. 1, multiplied by the detector efficiency, \(\varepsilon \), as a function of the true dimuon mass, \(m_{\mu \mu }\), and convolved with the experimental mass resolution R discussed in Sect. 5.2.
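To make the structure of Eq. 12 concrete, the sketch below discretises it on a uniform grid of true dimuon mass. It uses a direct numerical convolution with a single Gaussian standing in for the resolution function, rather than the fast Fourier transform technique and resolution model of Sect. 5.2. The callables `rate_q2` and `efficiency` are hypothetical placeholders for \(\mathrm {d}\Gamma /\mathrm {d}q^2\) and \(\varepsilon (m_{\mu \mu })\).

```python
import math


def gaussian(x, sigma):
    """Unit-area Gaussian, used here as a stand-in for the resolution R."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def reconstructed_spectrum(m_grid, rate_q2, efficiency, sigma):
    """Discretised Eq. 12 on a uniform grid of true dimuon mass values.

    rate_q2(q2) and efficiency(m) are user-supplied callables.
    """
    step = m_grid[1] - m_grid[0]
    # eps(m) * dGamma/dq2 * dq2/dm, with the Jacobian dq2/dm = 2 m.
    true_spectrum = [efficiency(m) * rate_q2(m * m) * 2.0 * m for m in m_grid]
    # Direct convolution: sum over the true mass for each reconstructed mass.
    return [
        step * sum(gaussian(m_rec - m, sigma) * t
                   for m, t in zip(m_grid, true_spectrum))
        for m_rec in m_grid
    ]
```

The direct sum scales quadratically with the number of grid points, which is one reason a Fourier-based convolution is attractive for a fit that is evaluated many times.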

5.1 Signal model

The magnitudes and phases of the resonances are allowed to vary in the fit, as are the Wilson coefficients \(\mathcal {C}_9\) and \(\mathcal {C}_{10}\). As the contribution of \(\mathcal {C}_7\) to the total decay rate is small, it is fixed to its SM value of \(\mathcal {C}_7^\mathrm{SM} = -0.304\pm 0.006\) [43].

The form factor \(f_{+}({q^2})\) is constrained in the fit according to its value and uncertainty from Ref. [42]. The form factors \(f_{0}({q^2})\) and \(f_{T}({q^2})\) have a limited impact on the normalisation and shape of Eq. 1, and are fixed to their values from Ref. [42]. The masses and widths of the broad resonances above the open charm threshold are constrained according to their values in Ref. [51]. The masses and widths of the \(\rho \), \(\omega \) and \(\phi \) mesons and the widths of the \({{J}/\psi }\) and \(\psi {(2S)}\) mesons are fixed to their known values [41]. The large magnitude of the \({{J}/\psi }\) and \(\psi {(2S)}\) amplitudes makes the fit very sensitive to the position of the pole mass of these resonances. Due to some residual uncertainty on the momentum scale in the data, the pole masses of the \({{J}/\psi }\) and \(\psi {(2S)}\) mesons are allowed to vary in the fit.

Table 1 Resolution parameters of the different convolution regions in units of \({\mathrm {\,MeV\!/}c^2}\). The \(\alpha _\mathrm{l}\) and \(\alpha _\mathrm{u}\) parameters are shared between the \({{J}/\psi }\) and \(\psi {(2S)}\) regions. The parameters without uncertainties are fixed from fits to the simulated events

The short-distance component is normalised to the branching fraction of \({{{B} ^+}} \!\rightarrow {{{J}/\psi }} {{{K}} ^+} \) measured by the B-factory experiments [41]. After correcting for isospin asymmetries in the production of the \({{B} ^+} \) mesons at the \(\Upsilon (4S)\), the branching fraction is \({\mathcal {B}} ({{{B} ^+}} \!\rightarrow {{{J}/\psi }} {{{K}} ^+} )=(9.95\pm 0.32)\times 10^{-4}\) [52]. This is further multiplied by \({\mathcal {B}} ({{{J}/\psi }} \!\rightarrow {\mu ^+\mu ^-} ) = (5.96 \pm 0.03)\times 10^{-2}\) [41] to account for the decay of the \({{J}/\psi }\) meson. The branching fraction of the decay \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) via an intermediate resonance j is computed from the fit as

$$\begin{aligned}&\tau _{B} \frac{G_F^2\alpha ^2|V_{tb}V_{ts}^*|^2}{128\pi ^5} \int \limits _{4 m^2_\mu }^{(m_B - m_K)^{2}} |{\varvec{k}}|^3 \left[ \beta - \frac{1}{3}\beta ^3 \right] \nonumber \\&\quad \times \left| f_+(q^2) \right| ^2 \left| \eta _j \right| ^2 \left| A_j^\mathrm{res} ({q^2}) \right| ^2 \mathrm {d}{q^2} \,, \end{aligned}$$
(13)

where \(\tau _B\) is the lifetime of the \({{B} ^+} \) meson. The branching fractions of \({{{B} ^+}} \!\rightarrow \rho {{{K}} ^+} \), \({{{B} ^+}} \!\rightarrow \omega {{{K}} ^+} \), \({{{B} ^+}} \!\rightarrow \phi {{{K}} ^+} \) and \({{{B} ^+}} \!\rightarrow \psi (3770){{{K}} ^+} \) are also constrained assuming factorisation between the \(B \) decay and the subsequent decay of the intermediate resonance to a muon pair. These branching fractions are taken from Ref. [41].
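The normalisation inputs quoted above can be combined with a short calculation. The sketch assumes that the two branching-fraction uncertainties are uncorrelated and can be added in quadrature, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def product_with_uncertainty(a, b):
    """Product of two (value, uncertainty) pairs.

    Relative uncertainties are added in quadrature, assuming no correlation.
    """
    value = a[0] * b[0]
    relative = math.hypot(a[1] / a[0], b[1] / b[0])
    return value, value * relative


bf_b_to_jpsi_k = (9.95e-4, 0.32e-4)   # B(B+ -> J/psi K+), as quoted in the text
bf_jpsi_to_mumu = (5.96e-2, 0.03e-2)  # B(J/psi -> mu+ mu-), as quoted in the text

norm, norm_unc = product_with_uncertainty(bf_b_to_jpsi_k, bf_jpsi_to_mumu)
```

The product is about \(5.93\times 10^{-5}\), with a relative uncertainty of roughly 3% that comes almost entirely from the \({{{B} ^+}} \!\rightarrow {{{J}/\psi }} {{{K}} ^+} \) branching fraction.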

5.2 Mass resolution

The convolution of the resolution function with the signal model is implemented using a fast Fourier transform technique [53, 54]. The fit to the data is performed in three separate regions of dimuon mass: \(300 \le m_{\mu \mu }^\mathrm{rec} \le 1800{\mathrm {\,MeV\!/}c^2} \), \(1800 < m_{\mu \mu }^\mathrm{rec} \le 3400{\mathrm {\,MeV\!/}c^2} \) and \(3400 < m_{\mu \mu }^\mathrm{rec} \le 4700{\mathrm {\,MeV\!/}c^2} \).

To increase the speed of the fit, the resolution is treated as constant within these regions using the resolution at the \(\phi \), \({{J}/\psi }\) and \(\psi {(2S)}\) pole masses. The impact of this assumption on the measured phases of the \({{J}/\psi }\) and \(\psi {(2S)}\) resonances has been tested using pseudoexperiments and found to be negligible. This is to be expected as the spectra in all other regions vary slowly in comparison to the resolution function. The resolution is modelled using the sum of a Gaussian function, G, and a Gaussian function with power-law tails on the lower and upper side of the peak, C,

$$\begin{aligned}&R\left( m^\mathrm{rec}_{\mu \mu },m_{\mu \mu }\right) = f\, G \left( m^\mathrm{rec}_{\mu \mu }, m_{\mu \mu }, \sigma _G\right) \nonumber \\&\quad + (1 - f ) \, C\left( m^\mathrm{rec}_{\mu \mu }, m_{\mu \mu }, \sigma _C, n_\mathrm{l}, n_\mathrm{u}, \alpha _\mathrm{l}, \alpha _\mathrm{u}\right) . \end{aligned}$$
(14)

The component with power-law tails is defined as

$$\begin{aligned}&C\left( m^\mathrm{rec}_{\mu \mu }, m_{\mu \mu }, \sigma _C, n_\mathrm{l}, n_\mathrm{u}, \alpha _\mathrm{l}, \alpha _\mathrm{u}\right) \nonumber \\&\quad \propto \left\{ \begin{array}{ll} A_\mathrm{l} \, \left( B_\mathrm{l} - \delta \right) ^{-n_\mathrm{l}} &{} ~\text {if}~\delta< \alpha _\mathrm{l} \\ \mathrm{exp}(-\delta ^2/2) &{} ~\text {if}~\alpha _\mathrm{l}< \delta < \alpha _\mathrm{u} \\ A_\mathrm{u} \, \left( B_\mathrm{u} + \delta \right) ^{-n_\mathrm{u}} &{} ~\text {if}~\delta > \alpha _\mathrm{u} \\ \end{array} \right. \,, \end{aligned}$$
(15)

with

$$\begin{aligned} \delta= & {} \left( m^\mathrm{rec}_{\mu \mu } - m_{\mu \mu }\right) /\sigma _C \nonumber \\ A_\mathrm{l, u}= & {} \left( \frac{n_\mathrm{l, u}}{|\alpha _\mathrm{l, u}|} \right) ^{n_\mathrm{l, u}} e^{-|\alpha _\mathrm{l, u}|^2/2 }\, \nonumber \\ B_\mathrm{l, u}= & {} \left( \frac{n_\mathrm{l, u}}{|\alpha _\mathrm{l, u}|} \right) - |\alpha _\mathrm{l, u}| \end{aligned}$$
(16)

and is normalised to unity.
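The power-law component of Eqs. 15 and 16 can be written compactly as below. This is an unnormalised sketch with hypothetical function names. It assumes that the lower tail starts at \(\delta = -|\alpha _\mathrm{l}|\) and the upper tail at \(\delta = +|\alpha _\mathrm{u}|\), which makes the function continuous at both transition points given the definitions of \(A_\mathrm{l,u}\) and \(B_\mathrm{l,u}\).

```python
import math


def power_law_coeffs(n, alpha):
    """A and B of Eq. 16 for one tail."""
    a = abs(alpha)
    return (n / a) ** n * math.exp(-0.5 * a * a), n / a - a


def power_law_gaussian(m_rec, m_true, sigma_c, n_l, n_u, alpha_l, alpha_u):
    """Unnormalised Eq. 15: Gaussian core with power-law tails on both sides."""
    delta = (m_rec - m_true) / sigma_c
    if delta < -abs(alpha_l):
        # Lower tail.
        a_l, b_l = power_law_coeffs(n_l, alpha_l)
        return a_l * (b_l - delta) ** (-n_l)
    if delta > abs(alpha_u):
        # Upper tail.
        a_u, b_u = power_law_coeffs(n_u, alpha_u)
        return a_u * (b_u + delta) ** (-n_u)
    # Gaussian core.
    return math.exp(-0.5 * delta * delta)
```

To use this shape in Eq. 14 it would still need to be normalised to unit area, for example by numerical integration over \(m^\mathrm{rec}_{\mu \mu }\).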

The parameters describing the resolution model for the \({{J}/\psi }\) and \(\psi {(2S)}\) regions (f, \(\sigma _C\), \(\sigma _G\), \(n_\mathrm{l}\), \(n_\mathrm{u}\), \(\alpha _\mathrm{l}\), \(\alpha _\mathrm{u}\)) are allowed to vary in the fit to the data. The parameters \(\alpha _\mathrm{l}\), \(\alpha _\mathrm{u}\) and f are shared between the \({{J}/\psi }\) and \(\psi {(2S)}\) regions. The resolution parameters for the \(\phi \) region cannot be determined from the data in this way and are instead fixed to their values in the simulation. The resulting values of the resolution parameters are summarised in Table 1. As a cross-check, a second fit to the \(m_{\mu \mu }^\mathrm{rec}\) distribution is performed using the full \(m_{\mu \mu }\) dependence of the resolution model in Eq. 12 and a numerical implementation of the convolution. In this fit to the data, the parameters of the resolution model are taken from simulated \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) events and fixed up to an overall scaling of the width of the resolution function. The two fits to \(m_{\mu \mu }^\mathrm{rec}\) yield compatible results.

Table 2 Parameters describing the efficiency to trigger, reconstruct and select simulated \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decays as a function of \(m_{\mu \mu }\)

5.3 Efficiency correction

The measured dimuon mass distribution is biased by the trigger, selection and detector geometry. The dominant sources of bias are the geometrical acceptance of the detector, the impact parameter requirements on the muons and the kaon and the \(p_{\mathrm { T}}\) dependence of the trigger. Figure 2 shows the efficiency to trigger, reconstruct and select candidates as a function of \(m_{\mu \mu }\) in a sample of simulated \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) candidates. The rise in efficiency with increasing dimuon mass originates from the requirement that one of the muons has \(p_{\mathrm { T}} > 1.48{\mathrm {\,GeV/}c} \) (\(p_{\mathrm { T}} > 1.76{\mathrm {\,GeV/}c})\) in the 2011 (2012) trigger. The drop in efficiency at large dimuon mass (small hadronic recoil) originates from the impact parameter requirement on the kaon. The efficiency is normalised to the efficiency at the \({{J}/\psi }\) meson mass and is parameterised as a function of \(m_{\mu \mu }\) by the sum of Legendre polynomials, \(P_i(x)\), up to sixth order,

$$\begin{aligned} \varepsilon (m_{\mu \mu }) = \sum \limits _{i=0}^{6} \varepsilon _{i} P_i \left( -1 + 2\left( \frac{m_{\mu \mu } - 2m_{\mu }}{m_B - m_K - 2 m_{\mu }}\right) \right) . \end{aligned}$$
(17)

The values of the parameters \(\varepsilon _i\) are fixed from simulated events and are given in Table 2.
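The efficiency parameterisation of Eq. 17 can be evaluated with the standard three-term recurrence for Legendre polynomials. The sketch below uses approximate masses and hypothetical function names; the coefficients \(\varepsilon _i\) must be supplied by the caller, since the values of Table 2 are not reproduced here.

```python
def legendre_values(x, max_order):
    """P_0(x) ... P_max_order(x) from the recurrence
    (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    values = [1.0, x]
    for n in range(1, max_order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: max_order + 1]


def efficiency(m_mumu, coeffs, m_b=5.279, m_k=0.494, m_mu=0.1057):
    """Eq. 17: Legendre series in the dimuon mass mapped onto [-1, 1].

    coeffs holds eps_0 ... eps_6; masses are approximate values in GeV/c^2.
    """
    x = -1.0 + 2.0 * (m_mumu - 2.0 * m_mu) / (m_b - m_k - 2.0 * m_mu)
    polys = legendre_values(x, len(coeffs) - 1)
    return sum(c * p for c, p in zip(coeffs, polys))
```

The argument of the polynomials runs from \(-1\) at the dimuon threshold, \(m_{\mu \mu } = 2m_\mu \), to \(+1\) at the kinematic endpoint, \(m_{\mu \mu } = m_B - m_K\), which is the interval on which the Legendre polynomials are orthogonal.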

Fig. 2 Efficiency to reconstruct, trigger and select simulated \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+} {\mu ^-} \) decays as a function of the true dimuon mass. The efficiency is normalised to the efficiency at the \({{J}/\psi }\) meson mass. The band indicates the efficiency parameterisation used in this analysis and its statistical uncertainty

5.4 Background model

The reconstructed dimuon mass distribution of the combinatorial background candidates is taken from the \(m_{K\mu \mu }\) upper mass sideband, \(5620< m_{K\mu \mu } < 5700{\mathrm {\,MeV\!/}c^2} \). When evaluating \(m^\mathrm{rec}_{\mu \mu }\), \(m_{K\mu \mu }\) is constrained to the centre of the sideband rather than to the known \({{B} ^+} \) mass. Combinatorial background comprising a genuine \({{J}/\psi }\) or \(\psi {(2S)}\) meson is described by the sum of two Gaussian functions. After applying the mass constraint, the means of the Gaussians do not correspond exactly to the known \({{J}/\psi }\) and \(\psi {(2S)}\) masses. Combinatorial background comprising a dimuon pair that does not originate from a \({{J}/\psi }\) or \(\psi {(2S)}\) meson is described by an ARGUS function [55]. The lineshape of the background from \({{{B} ^+}} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays, where the pion is mistakenly identified as a kaon, is taken from simulated events.

6 Results

The dimuon mass distributions and the projections of the fit to the data are shown in Fig. 3. Four solutions are obtained with almost equal likelihood values, which correspond to ambiguities in the signs of the \({{J}/\psi }\) and \(\psi {(2S)}\) phases. The values of the phases and branching fractions of the vector meson resonances are listed in Table 3. The posterior values for the \(f_+\) form factor are reported in Table 4. A \(\chi ^2\) test between the data and the model, with the binning scheme used in Fig. 3, results in a \(\chi ^2\) of 110 with 78 degrees of freedom. The largest disagreements between the data and the model are localised in the \(m_{\mu \mu }\) region close to the \({{J}/\psi }\) pole mass and around 1.8\({\mathrm {\,GeV/}c^2}\). The latter is discussed in Sect. 7.

Fig. 3 Fits to the dimuon mass distribution for the four different phase combinations that describe the data equally well. The plots show cases where the \({{J}/\psi }\) and \(\psi {(2S)}\) phases are both negative (top left); the \({{J}/\psi }\) phase is positive and the \(\psi {(2S)}\) phase is negative (top right); the \({{J}/\psi }\) phase is negative and the \(\psi {(2S)}\) phase is positive (bottom left); and both phases are positive (bottom right). The component labelled interference refers to the interference between the short- and long-distance contributions to the decay. The \(\chi ^2\) value of the four solutions is almost identical, with a value of 110 for 78 degrees of freedom

Table 3 Branching fractions and phases for each resonance in the fit for the four solutions of the \({{J}/\psi }\) and \(\psi {(2S)}\) phases. Both statistical and systematic contributions are included in the uncertainties. There is a common systematic uncertainty of 4.5%, dominated by the uncertainty on the \({{{B} ^+}} \rightarrow {{{J}/\psi }} {{{K}} ^+} \) branching fraction, which provides the normalisation for all measurements

The branching fraction of the short-distance component of the \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+} {\mu ^-} \) decay can be calculated by integrating Eq. 1 after setting the amplitudes of the resonances to zero. This gives

$$\begin{aligned} \mathcal {B}({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+} {\mu ^-} ) = (4.37\pm 0.15\mathrm {\,(stat)} \pm 0.23\mathrm {\,(syst)}) \times 10^{-7}, \end{aligned}$$

where the statistical uncertainty includes the uncertainty on the form-factor predictions. The systematic uncertainty on the branching fraction is discussed in Sect. 7. This measurement is compatible with the branching fraction reported in Ref. [22]. The two results are based on the same data and therefore should not be used together in global fits. The branching fraction reported in Ref. [22] is based on a binned measurement in \(q^2\) regions away from the narrow resonances (\(\phi \), \({{J}/\psi }\) and \(\psi {(2S)}\)) and then extrapolated to the full \(q^2\) range. The contribution from the broad resonances was thus included in that result.

Table 4 Coefficients of the form factor \(f_{+}({q^2})\) as introduced in Eq. 8 with both prior (from Ref. [42]) and posterior values shown
Fig. 4 Two-dimensional likelihood profile for the Wilson coefficients \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\). The SM point is indicated by the blue marker. The intervals correspond to \(\chi ^2\) probabilities with two degrees of freedom

Table 5 Summary of systematic uncertainties. The branching fraction refers to the short-distance SM contribution. A dash indicates that the uncertainty is negligible

A two-dimensional likelihood profile of \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\) is also obtained as shown in Fig. 4. The intervals correspond to \(\chi ^2\) probabilities assuming two degrees of freedom. Only the quadrant with \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\) values around the SM prediction is shown. The other quadrants can be obtained by reflection about the axes. The branching fraction of the short-distance component provides a good constraint on the sum of \(|\mathcal {C}_{9}|^{2}\) and \(|\mathcal {C}_{10}|^{2}\) (see Eq. 1). This gives rise to the annular shape in the likelihood profile in Fig. 4. In addition, the fit has a modest ability to distinguish between \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\) through the interference of the \(\mathcal {C}_{9}\) component with the resonances. The visible interference pattern excludes very small values of \(|\mathcal {C}_{9}|\). Overall, the correlation between \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\) is approximately 90%. The best-fit point for the Wilson coefficients (in a given quadrant of the \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\) plane) and the corresponding \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) branching fraction are the same for the four combinations of the \({{J}/\psi }\) and \(\psi {(2S)}\) phases. Including statistical and systematic uncertainties, the fit results deviate from the SM prediction at the level of 3.0 standard deviations. The uncertainty is dominated by the precision of the form factors. The best-fit point prefers a value of \(|\mathcal {C}_{10}|\) that is smaller than \(|\mathcal {C}_{10}^\mathrm{SM}|\) and a value of \(|\mathcal {C}_{9}|\) that is larger than \(|\mathcal {C}_{9}^\mathrm{SM}|\). However, if \(\mathcal {C}_{10}\) is fixed to its SM value, the fit prefers \(|\mathcal {C}_{9}| < |\mathcal {C}_{9}^\mathrm{SM}|\). This is consistent with the results of global fits to \(b\!\rightarrow s\ell ^+\ell ^-\) processes.
Given the model assumptions in this paper, the interference with the \({{J}/\psi }\) meson is not able to explain the low value of the branching fraction of the \({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decay while keeping the values of \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\) at their SM predictions.
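
As an illustration of what "\(\chi ^2\) probabilities assuming two degrees of freedom" implies for the contour levels, the sketch below converts a Gaussian significance into the corresponding \(\Delta \chi ^2\) threshold. It assumes that the contours are drawn at the coverage of 1, 2 and 3 Gaussian standard deviations, which is not stated explicitly in the text. The function name `delta_chi2_2dof` is hypothetical.

```python
import math

def delta_chi2_2dof(n_sigma: float) -> float:
    """Delta chi2 threshold of a two-parameter contour whose coverage
    equals that of a two-sided n_sigma Gaussian interval."""
    # Probability outside +-n_sigma for a one-dimensional Gaussian.
    tail = math.erfc(n_sigma / math.sqrt(2.0))
    # For 2 degrees of freedom the chi2 survival function is exp(-x/2),
    # so the threshold follows in closed form.
    return -2.0 * math.log(tail)

# Assumed contour levels (1, 2 and 3 sigma); these are illustrative choices.
for n in (1, 2, 3):
    print(f"{n} sigma -> delta chi2 = {delta_chi2_2dof(n):.2f}")
```

Under these assumptions the thresholds are about 2.30, 6.18 and 11.83, which are larger than the values 1, 4 and 9 that apply to a single parameter.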

7 Systematic uncertainties

Sources of systematic uncertainty are considered separately for the phase and branching fraction measurements. In both cases, the largest systematic uncertainties are accounted for in the statistical uncertainty as they are included as nuisance parameters in the fit. For smaller sources of uncertainty, the fit is repeated with variations of the inputs and the difference is assigned as a systematic uncertainty. A summary of the remaining systematic uncertainties can be found in Table 5.

The parameters governing the behaviour of the tails of the resolution function are particularly correlated with the phases. The systematic uncertainty on the resolution model is included in the statistical uncertainty by allowing the resolution parameter values to vary in the fit. If the tail parameters are fixed to their central values, the statistical uncertainties on the phase measurements decrease by approximately 20%. The choice of parameterisation for the resolution model is validated using a large sample of simulated events and no additional uncertainty is assigned for the choice of model. For the branching fraction measurement, the uncertainty arising from the resolution model is negligible compared to other sources of systematic uncertainty.

Similarly to the resolution model, the systematic uncertainty associated with the knowledge of the \(f_{+}({q^2})\) form factor is included in the statistical uncertainty. If the form-factor parameters are fixed to their best-fit values, the statistical uncertainties on the phases decrease by 4% (1%) for the \({{J}/\psi }\) (\(\psi {(2S)}\)) measurements. For the branching fraction, the uncertainty is 2%, which is similar in size to the statistical uncertainty.
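
The quoted reductions can be translated into an approximate share of the total uncertainty that is carried by the nuisance parameters. The sketch below assumes that the uncertainty with fixed parameters and the nuisance-parameter contribution add in quadrature, which is only an approximation for a correlated likelihood fit. The helper `nuisance_share` is a hypothetical name.

```python
import math

def nuisance_share(fractional_decrease: float) -> float:
    """Share of the total uncertainty attributed to the nuisance parameters,
    assuming total^2 = fixed^2 + nuisance^2 (quadrature sum, an assumption)."""
    fixed = 1.0 - fractional_decrease  # uncertainty left after fixing them
    return math.sqrt(1.0 - fixed ** 2)

# Reductions quoted in the text for the phase measurements.
cases = [
    ("resolution tails, phases", 0.20),
    ("form factors, J/psi phase", 0.04),
    ("form factors, psi(2S) phase", 0.01),
]
for label, decrease in cases:
    print(f"{label}: {nuisance_share(decrease):.2f} of the total uncertainty")
```

With this assumption, a 20% reduction corresponds to a nuisance contribution of about 0.60 of the total uncertainty, while reductions of 4% and 1% correspond to about 0.28 and 0.14.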

At around \(m_{\mu \mu } =1.8\) \({\mathrm {\,GeV/}c^2}\) there is a small discrepancy between the data and the model (see Fig. 3). This is interpreted as a possible contribution from excited \(\rho \), \(\omega \) or \(\phi \) resonances. Given the limited knowledge of the masses and widths of the states in this region, these broad states are neglected in the nominal fit. They are, however, visible in \({{e} ^+{e} ^-} \rightarrow \mathrm{hadrons}\) vacuum polarisation data [41]. To test the effect of such states on the phases of the \({{J}/\psi }\) and \(\psi {(2S)}\) mesons, an additional relativistic Breit–Wigner amplitude is included with a mass and width that are allowed to vary in the fit. The inclusion of this Breit–Wigner amplitude marginally improves the fit quality around \(m_{\mu \mu } =1.8\) \({\mathrm {\,GeV/}c^2}\) and changes the \({{J}/\psi }\) (\(\psi {(2S)}\)) phase by 40% (20%) of its statistical uncertainty, which is added as a systematic effect. The magnitude of the amplitude is not statistically significant and its mass and width do not correspond to a known state. The phases of the other resonances in the fit have larger statistical uncertainties and the inclusion of this additional amplitude has a negligible effect on their fit values. Given that the contribution of this amplitude is small compared to the short-distance component, its effect on the branching fraction is only around 1%.

Other, smaller systematic uncertainties include modelling of the combinatorial background, calculation of the efficiency as a function of \(q^2\) and the uncertainty on the \({{{B} ^+}} \!\rightarrow {{{J}/\psi }} {{{K}} ^+} \) branching fraction. The latter affects the branching fraction measurement and is obtained from Ref. [52], which results in a \(4\%\) uncertainty.

8 Conclusions

This paper presents the first measurement of the phase difference between the short- and long-distance contributions to the \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decay. The measurement is performed using a binned maximum likelihood fit to the dimuon mass distribution of the decays. The long-distance contributions are modelled as the sum of relativistic Breit–Wigner amplitudes representing different vector meson resonances decaying to muon pairs, each with its own magnitude and phase. The short-distance contribution is expressed in terms of an effective field theory description of the decay with the Wilson coefficients \(\mathcal {C}_9\) and \(\mathcal {C}_{10}\), which are taken to be real. These are left free in the fit, while all other components are set to their corresponding SM values. The \(B\rightarrow K\) hadronic form factors are constrained in the fit to the predictions from Ref. [42].

The fit results in four approximately degenerate solutions corresponding to ambiguities in the signs of the \({{J}/\psi }\) and \(\psi {(2S)}\) phases. The values of the \({{J}/\psi }\) phases are compatible with \(\pm \tfrac{\pi }{2}\), which means that the interference with the short-distance component in dimuon mass regions far from their pole masses is small. The negative solution of the \({{J}/\psi }\) phase agrees qualitatively with the prediction of Ref. [47], where long-distance contributions are calculated at negative \(q^2\) and extrapolated to the \(q^2\) region below the \({{J}/\psi }\) pole-mass using a hadronic dispersion relation. The fit model, which includes the conventional \(J^{PC}=1^{--}\) \(c\bar{c}\) resonances, is found to describe the data well, with no significant evidence for the decays \({{{B} ^+}} \!\rightarrow \psi (4040){{{K}} ^+} \) or \({{{B} ^+}} \!\rightarrow \psi (4415){{{K}} ^+} \). The values of the \(\psi (3770)\) and \(\psi (4160)\) phases are compatible with those reported in Ref. [13].
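
The statement that a phase of \(\pm \tfrac{\pi }{2}\) implies little interference far from the pole can be checked with a toy calculation. The sketch below is not the fit model of this paper. It assumes a constant-width relativistic Breit–Wigner shape, a real short-distance amplitude `c9`, and an interference term of the form \(2\,\mathrm{Re}[\mathcal {C}_{9}\,(\eta e^{i\delta } A_\mathrm{res})^{*}]\), with form factors and kinematic factors omitted. The function names, the magnitude `eta`, the value of `c9` and the approximate \({{J}/\psi }\) mass and width are illustrative assumptions.

```python
import cmath
import math

def breit_wigner(q2: float, m0: float, gamma0: float) -> complex:
    """Relativistic Breit-Wigner with a constant width (a simplification)."""
    return m0 * gamma0 / ((m0 ** 2 - q2) - 1j * m0 * gamma0)

def interference(q2, c9, eta, delta, m0, gamma0):
    """Toy interference term 2 Re[c9 * conj(eta * exp(i delta) * BW)], c9 real."""
    resonance = eta * cmath.exp(1j * delta) * breit_wigner(q2, m0, gamma0)
    return 2.0 * (c9 * resonance.conjugate()).real

# Illustrative inputs: approximate J/psi mass and width in GeV, arbitrary eta and c9.
M_JPSI, GAMMA_JPSI = 3.097, 9.3e-5
Q2_FAR = 6.0  # GeV^2, well below the pole at about 9.6 GeV^2

at_zero = interference(Q2_FAR, c9=4.0, eta=1.0, delta=0.0,
                       m0=M_JPSI, gamma0=GAMMA_JPSI)
at_half_pi = interference(Q2_FAR, c9=4.0, eta=1.0, delta=math.pi / 2,
                          m0=M_JPSI, gamma0=GAMMA_JPSI)
print(f"ratio of interference terms: {abs(at_half_pi / at_zero):.1e}")
```

Far from the pole the Breit–Wigner amplitude is almost purely real, so the toy interference term scales approximately as \(\cos \delta \). For a narrow state, a phase of \(\tfrac{\pi }{2}\) suppresses it by a factor of order \(m_0\Gamma _0/|m_0^2-q^2|\) relative to a phase of zero.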

The measurement of the Wilson coefficients prefers a value of \(|\mathcal {C}_{10}| < |\mathcal {C}_{10}^\mathrm{SM}|\) and a value of \(|\mathcal {C}_{9}| > |\mathcal {C}_{9}^\mathrm{SM}|\). If the value of \(\mathcal {C}_{10}\) is set to that of \(\mathcal {C}_{10}^\mathrm{SM}\), the measurement favours the region \(|\mathcal {C}_{9}| < |\mathcal {C}_{9}^\mathrm{SM}|\). These results are similar to those reported previously in global analyses. The interference between the short- and long-distance contributions in the regions around the \(\rho \), \(\omega \) and the \(\phi \), and in the region \({q^2} > m_{{\psi {(2S)}}}^2\), results in the exclusion of the hypothesis that \(\mathcal {C}_{9} = 0\) at more than 5 standard deviations. The dominant uncertainty on the measurements of \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\) arises from the knowledge of the \(B\rightarrow K\) hadronic form factors. The current data set allows the uncertainties on these hadronic parameters to be reduced. Improved inputs on the form factors from lattice QCD calculations and the larger data set that will be available at the end of the LHC Run 2 are needed to further improve the measurement of the Wilson coefficients.

A similar strategy to the one applied in this paper can be extended to other \(b\rightarrow s\ell ^+\ell ^-\) decay processes to understand the influence of hadronic resonances on global fits for \(\mathcal {C}_{9}\) and \(\mathcal {C}_{10}\). However, the situation is more complicated in decays where the strange hadron is not a pseudoscalar meson as the amplitudes corresponding to different helicity states of the hadron can have different relative phases.

Finally, a measurement of the branching fraction of the short-distance component of \({{B} ^+} \!\rightarrow {{{K}} ^+} {\mu ^+\mu ^-} \) decays is also reported and is found to be

$$\begin{aligned} \mathcal {B}({{{B} ^+}} \!\rightarrow {{{K}} ^+} {\mu ^+} {\mu ^-} ) = (4.37\pm 0.15\mathrm {\,(stat)} \pm 0.23\mathrm {\,(syst)}) \times 10^{-7}\; , \end{aligned}$$

where the first uncertainty is statistical and the second is systematic. In contrast to previous analyses, the measurement is performed across the full \(q^2\) region accounting for the interference with the long-distance contributions and without any veto of resonance-dominated regions of the phase space. The value of the branching fraction is found to be compatible with previous measurements [22], but smaller than the SM prediction [42].