First observations of the rare decays $B^+->K^+\pi^+\pi^-\mu^+\mu^-$ and $B^+->\phi K^+\mu^+\mu^-$

First observations of the rare decays $B^+\rightarrow K^+\pi^+\pi^-\mu^+\mu^-$ and $B^+\rightarrow \phi K^+\mu^+\mu^-$ are presented using data corresponding to an integrated luminosity of $3.0\,{fb}^{-1}$, collected by the LHCb experiment at centre-of-mass energies of $7$ and $8\mathrm{\,TeV}$. The branching fractions of the decays are \begin{eqnarray*} \mathcal{B}(B^+\rightarrow K^+\pi^+\pi^-\mu^+\mu^-)&=&(4.36\,^{+0.29}_{-0.27}\,\mathrm{(stat)}\pm 0.21\,\mathrm{(syst)}\pm0.18\,\mathrm{(norm)})\times10^{-7},\\ \mathcal{B}(B^+\rightarrow\phi K^+\mu^+\mu^-)&=&(0.82 \,^{+0.19}_{-0.17}\,\mathrm{(stat)}\,^{+0.10}_{-0.04}\,\mathrm{(syst)}\pm0.27\,\mathrm{(norm)}) \times10^{-7},\end{eqnarray*} where the uncertainties are statistical, systematic, and due to the uncertainty on the branching fractions of the normalisation modes. A measurement of the differential branching fraction in bins of the invariant mass squared of the dimuon system is also presented for the decay $B^+\rightarrow K^+\pi^+\pi^-\mu^{+}\mu^{-}$.


Introduction
The B + →K + π + π − µ + µ − and B + → φK + µ + µ − decays proceed via b → s flavour changing neutral currents (FCNC). 1 In the Standard Model (SM), FCNC decays are forbidden at the tree level and are only allowed as higher-order electroweak loop processes. In extensions of the SM, new particles can significantly change the branching fractions and angular distributions of the observed final-state particles. Due to their sensitivity to effects beyond the SM, semileptonic B decays involving FCNC transitions are currently under intense study at the LHCb experiment [1][2][3][4].
-1 -JHEP10(2014)064 mixing angle θ K 1 . However, due to the unknown resonance structure of the final-state hadrons, there are no inclusive theoretical predictions available for the branching fractions of the decays B + →K + π + π − µ + µ − and B + → φK + µ + µ − . This paper presents the first observations of the decays B + →K + π + π − µ + µ − and B + → φ(1020)K + µ + µ − , using a data sample collected by the LHCb experiment, corresponding to an integrated luminosity of 3.0 fb −1 . The data were recorded in the years 2011 and 2012 at centre-of-mass energies of 7 and 8 TeV, respectively. In addition, a measurement of the differential branching fraction dB(B + →K + π + π − µ + µ − )/dq 2 , where q 2 is the invariant mass squared of the dimuon system, is presented.

The LHCb detector
The LHCb detector [18] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream. The tracking system provides a measurement of momentum, p, with a relative uncertainty that varies from 0.4% at low momentum to 0.6% at 100 GeV/c. The minimum distance of a track to a primary pp interaction vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/p T ) µm, where p T is the the component of p transverse to the beam, in GeV/c. Charged hadrons are identified using two ring-imaging Cherenkov detectors (RICH) [19]. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [20]. The trigger [21] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.
Simulated events are used to determine trigger, reconstruction and selection efficiencies. In addition, simulated samples are used to estimate possible backgrounds from B meson decays that can mimic the final states of the signal decays. Simulated events are generated using Pythia [22,23] with a specific LHCb configuration [24]. Decays of hadronic particles are described by EvtGen [25], in which final-state radiation is generated using Photos [26]. The interaction of the generated particles with the detector and its response are implemented using the Geant4 toolkit [27,28] as described in ref. [29].

Selection of signal candidates
The B + →K + π + π − µ + µ − and B + → φK + µ + µ − signal candidates are first required to pass the hardware trigger stage, which selects muons with p T > 1.76 GeV/c. In the subsequent software trigger stage, at least one of the final-state hadrons (muons) is required to have both p T > 1.6 GeV/c (1.0 GeV/c) and IP larger than 100 µm with respect to any PV in -2 -JHEP10(2014)064 the event. A multivariate algorithm [30] is used to identify secondary vertices that are consistent with the decay of a b hadron with muons in the final state.
Signal candidates are formed by combining two muons of opposite charge with three charged hadrons. Reconstructed signal candidate tracks must have significant displacement from any PV in the event. The signal candidate tracks are required to form a secondary vertex of good fit quality which is significantly displaced from the PV. Particle identification information from the RICH detectors (PID) is used to identify the final-state hadrons. For B + →K + π + π − µ + µ − decays, the invariant mass of the K + π + π − system is required to be below 2400 MeV/c 2 . For B + → φK + µ + µ − decays with φ → K + K − , the invariant mass of the K + K − system is required to be within 12 MeV/c 2 of the known φ meson mass [31]. This mass region contains almost entirely φ → K + K − meson decays with negligible background.
The final states of the signal decays can be mimicked by other B decays, which represent potential sources of background. Resonant decays, where the muon pair originates from either J/ψ or ψ(2S) meson decays, are removed by rejecting events where the invariant mass of the dimuon system is in the veto regions 2946 < m(µ + µ − ) < 3176 MeV/c 2 or 3586 < m(µ + µ − ) < 3766 MeV/c 2 . The radiative tails of the J/ψ (ψ(2S)) decays are suppressed by extending the lower edge of these veto regions down by 250 MeV/c 2 (100 MeV/c 2 ) if the reconstructed B + mass is smaller than 5230 MeV/c 2 . In the mass region 5330 < m(B + ) < 5450 MeV/c 2 the upper edge of the vetoes is extended up by 40 MeV/c 2 to reject a small fraction of misreconstructed J/ψ and ψ(2S) meson decays. The resonant decays can also be misreconstructed as signal if a muon from the charmonium decay is misidentified as a hadron and vice versa. To remove this potential background the invariant mass of the µ + π − or µ + K − system is calculated assigning the muon mass to the hadron. If the mass falls within 50 MeV/c 2 of the known J/ψ or ψ(2S) masses [31], the candidate is rejected.
The BDT training uses sWeighted [34] candidates from the control channel B + →J/ψ K + π + π − as a signal proxy and the high B + mass sideband (5529 < m(K + π + π − µ + µ − ) < 5780 MeV/c 2 ) of B + →K + π + π − µ + µ − candidates as a background proxy. The BDT uses geometric and kinematic variables in the training, including the p T of the final state tracks and their displacement from the PV. Additionally, the p T of the reconstructed B + candidate, as well as information on the quality of the decay vertex and its displacement are used. Requirements on the BDT response and the PID criteria, which discriminate between kaons and pions for the reconstructed final-state hadrons, are optimised simultaneously using the metric S/ √ S + B. Here, S and B denote the expected signal and background yields. The value of S is calculated using an estimate for the branching fraction of the decay B + → K 1 (1270) + µ + µ − . This branching fraction is determined by scaling that of the rare decay B 0 → K * 0 µ + µ − [1] by the branching fraction ratio of the radiative decays B + → K 1 (1270) + γ and B 0 → K * 0 γ [31].

JHEP10(2014)064
To determine the branching fractions of the signal decays, the normalisation modes B + →ψ(2S)K + , with the subsequent decay ψ(2S) → J/ψ (→ µ + µ − )π + π − , and B + → J/ψ φK + are used. The branching fraction of the decay B + →ψ(2S)K + is (6.27 ± 0.24) × 10 −4 [31], and the branching fraction of the decay B + → J/ψ φK + is (5.2 ± 1.7) × 10 −5 [31]. The final states of the normalisation modes are identical to those of the signal decays, which is beneficial since many systematic effects are expected to cancel. Both normalisation modes are selected in analogy to the signal decays except for additional mass requirements. For the ψ(2S) decay, the reconstructed π + π − µ + µ − mass is required to be within 60 MeV/c 2 of the known ψ(2S) mass. The reconstructed invariant mass of the dimuon system originating from the J/ψ meson decay is required to be within 50 MeV/c 2 of the known J/ψ mass.
4 Differential branching fraction of the decay The determination of the differential branching fraction dB(B + →K + π + π − µ + µ − )/dq 2 is performed in bins of q 2 , as given in table 1. Figure 1 shows the invariant mass distribution of B + →K + π + π − µ + µ − candidates in each q 2 bin studied. Signal yields are determined using extended maximum likelihood fits to the unbinned K + π + π − µ + µ − mass spectra. The m(K + π + π − µ + µ − ) distribution of the signal component is modelled using the sum of two Gaussian functions, each with a power-law tail on the low-mass side. The background component is modelled with an exponential function, where the reductions in efficiency due to the vetoes of the radiative tails of the charmonium decays are accounted for by using scale factors. The signal yield integrated over the full q 2 range is N Kππµµ = 367 +24 −23 . The statistical significance of the signal is in excess of 20 standard deviations, according to Wilks' theorem [35]. Figure 2a shows the fit to the mass distribution of the control channel B + →J/ψ K + π + π − that is used to determine the parameters describing the mass distribution of the B + →K + π + π − µ + µ − signal decay. To account for partially reconstructed decays at low masses, a Gaussian function is used in addition to the exponential to describe the background component. The yield of the control channel is 59 335 ± 343. Figure 2b shows the fit for the normalisation channel B + →ψ(2S)K + . To describe the mass shape, the same components are used as for the fit of the control decay and all mass shape parameters are allowed to vary in the fit. The yield of the normalisation channel is 5128 ± 67.
The differential branching fraction dB(B + →K + π + π − µ + µ − )/dq 2 in a q 2 bin of width ∆q 2 is Here, N sig is the yield of the signal channel in the given q 2 bin and N norm the yield of the normalisation channel. The efficiencies for the reconstruction and selection of the signal and normalisation channels are denoted by sig and norm , respectively. The efficiency for the signal decay is determined using simulated B + → K 1 (1270) + µ + µ − events generated according to ref. [17]; a separate efficiency ratio is calculated for each q 2 bin.  Figure 1. Invariant mass of B + →K + π + π − µ + µ − candidates in bins of q 2 with fit projections overlaid. The signal component (shaded light blue) is modelled by the sum of two Gaussian functions, each with a power-law tail at low mass. The background component (shaded dark blue) is modelled by an exponential function. In the q 2 ranges 4.30 < q 2 < 8.68 GeV 2 /c 4 , 10.09 < q 2 < 12.86 GeV 2 /c 4 , and 14.18 < q 2 < 19.00 GeV 2 /c 4 , scaling factors are applied to account for the vetoes of the radiative tails of the charmonium resonances, resulting in steps in the background mass shape. The lower right plot shows a separate fit to the signal decay integrated over all q 2 bins. The branching fraction for the ψ(2S) meson to decay to the final state π + π − µ + µ − is B(ψ(2S) → J/ψ (→ µ + µ − )π + π − ) = (2.016 ± 0.031) × 10 −2 [31].
The resulting differential branching fractions for the decay B + →K + π + π − µ + µ − are shown in figure 3 with numerical values given in table 1. Summation over all q 2 bins yields an integrated branching fraction of 3.43 +0. 23 −0.21 (stat) ± 0.15 (syst) ± 0.14 (norm) × 10 −7 , where the uncertainties are statistical, systematic, and due to the uncertainty on the normalisation channel. The fraction of signal events removed by the vetoes of the charmonium regions is determined from simulated B + → K 1 (1270) + µ + µ − events to be  2.75 +0.29 −0.28 ± 0.16 Table 1. Signal yields for the decay B + →K + π + π − µ + µ − and resulting differential branching fractions in bins of q 2 . The first contribution to the uncertainty is statistical, the second systematic, where the uncertainty due to the branching fraction of the normalisation channel is included. The q 2 binning used corresponds to the binning used in previous analyses of b → sµ + µ − decays [1][2][3]. Results are also presented for the q 2 range from 1 to 6 GeV 2 /c 4 , where theory predictions are expected to be most reliable.
(21.3 ± 1.5)%. The uncertainty on this number is determined from a variation of the angle θ K 1 and the form-factor parameters within their uncertainties. Correcting for the charmonium vetoes yields a total branching fraction of Since the systematic uncertainty due to the normalisation channel is significant, we also report the branching ratio of the signal channel with respect to its normalisation mode, which is determined to be Due to the low signal yield, no attempt is made to resolve the different contributions to the K + π + π − system in the K + π + π − µ + µ − final state. However, it is possible to obtain the m(K + π + π − ) distribution using the sPlot [34] technique. Figure 4 shows this distribution for the signal decay in the full q 2 region, as well as for the control decay B + →J/ψ K + π + π − .   Background-subtracted m(K + π + π − ) distributions for (a) the signal decay B + →K + π + π − µ + µ − and (b) the control channel B + →J/ψ K + π + π − . The vertical lines indicate the masses of the K 1 (1270) + and K 1 (1400) + resonances.
For the signal decay B + →K + π + π − µ + µ − the data are consistent with the presence of several broad and overlapping resonances.

Systematic uncertainties
The dominant systematic uncertainty comes from the branching fraction of the normalisation mode B + →ψ(2S)K + , which is known to a precision of 6%. This uncertainty is fully correlated between the q 2 bins and is quoted separately.
The systematic uncertainty introduced by the choice of signal mass model is estimated by re-evaluating the signal yield using a single Gaussian function with a power-law tail. To estimate the uncertainty of the background mass model, a linear mass shape is used instead of the nominal exponential function. The total systematic uncertainty assigned due to the modelling of the mass distribution is approximately 2%.
The majority of systematic effects bias the efficiency ratio norm / sig , which is determined using simulation. To account for differences between data and simulation, correc-

JHEP10(2014)064
tions based on data are applied to simulated events. The efficiency to identify kaons is corrected by using large D * + → D 0 (→ K − π + )π + control samples. Muon identification performance and tracking efficiency are corrected using J/ψ → µ + µ − decays. In addition, track multiplicity and vertex fit quality are weighted according to the control channel B + →J/ψ K + π + π − . The systematic uncertainties associated with these corrections are evaluated by determining the branching fraction without the correction and taking the full observed deviation as a systematic uncertainty. In total, they constitute a systematic uncertainty of around 1%. The software trigger is observed to be well described in simulation, but slight discrepancies are observed for the hardware stage. These are corrected by weighting the simulated samples according to the maximum muon p T . The branching fraction is recalculated without these weights, and the observed difference of 1% is assigned as the systematic uncertainty from the trigger simulation.
Additional systematic uncertainties stem from the fact that simulated B + → K 1 (1270) + µ + µ − events, modelled according to ref. [17], are used to determine the efficiency ratio norm / sig . To account for contributions other than the K 1 (1270) + to the K + π + π − system, events are weighted according to the m(K + π + π − ) distribution shown in figure 4. This results in a systematic uncertainty of 1-2%, depending on the q 2 range considered. The effect of a potentially different q 2 distribution of the signal decay is evaluated by defining the efficiency ratio using B + → K 1 (1270) + µ + µ − events generated according to a phase-space model. The observed deviation results in a systematic uncertainty of 1-2%.
5 Branching fraction of the decay B + → φK + µ + µ − The signal decay B + → φK + µ + µ − is expected to be rarer than the decay B + →K + π + π − µ + µ − as an ss quark pair must be created from the vacuum. Therefore, only the total branching fraction of this decay mode is determined. Figure 5a shows the B + → φK + µ + µ − signal candidates after the full selection. The signal yield is determined to be N sig = 25.2 +6.0 −5.3 using an extended maximum likelihood fit to the unbinned φK + µ + µ − mass distribution. The statistical significance of the signal, calculated using Wilks' theorem, is 6.6 σ. The signal component is modelled using the sum of two Gaussian functions with a tail described by a power law on the low-mass side. The background mass shape is modelled using a second-order Chebychev polynomial. The parameters describing the signal mass shape are fixed to those determined using the normalisation mode B + → J/ψ φK + , as shown in figure 5b. The yield of the normalisation mode is N norm = 1908 ± 63.
To determine the total branching fraction of the decay B + → φK + µ + µ − , the formula is used. Here, N sig denotes the signal yield determined in a fit where signal candidates are weighted by the relative efficiency norm / sig (q 2 ), according to their q 2 value. This is necessary since the efficiency ratio varies significantly over the full q 2 range. The weights are determined in bins of q 2 , with the same choice of q 2 bins as in table 1. Using the branching fraction of the normalisation channel, the integrated branching fraction is determined to be − 2 )%. This is calculated using simulated B + → φK + µ + µ − events generated according to a phase-space model. The uncertainty is estimated by comparison with the model given in ref. [17] for the decay B + → K 1 (1270) + µ + µ − and weighting to correct for the large mass of the φK + system. Accounting for the charmonium vetoes results in a total branching fraction of The branching fraction of the signal channel with respect to its normalisation mode is determined to be B(B + → φK + µ + µ − ) B(B + → J/ψ φK + ) = 1.58 +0.36 −0.32 (stat) +0.19 −0.07 (syst) × 10 −3 .

Systematic uncertainties
The main systematic uncertainty arises from the measurement of the branching fraction of the normalisation channel, which is known to 33% [31]. The systematic uncertainty due to the choice of signal mass model is determined by using a single Gaussian function with power-law tail on the low-mass side to determine the signal yield. For the background mass model, a first-order polynomial, instead of the nominal second-order polynomial, is used. The total systematic uncertainty from the model used to describe the m(φK + µ + µ − ) distribution is 3%. The majority of the systematic uncertainties affect the efficiency ratio norm / sig (q 2 ) and arise from the corrections based on data that are applied to simulation, as described in section 4.1. The systematic uncertainties caused by these corrections are determined to be 1% in total. The limited size of the simulated samples available to calculate the efficiency ratio introduces an uncertainty of 1.5%. Imperfect modelling of the hardware trigger is corrected for in the same way as for the measurement of B(B + →K + π + π − µ + µ − ) in section 4 and results in a systematic uncertainty of 1.5%. The efficiency ratio norm / sig (q 2 ) is determined using simulated B + → φK + µ + µ − events generated according to a phase-space model. The uncertainty due to the q 2 distribution in the bins is evaluated by weighting simulated events to reproduce the q 2 distribution of B + → K 1 (1270) + µ + µ − decays. This leads to a systematic uncertainty of 1.5%.