1 Introduction

The \({B} _{c} ^+\)  meson is made of the heaviest quarks that bind to form hadrons in the Standard Model (SM): beauty and charm quarks. The presence of two heavy quarks distinguishes the system from other \(B \)  mesons in theoretical calculations. Hence, measurements of the branching fractions of \({B} _{c} ^+\) decay modes can provide unique tests of the understanding of Quantum Chromodynamics.

The \({B} _{c} ^+\)  meson can decay in various ways as each constituent quark can undergo weak decay with the other as a spectator. The availability of large data samples at the LHC has allowed the observation of the case where the \(c \)  quark decays [1, 2], complementing a range of observations of transitions to final states mediated by \(b \)-quark decays. The \({B} _{c} ^+\) meson can also decay through the annihilation of the \(\overline{{b}}\) and \(c \)  quarks into a virtual \(W \)  boson, producing either leptonic, semileptonic or hadronic final states. The leptonic final states are challenging to study in a hadron collider environment, because of the missing energy carried by the neutrino, but are of great interest for investigation at a future high-luminosity \(e ^+e ^-\)  collider operating at the \(Z \)  pole [3,4,5,6].

Several \({B} _{c} ^+\)  decays receiving contributions from annihilation diagrams have been searched for at LHCb, including decays into hadronic final states [7,8,9] and semileptonic \({{B} _{c} ^+} \!\rightarrow {{D} ^+_{s}} {\mu ^+\mu ^-} \) decays [10] (the inclusion of charge conjugate processes is implied throughout the paper). However, there has been no previous search for \({B} _{c} ^+\)  decays to semileptonic final states mediated only by annihilation diagrams, such as nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays. A further interest in these processes is that they may receive contributions from resonant \({{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} \)  decays where the vector \({B} ^{*0}_{({s})}\)  state decays via a flavour changing neutral current transition into the \(\mu ^+\mu ^-\) final state [11]. Such \({B} ^{*0}_{({s})}\)  decays are highly suppressed in the SM [12, 13] but could be enhanced in the presence of physics beyond the SM.

In the SM, nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays occur through the annihilation of the \(\overline{{b}}\) and \(c \)  quarks into a virtual \(W \)  boson (that decays into the \(u \) and \(\overline{{d}}\)  quarks that produce the charged pion) with radiation of a virtual photon or \(Z \)  boson (that decays into a muon pair). Currently, there are no theoretical predictions for this decay mode. Weak annihilation contributions to \({B} ^+\)  decays into the same final state have been investigated as these can result in interesting phenomenology including \(C\!P\)-violation effects [14,15,16,17,18]. It is, however, unclear to what extent the theoretical approaches used in these studies are applicable to \({B} _{c} ^+\)  decays.

This paper describes the first search for nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays exploiting proton–proton (pp) collision data collected with the LHCb detector, corresponding to an integrated luminosity of \(9\,\text {fb} ^{-1} \). The analysis strategy is based on the reconstruction of \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) candidates, sorted into intervals of the dimuon invariant mass, \(m({\mu ^+\mu ^-})\), that are expected to be dominated by either nonresonant or resonant contributions. The intervals used are shown in Table 1 (natural units with \(\hbar = c = 1\) are used here and throughout the paper). These include regions around the \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\) and \(\psi {(2S)}\) resonances, where the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) decay mode is used as a control and normalisation channel and both the \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\) and \(\psi {(2S)}\) resonances are reconstructed in \(\mu ^+\mu ^-\) final states. The presence of \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) decays into the same final state allows an updated measurement of the ratio

$$\begin{aligned} R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}} \equiv \frac{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} )}{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} )} \,\text {,} \end{aligned}$$

which has previously been measured at LHCb with smaller data samples [19, 20]. The \(m({\mu ^+\mu ^-})\) intervals to select resonant decays correspond to about \(\pm 4\) times the dimuon invariant-mass resolution centred on the known resonance mass [21]. The intervals for the nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \)  decay mode are defined in terms of the dimuon mass squared, \(q^2\), and include two intervals (low and central) below the \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\) resonance, one (intermediate) between the \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\) and the \(\psi {(2S)}\) resonances, and one (high) above the \(\psi {(2S)}\) resonance. The latter lies above the production threshold for two charmed hadrons and will receive contributions from broad charmonium resonances.

Table 1 Ranges of dimuon invariant mass or mass squared (\(q^2\)) used in the analysis

For each of the intervals, a fit is performed using the \({B} _{c} ^+\) candidate invariant mass, \(m({{\pi } ^+} {\mu ^+\mu ^-})\), as discriminating observable. The yield, relative to that for the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) normalisation mode, is converted to a result on the relative branching fraction of the given \(q^2\) bin,

$$\begin{aligned} R_{{{\pi } ^+} {\mu ^+\mu ^-}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}} &\equiv \frac{\mathcal{B}({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} )}{\mathcal{B}({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} )} \\ &= \frac{N_{{{\pi } ^+} {\mu ^+\mu ^-}}}{N_{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}} \cdot \frac{\varepsilon _{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}}{\varepsilon _{{{\pi } ^+} {\mu ^+\mu ^-}}} \cdot \mathcal{B}({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {\mu ^+\mu ^-} )\,\text {,} \end{aligned} \tag{1}$$

where \(N\) indicates a yield, \(\varepsilon \) indicates the efficiency determined from simulation with data-driven corrections, and \(\mathcal{B}({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {\mu ^+\mu ^-} )\) is the known branching fraction of the \({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {\mu ^+\mu ^-} \) decay [21]. Here, \(\mathcal{B}({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} )\) indicates the \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) differential branching fraction integrated over the \(q^2\) range relevant for the given bin.
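The arithmetic of Eq. (1) can be sketched in a few lines. The function name and all numerical inputs here are hypothetical and serve only to illustrate how the yields, the efficiency ratio and the \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\) branching fraction combine; in particular, the branching-fraction value is a placeholder and not the value of Ref. [21].

```python
def relative_branching_fraction(n_sig, n_norm, eff_norm_over_sig, bf_jpsi_mumu):
    """Eq. (1): R = (N_sig / N_norm) * (eps_norm / eps_sig) * B(J/psi -> mu mu)."""
    if n_norm <= 0:
        raise ValueError("normalisation yield must be positive")
    return (n_sig / n_norm) * eff_norm_over_sig * bf_jpsi_mumu


# Hypothetical inputs, chosen only to illustrate the arithmetic.
r = relative_branching_fraction(
    n_sig=10.0,             # hypothetical signal yield in one q^2 interval
    n_norm=5000.0,          # hypothetical normalisation yield
    eff_norm_over_sig=1.2,  # hypothetical efficiency ratio eps_norm / eps_sig
    bf_jpsi_mumu=0.06,      # placeholder, not the value from Ref. [21]
)
print(f"R = {r:.3e}")
```

A negative fitted yield simply propagates to a negative central value of the ratio, which is why the function places no constraint on the sign of `n_sig`.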

2 Detector and simulation

The LHCb detector [22, 23] is a single-arm forward spectrometer covering the pseudorapidity range \(2<\eta <5\), designed for the study of particles containing \(b \) or \(c \)  quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region [24], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about \(4{\mathrm {\,T\,m}}\), and three stations of silicon-strip detectors and straw drift tubes [25, 26] placed downstream of the magnet. The tracking system provides a measurement of the momentum, \(p\), of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200\(\,\text {GeV}\). The minimum distance of a track to a primary pp collision vertex (PV), the impact parameter (IP), is measured with a resolution of \((15+29/p_{\textrm{T}})\,\upmu \text {m} \), where \(p_{\textrm{T}}\) is the component of the momentum transverse to the beam, in \(\,\text {GeV}\). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [27]. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [28]. The online event selection is performed by a trigger [29, 30], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a two-level software stage, which reconstructs the full event.

Simulation is used to optimise the event selection procedure, to model the shape of the \({B} _{c} ^+\) candidate invariant-mass distributions and to estimate efficiencies accounting for the effects of the detector acceptance, reconstruction and selection criteria. In the simulation, pp collisions are generated using Pythia  [31] with a specific LHCb configuration [32]. The production of \({B} _{c} ^+\)  mesons is simulated using the dedicated generator BcVegPy [33]. Decays of unstable particles are described by EvtGen  [34], in which final-state radiation is generated using Photos  [35]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [36, 37] as described in Ref. [38]. The underlying pp interaction is reused multiple times, with an independently generated signal decay for each [39].

The \({B} _{c} ^+\) candidates reconstructed in simulation are weighted to correct for discrepancies between data and simulation associated with the particle-identification [40], track-reconstruction [41] and hardware trigger [42] efficiencies. The simulation is also corrected such that the \({B} _{c} ^+\)  lifetime corresponds to the current experimental value [21, 43, 44]. Additional corrections are applied to account for discrepancies in \({B} _{c} ^+\) production kinematics, event track multiplicity and other observables used in the selection of \({B} _{c} ^+\)  candidates. These corrections are obtained using a multivariate algorithm [45], which is trained using \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) decays in background-subtracted data and simulation. After the corrections are applied, the simulated distributions of all variables used in the analysis are in good agreement with the data.

3 Candidate selection and background sources

Candidate \({{B} _{c} ^+} \rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays are triggered in the same manner as described in Ref. [46] for \({B} ^+\) decays to the same final state. The hardware stage of the trigger selects events containing at least one muon with high \(p_{\textrm{T}} \). The software stage of the trigger selects events containing at least one high-\(p_{\textrm{T}}\) muon that is inconsistent with originating from any PV. The events must contain at least one secondary vertex (formed by two or more of the final-state particles) that is significantly displaced from every PV. A multivariate algorithm [47, 48] is used to identify secondary vertices consistent with the decay of a \(b \) hadron.

The initial stages of the offline selection are also similar to those for \({{{B} ^+}} \rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays [46], except that less stringent impact parameter and flight distance requirements are imposed to account for the shorter \({B} _{c} ^+\)  lifetime compared to that of the \({{B} ^+} \)  meson. The \({B} _{c} ^+\)  candidates are formed from pairs of well-reconstructed oppositely charged tracks that are identified as muons together with a track identified as a pion, and are required to have a good-quality vertex. Each \({B} _{c} ^+\)  candidate is associated with the PV giving the smallest value of \(\chi ^2_\textrm{IP}\), which is defined as the difference in the vertex-fit \(\chi ^2\) of the PV reconstructed with and without the candidate. Each \({B} _{c} ^+\)  candidate must be consistent with originating from its associated PV, and have a momentum vector aligned with the direction between the primary and \({B} _{c} ^+\) decay vertices.

Each \({B} _{c} ^+\) candidate is required to have an invariant mass in the range \(6150< m({{\pi } ^+} {\mu ^+\mu ^-}) < 6700 \,\text {MeV} \). The expected signal resolution in \(m({{\pi } ^+} {\mu ^+\mu ^-})\) is about \(20\,\text {MeV} \). For the resonant \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) (\({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \)) decay mode, the \({B} _{c} ^+\) candidate mass is calculated from a kinematic fit in which the invariant mass of the \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\)  (\(\psi {(2S)}\)) candidate is constrained to the known \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\)  (\(\psi {(2S)}\)) mass [21], thereby improving the signal resolution by a factor of about two.

Combinatorial background arising from random combinations of tracks is suppressed using a multivariate classifier. A boosted decision tree (BDT) algorithm [49], as implemented in the XGBoost library [50], is trained to identify the \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) signal candidates irrespective of dimuon invariant mass. The signal sample used for the training comprises simulated nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays, resonant \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) decays, and \({{B} _{c} ^+} \!\rightarrow {{B} ^{*0}} ({\mu ^+\mu ^-}){{\pi } ^+} \) and \({{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{s}} ({\mu ^+\mu ^-}){{\pi } ^+} \) decays. The latter two populate the high \(q^2\) region, which is sparsely populated in the nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) simulation. The background training sample comprises data from low- and high-mass sideband regions corresponding to \(5550< m({{\pi } ^+} {\mu ^+\mu ^-}) < 5850 \,\text {MeV} \) and \(6700< m({{\pi } ^+} {\mu ^+\mu ^-}) < 7000\,\text {MeV} \), with vetoes to remove candidates containing \({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {\mu ^+\mu ^-} \) and \({\psi {(2S)}} \!\rightarrow {\mu ^+\mu ^-} \) decays. The features of the data used to classify \({B} _{c} ^+\)  candidates are: the \(p_{\textrm{T}}\) of the pion track; the \(p_{\textrm{T}}\) of the muon track with highest \(p_{\textrm{T}}\); the IPs of the muon tracks and of the \({B} _{c} ^+\)  candidate; the \({B} _{c} ^+\) flight distance; the vertex quality of the \({B} _{c} ^+\)  candidate; and the largest distance of closest approach between any two of the final-state particles.

Requirements on the BDT output and variables characterising the charged pion particle identification are optimised simultaneously using a grid search to obtain the best signal sensitivity. The requirements are optimised independently for the search for the nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays and for the measurement of \(R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\). For the search for the \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decay, the optimisation is based on the figure of merit \(\varepsilon / (5/2 + \sqrt{N_\textrm{B}})\) [51], where \(\varepsilon \) is the efficiency on the combined signal sample and \(N_\textrm{B}\) is the expected number of background candidates in the signal region (corresponding to an interval of about \(\pm 3\) times the \({B} _{c} ^+\)  invariant-mass resolution centred on the known \({B} _{c} ^+\)  mass [21]). The expected background yield is estimated by fitting a background-only model to the dataset excluding the signal region. For the measurement of \(R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\), the optimisation is based on the figure of merit \(N_\textrm{S}/\sqrt{N_\textrm{S}+N_\textrm{B}}\), where \(N_\textrm{S}\) and \(N_\textrm{B}\) correspond to the expected number of signal and background events in the signal region for the \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) decay. The expectation for \(N_\textrm{S}\) is calculated from the signal efficiency and the \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) decay yield from a fit to the data with only a loose requirement on the BDT output applied. The fit model is described below.
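The two figures of merit and the grid search can be illustrated with a short sketch. The function names, the layout of the scan and all numbers are hypothetical; in the analysis the efficiency and the expected background at each grid point come from simulation and from fits to data, which are not reproduced here.

```python
import math


def punzi_fom(efficiency, n_bkg, a=5.0):
    """eps / (a/2 + sqrt(N_B)); a = 5 reproduces the 5/2 quoted in the text."""
    return efficiency / (a / 2.0 + math.sqrt(n_bkg))


def significance_fom(n_sig, n_bkg):
    """N_S / sqrt(N_S + N_B)."""
    total = n_sig + n_bkg
    return n_sig / math.sqrt(total) if total > 0 else 0.0


def best_working_point(grid, fom):
    """Return the grid key that maximises the figure of merit.

    `grid` maps a (bdt_cut, pid_cut) pair to the two arguments of `fom`.
    """
    return max(grid, key=lambda cuts: fom(*grid[cuts]))


# Hypothetical scan: (BDT cut, PID cut) -> (signal efficiency, expected background).
scan = {
    (0.50, 0.2): (0.60, 400.0),
    (0.80, 0.2): (0.45, 60.0),
    (0.90, 0.4): (0.30, 4.0),
    (0.95, 0.6): (0.15, 1.0),
}
best = best_working_point(scan, punzi_fom)
print("best (BDT, PID) requirement:", best)
```

The first figure of merit does not depend on the unknown signal branching fraction, which makes it suitable for a search, whereas the second requires an expected signal yield and is therefore natural for the already observed \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) mode.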

The optimal requirements for the nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) search and for the \(R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\) measurement are referred to subsequently as the \({{\pi } ^+} {\mu ^+\mu ^-} \) and the \({\psi {(2S)}} {{\pi } ^+} \) working points (WPs), respectively. At the \({{\pi } ^+} {\mu ^+\mu ^-} \) WP, the classifier has a combinatorial background rejection power of \(99.9\%\), whilst retaining \(27.1\%\) of signal decays. At the \({\psi {(2S)}} {{\pi } ^+} \) WP, the classifier has a combinatorial background rejection power of \(98.8\%\), whilst retaining \(63.3\%\) of signal decays. In both cases, the particle identification requirements have a pion efficiency around \(90\%\), with a kaon misidentification rate around \(10\%\). The particle identification requirements applied on the muon candidates have an efficiency around \(99\%\), with a pion misidentification rate below \(1\%\). To measure branching fraction ratios, the same selection is used for the signal and for the normalisation modes, such that potential systematic biases on the measurement are reduced. For both WPs, each selected event contains only one retained candidate.

Backgrounds from partially reconstructed decays such as \({{B} _{c} ^+} \!\rightarrow {{\rho } ^+} {\mu ^+\mu ^-} \) (with \({\rho } ^+\) decaying into \({{\pi } ^+} {{\pi } ^0} \)) for the nonresonant signal mode, \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\rho } ^+} \) for the normalisation mode, or \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\rho } ^+} \) for the signal \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) mode, have a reconstructed \({B} _{c} ^+\) candidate invariant mass that lies more than \(100\,\text {MeV} \) below the known \({B} _{c} ^+\)  mass [21]. These sources of background predominantly populate a region outside, but have a tail that extends into, the fit range used in the analysis. This contribution is neglected in the fits to the nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) candidates, but is accounted for in the resonant mode fits. Backgrounds with a missing neutrino or two or more missing massive particles can also be a source of partially reconstructed background, but their contributions are negligible in the fit range.

Due to the imperfect pion-kaon separation, misidentified backgrounds can in principle arise from \({{B} _{c} ^+} \!\rightarrow {{K} ^+} {\mu ^+\mu ^-} \) decays for the nonresonant signal mode, \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{K} ^+} \) for the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) mode, or \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{K} ^+} \) for the \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) mode. However, the branching fractions for these sources of background are Cabibbo-suppressed with respect to the signal decays. For instance, the ratio of the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{K} ^+} \) and \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) branching fractions has been measured to be \(0.079\pm 0.007 \pm 0.003\) [52]. These background contributions are further suppressed by the particle identification requirements. Due to its estimated size, the Cabibbo-suppressed background is neglected for the search for \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays. The corresponding background sources are, however, included in the resonant signal mode fits.
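As a rough cross-check of the size of this background, the expected yield of misidentified \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{K} ^+} \) decays relative to the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) yield can be estimated from the measured ratio of branching fractions and the approximate particle identification performance quoted in this section (kaon misidentification rate of about \(10\%\), pion efficiency of about \(90\%\)). This is an order-of-magnitude sketch only: it assumes that all other efficiencies are equal for the two modes and ignores the shift in reconstructed mass of the misidentified candidates.

$$\begin{aligned} \frac{N_{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{K} ^+}}}{N_{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}} \approx 0.079 \times \frac{0.10}{0.90} \approx 0.009 \,\text {.} \end{aligned}$$

Under these assumptions the misidentified component is at the percent level of the corresponding pion mode.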

Contributions from hadronic backgrounds such as \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {{\pi } ^-} {{\pi } ^+} \) decays, where two pions are mistakenly identified as muons, are expected to be negligible based on the misidentification rates estimated from simulation and the fact that these decays, which have not yet been observed, are expected to occur via suppressed annihilation processes. For the search for the nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decay, it is in principle possible to receive contributions from the resonant \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) or \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) decays, where the \({\pi } ^+\) is mistakenly identified as a \(\mu ^+\) and vice versa. These sources of background are studied using simulation and data and found to be negligible after applying the selection requirements.

4 Invariant-mass fits

The yields for nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays, resonant \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) decays, and resonant \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) decays are determined from one-dimensional extended unbinned maximum-likelihood fits to the \(m({{\pi } ^+} {\mu ^+\mu ^-})\) distribution of the selected candidates in the respective dimuon invariant-mass intervals (see Table 1). For the resonant decays, the fits are performed independently to the \({B} _{c} ^+\)  candidates fulfilling the \({{\pi } ^+} {\mu ^+\mu ^-} \) WP requirements and for \({B} _{c} ^+\)  candidates fulfilling the \({\psi {(2S)}} {{\pi } ^+} \) WP requirements (see Sect. 3). The fits at the \({{\pi } ^+} {\mu ^+\mu ^-} \) WP are performed to the \({{\pi } ^+} {\mu ^+\mu ^-} \) invariant-mass distribution without additional constraints. The fits at the \({\psi {(2S)}} {{\pi } ^+} \) WP are performed to the \({B} _{c} ^+\) candidate mass distribution after constraining the dimuon mass to the known \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\) or \(\psi {(2S)}\) mass [21], as appropriate.

For the search for the nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decay, fits are performed independently for each of the four \(q^2\) intervals and for the combination of the four intervals. The fit model includes two components: signal decays and combinatorial background. The signal model is validated using simulation and corresponds to the sum of two Gaussian functions, one with power-law tails on both sides of the distribution [53]. The tail parameters are fixed to the values obtained from simulation. The signal model includes a global shift of peak position and a global scaling factor for the width of the distribution, relative to the values found in simulation. The peak offset and width scale factor are obtained from a fit to the resonant \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) decays in data. The combinatorial background is described by an exponential function, whose exponent is allowed to vary in the fit to data.
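The structure of such a two-component extended unbinned fit can be sketched as follows. This is a simplified illustration and not the fit model of the analysis: a single Gaussian stands in for the sum of two Gaussian functions with power-law tails, the background yield and exponent are held fixed, and the minimisation is replaced by a coarse scan of the signal yield. The toy data, the peak position and the width are hypothetical.

```python
import math

LO, HI = 6150.0, 6700.0  # fit range in MeV quoted in Sect. 3


def gauss_pdf(m, mean, sigma):
    """Gaussian in m, normalised to unit integral on [LO, HI]."""
    root2 = math.sqrt(2.0)
    frac = 0.5 * (math.erf((HI - mean) / (sigma * root2))
                  - math.erf((LO - mean) / (sigma * root2)))
    z = (m - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi) * frac)


def expo_pdf(m, slope):
    """exp(slope * m), normalised to unit integral on [LO, HI]."""
    if abs(slope) < 1e-12:
        return 1.0 / (HI - LO)
    norm = (math.exp(slope * (HI - LO)) - 1.0) / slope
    return math.exp(slope * (m - LO)) / norm


def extended_nll(masses, n_sig, n_bkg, mean, sigma, slope):
    """-ln L = (N_S + N_B) - sum_i ln[N_S f_S(m_i) + N_B f_B(m_i)], constants dropped."""
    log_sum = 0.0
    for m in masses:
        density = n_sig * gauss_pdf(m, mean, sigma) + n_bkg * expo_pdf(m, slope)
        if density <= 0.0:
            return math.inf  # total density must stay positive at every candidate
        log_sum += math.log(density)
    return (n_sig + n_bkg) - log_sum


# Deterministic toy data: 100 evenly spaced "background" masses plus a cluster
# of 14 "signal" masses around a hypothetical peak position.
PEAK, WIDTH = 6275.0, 20.0  # hypothetical shape parameters in MeV
background = [LO + 5.5 * (i + 0.5) for i in range(100)]
signal = [PEAK + d for d in (-30, -20, -15, -10, -5, -5, 0, 0, 5, 5, 10, 15, 20, 30)]
toy = background + signal


def nll_at(n_sig):
    # Background yield and slope are held fixed here; a real fit would float them.
    return extended_nll(toy, n_sig, 100.0, PEAK, WIDTH, 0.0)


best_n_sig = min(range(0, 41), key=nll_at)
print("signal yield preferred by the scan:", best_n_sig)
```

Because the likelihood only requires the total density to be positive at each candidate, the signal yield itself is free to take negative values.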

Figure 1 shows the \({B} _{c} ^+\) candidate invariant-mass distributions of selected \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) candidates, with results of the fits superimposed. For the intermediate \(q^2\) interval, the fit favours a negative signal yield due to a lack of candidates in the region close to the \({B} _{c} ^+\)  mass peak position and a feature of extended maximum-likelihood fits that prefers negative yields in such situations [54]. Table 2 summarises the yields obtained from the fits.
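The preference for a negative yield can be understood from the form of the extended likelihood. As a sketch, let \(f_\textrm{S}\) and \(f_\textrm{B}\) denote the normalised signal and background shapes and \(N_\textrm{S}\) and \(N_\textrm{B}\) the corresponding yields (notation introduced here for illustration, with the shape parameters held fixed and constant terms dropped):

$$\begin{aligned} -\ln \mathcal{L} &= N_\textrm{S} + N_\textrm{B} - \sum _i \ln \left[ N_\textrm{S}\, f_\textrm{S}(m_i) + N_\textrm{B}\, f_\textrm{B}(m_i)\right] \,\text {,} \\ \left. \frac{\partial (-\ln \mathcal{L})}{\partial N_\textrm{S}}\right| _{N_\textrm{S}=0} &= 1 - \sum _i \frac{f_\textrm{S}(m_i)}{N_\textrm{B}\, f_\textrm{B}(m_i)} \,\text {,} \end{aligned}$$

where the sum runs over the candidates in the fit range. If few candidates lie under the signal peak, the sum is smaller than one, the derivative at \(N_\textrm{S}=0\) is positive, and the minimum of \(-\ln \mathcal{L}\) is reached at \(N_\textrm{S}<0\).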

Fig. 1 Reconstructed \({{\pi } ^+} {\mu ^+\mu ^-} \) invariant-mass distributions for the selected \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) candidates for each \(q^2\) interval and for all intervals combined, with results of the fit described in the text overlaid

For the resonant \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) decay, the fit model includes four components: signal \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) decays, misidentified \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{K} ^+} \) decays, partially reconstructed background from \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\rho } ^+} \) decays and combinatorial background. The signal, misidentified and partially reconstructed backgrounds are each parameterised by the sum of two Gaussian functions, one with power-law tails. The tail parameters of each distribution are fixed from simulation. The peak position and width of the distributions are allowed to vary in the fit to the data by a global offset and scale factor that is shared between the components. The combinatorial background model is an exponential function, whose exponent is allowed to vary in the fit to data. In total, the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) fit includes seven free parameters: the yields of the four components, the global peak position shift and width scaling factor, and the exponent of the combinatorial background.

Table 2 Yields for the signal nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decay (\(N_{{{\pi } ^+} {\mu ^+\mu ^-}}\)) and combinatorial background (\(N_\textrm{comb}\)) obtained from the fits to data described in the text, with statistical uncertainties only

Figure 2 shows the distributions of selected \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) candidates for both WPs, with fit projections overlaid. Table 3 summarises the yields obtained from the fits. The ratio of \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \)  yields at the two WPs is found to be consistent with the expectation based on the efficiencies in simulation. The yields of the misidentified \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{K} ^+} \)  component at the two WPs are found to be consistent with the expectations based on the misidentification rates in simulation and the measured ratio of branching fractions \({\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{K} ^+} )/{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} )\) [52].

Fig. 2 Reconstructed \({{\pi } ^+} {\mu ^+\mu ^-} \) invariant-mass distributions for the selected \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) candidates at the (left) \({{\pi } ^+} {\mu ^+\mu ^-} \) and (right) \({\psi {(2S)}} {{\pi } ^+} \) working points, with the results of the fits overlaid. For the right figure, the \({{\pi } ^+} {\mu ^+\mu ^-} \) invariant mass is calculated after constraining the dimuon mass to the known \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\)  mass, thereby improving the signal resolution

Similarly, for the resonant \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) decay, the fit includes four components: signal \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) decays, misidentified background from \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{K} ^+} \) decays, partially reconstructed \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\rho } ^+} \) background and combinatorial background. The analytical functions of the fit models used for the \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) fit are the same as those used for the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) fit, but with different parameters. The signal model, and the models for the misidentified and the partially reconstructed background have tail parameters which are fixed to values obtained from simulation, and they also include a global shift of peak position and a global scaling factor for the widths of the distributions. However, the global peak position shift and width scaling factor are constrained in the fit to data to be consistent with the values obtained from the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) fit. Thus, the \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) fit includes five free parameters: the yields for the four components and the exponent of the combinatorial background. The \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) fit is performed to the \({B} _{c} ^+\) mass distribution after constraining the dimuon mass to the known \(\psi {(2S)}\)  mass. Figure 3 shows the distribution of selected \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) candidates, with fit projections overlaid. Table 4 summarises the yields obtained from the fit.

5 Efficiencies and systematic uncertainties

The efficiency ratios between signal and normalisation modes are obtained from simulation accounting for the geometrical acceptance of the detector as well as effects related to the triggering, reconstruction and selection of the \({B} _{c} ^+\)  candidates. Table 5 lists the efficiency ratios between signal and normalisation modes. For the nonresonant signal mode, the efficiency denominator includes only the generated events in the respective \(q^2\) interval. The uncertainties on the efficiency ratios take into account the simulation sample size, uncertainties on the weights applied to the simulation, the matching between reconstructed and generated particles in the simulation, variations of the software trigger requirements, and the uncertainty on the known \({B} _{c} ^+\) lifetime. All variations are made consistently for the signal and normalisation modes to avoid overestimation of the uncertainty on the efficiency ratio. For the nonresonant signal mode, the impact of the signal decay model assumed in the simulation is additionally considered.

Table 3 Yields obtained from the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) fits to data at the two working points as described in the text, with statistical uncertainties only

Fig. 3 Reconstructed \({{\pi } ^+} {\mu ^+\mu ^-} \) invariant-mass distribution, calculated after constraining the dimuon mass to the known \(\psi {(2S)}\)  mass, for the selected \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) candidates, with results of the fit described in the text overlaid

Table 4 Yields obtained from the \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) fit to data described in the text, with statistical uncertainties only

Table 5 Efficiency ratios between signal and normalisation modes. The values of \(\varepsilon _{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}/\varepsilon _{{{\pi } ^+} {\mu ^+\mu ^-}}\) are provided for each \(q^2\) interval and for all intervals combined. The uncertainty on the ratio combines statistical and systematic effects

The systematic uncertainties associated with the weights are evaluated by varying all weights within their uncertainties and by varying the binning scheme used to estimate them. The systematic uncertainty associated with the multivariate weighting algorithm (see Sect. 2) is evaluated by comparing the results obtained with the default and with an alternative algorithm. The default algorithm is trained to correct for discrepancies between data and simulation associated with the event track multiplicity and with the \(p_{\textrm{T}}\) and the vertex quality of the \({B} _{c} ^+\)  candidates. The alternative algorithm is trained using the IPs of the two muons as additional inputs.

The systematic uncertainty associated with the matching between reconstructed and generated particles in the simulation is evaluated by comparing the efficiencies obtained including or excluding \({B} _{c} ^+\)  candidates for which one or more decay products are not correctly matched. The systematic uncertainty associated with variations of the software trigger requirements that are not mirrored by the simulation is evaluated by comparing the efficiencies obtained by applying the tightest thresholds and by applying average thresholds within each data-taking period. The systematic uncertainty associated with the \({B} _{c} ^+\) lifetime is evaluated by varying the \({B} _{c} ^+\) lifetime in simulation within its uncertainties [21].

The nonresonant signal decays are simulated with a phase-space distribution of the final-state particles. However, the true \(q^2\) distribution of \({{B} _{c} ^+} \rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays is unknown, and there is no clear theoretical guidance for its expectation. A systematic uncertainty is assigned due to the change of efficiency over \(q^2\), shown in Fig. 4. The uncertainty is taken to be the root-mean-square variation of the distribution of the efficiency values in narrow \(q^2\) bins within each interval. The approach of determining the differential branching fraction within \(q^2\) bins helps to reduce this uncertainty.
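
The root-mean-square estimate described above can be sketched as follows. This is a minimal illustration only: the function name and the toy efficiency values are hypothetical, and the choice of an unweighted spread about the mean is an assumption, since the text does not state whether the narrow bins are weighted.

```python
import numpy as np

def rms_efficiency_spread(eff_values):
    """Unweighted RMS deviation of efficiency values about their mean.

    Assumption: each narrow q^2 bin enters with equal weight.
    """
    values = np.asarray(eff_values, dtype=float)
    return float(np.sqrt(np.mean((values - values.mean()) ** 2)))

# Hypothetical efficiency ratios in five narrow q^2 bins of one interval.
toy_values = [1.00, 1.10, 1.20, 1.30, 1.40]

spread = rms_efficiency_spread(toy_values)        # absolute uncertainty
relative_spread = spread / np.mean(toy_values)    # relative uncertainty
print(f"RMS spread = {spread:.4f} ({100 * relative_spread:.1f}% of the mean)")
```

A narrower \(q^2\) interval spans a smaller part of the efficiency curve and therefore gives a smaller spread, which is the reason the differential approach reduces this uncertainty.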

A similar uncertainty arises due to the different possible polarisations in the dimuon system, and is illustrated by the different coloured distributions in Fig. 4. The two extreme possibilities correspond to the dimuon system forming a scalar state, which is unpolarised, and a vector state that is longitudinally polarised. The two cases are characterised by different \(\cos {\theta _l}\) distributions, where the helicity angle \(\theta _l\) is defined as the angle between the \(\mu ^+\)  momentum and the opposite of the \({B} _{c} ^+\)  momentum in the dimuon frame: the scalar dimuon state corresponds to a flat \(\cos {\theta _l}\) distribution, while the vector dimuon state corresponds to a \(\frac{3}{4}\sin ^2 {\theta _l}\) distribution. The difference in efficiency between the two extreme cases is taken as the associated systematic uncertainty.
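
The comparison of the two polarisation hypotheses can be illustrated by averaging an efficiency over the two \(\cos {\theta _l}\) distributions. In the sketch below, the efficiency shape `toy_efficiency` is purely hypothetical and all function names are illustrative; only the two angular densities, flat and \(\frac{3}{4}\sin ^2 {\theta _l}\), are taken from the text.

```python
import numpy as np

def average_efficiency(eff, pdf, n_steps=2000):
    """Average eff(cos theta) over a normalised density, midpoint rule on [-1, 1]."""
    edges = np.linspace(-1.0, 1.0, n_steps + 1)
    c = 0.5 * (edges[:-1] + edges[1:])
    width = 2.0 / n_steps
    return float(np.sum(eff(c) * pdf(c)) * width)

def scalar_pdf(c):
    # Unpolarised (scalar) dimuon state: flat in cos(theta_l).
    return np.full_like(c, 0.5)

def vector_pdf(c):
    # Longitudinally polarised vector state: (3/4) sin^2(theta_l).
    return 0.75 * (1.0 - c ** 2)

def toy_efficiency(c):
    # Hypothetical acceptance shape, chosen only for illustration.
    return 1.0 - 0.5 * c ** 2

eff_scalar = average_efficiency(toy_efficiency, scalar_pdf)
eff_vector = average_efficiency(toy_efficiency, vector_pdf)
polarisation_syst = abs(eff_vector - eff_scalar)
print(eff_scalar, eff_vector, polarisation_syst)
```

Both densities integrate to one over \(-1 \le \cos {\theta _l} \le 1\), so the two averages are directly comparable.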

For the resonant modes, the effect of the multivariate weighting algorithm dominates the systematic uncertainty on the efficiency ratio, while for the nonresonant signal mode, the systematic uncertainty associated with efficiency variation within the \(q^2\) intervals dominates the uncertainty on the efficiency ratio. The remaining systematic uncertainties cancel out almost fully in the determination of the efficiency ratios. For all measured quantities, the systematic uncertainties are smaller than the statistical uncertainties.

Fig. 4 Efficiency ratio between the \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) and nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decay modes as a function of \(q^2\), evaluated for the two extreme possibilities of the dimuon system forming a scalar state, which is unpolarised, and a vector state that is longitudinally polarised. The shaded \(q^2\) intervals, which contain the contributions from the \({J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }\) and \(\psi {(2S)}\) resonances, are not used in the analysis of nonresonant decays

The yields obtained from the fits described in the previous section can be affected by the fit model choice and by the assumption of the polarisation of the partially reconstructed backgrounds. To study the effect of the fit model choice, each fit is performed in two configurations: using the baseline fit model and using an alternative fit model. In the alternative fit model, the analytical function used for the signal and the misidentified background models is replaced with the sum of two Gaussian functions, one with a power-law tail on the left side of the distribution, and the other with a power-law tail on the right. The analytical function for the partially reconstructed backgrounds is also replaced, but with the sum of two Gaussian functions. In all cases the difference between the results obtained with the baseline and alternative models is found to be negligible.

For the fits to nonresonant signal candidates, the possible contribution from partially reconstructed \({{B} _{c} ^+} \!\rightarrow {{\rho } ^+} {\mu ^+\mu ^-} \) decays is found to be negligible by performing fits to sideband data of background-only models including a \({{B} _{c} ^+} \!\rightarrow {{\rho } ^+} {\mu ^+\mu ^-} \) background component. The sideband data includes \({B} _{c} ^+\)  candidates in the fit range excluding those in the signal region (see Sect. 3). For the resonant modes, the \({\rho } ^+\)  meson is assumed to be unpolarised. However, the polarisation of the \({\rho } ^+\)  meson can affect the momentum of the missing pion and hence the \({B} _{c} ^+\) candidate mass shape of the partially reconstructed backgrounds. To study the effect of the \({\rho } ^+\)  polarisation, the fits are repeated assuming either full longitudinal or full transverse \({\rho } ^+\)  polarisation. The difference in the results for the two configurations is found to be negligible.

6 Results

For the nonresonant signal mode, no signal is observed over the background-only hypothesis and an upper limit is set on \(R_{{{\pi } ^+} {\mu ^+\mu ^-}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}\) in each \(q^2\) bin. The ratio \(R_{{{\pi } ^+} {\mu ^+\mu ^-}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}\) is obtained by parametrising the signal yield \(N_{{{\pi } ^+} {\mu ^+\mu ^-}}\) in the fits described in Sect. 4 using Eq. (1). The branching fraction \({\mathcal {B}} \left( {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {\mu ^+\mu ^-} \right) \) [21], the efficiency ratio \(\varepsilon _{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}/\varepsilon _{{{\pi } ^+} {\mu ^+\mu ^-}}\), and the normalisation yield \(N_{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}\) are allowed to vary within Gaussian constraints to account for the uncertainties on these inputs.
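
The parametrisation of the signal yield and the Gaussian constraints can be sketched as below. Equation (1) is not reproduced in this section, so the relation used here is an assumption inferred from the definition of \(R\) as a ratio of branching fractions; the function names and all numerical inputs are hypothetical.

```python
def signal_yield(r, n_norm, bf_jpsi_mumu, eff_ratio):
    """Signal yield implied by a given branching fraction ratio r.

    Assumed form of Eq. (1):
        r = (N_sig / N_norm) * eff_ratio * BF(J/psi -> mu mu),
    with eff_ratio = eff(J/psi pi) / eff(pi mu mu).
    """
    return r * n_norm / (eff_ratio * bf_jpsi_mumu)

def gaussian_penalty(value, central, sigma):
    """Term added to the negative log-likelihood for one constrained input."""
    return 0.5 * ((value - central) / sigma) ** 2

# Hypothetical inputs, for illustration only.
n_sig = signal_yield(r=1e-4, n_norm=10000.0, bf_jpsi_mumu=0.06, eff_ratio=1.25)

# Constraint terms for a trial point of the nuisance parameters
# (values and uncertainties are placeholders, not the analysis inputs).
penalty = (
    gaussian_penalty(1.30, central=1.25, sigma=0.05)          # efficiency ratio
    + gaussian_penalty(10050.0, central=10000.0, sigma=100.0)  # normalisation yield
)
print(n_sig, penalty)
```

In a fit, the penalty would be added to the negative log-likelihood of the mass distribution, so that the inputs can move away from their central values only at a cost set by their uncertainties.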

Fig. 5 Confidence belts generated using pseudoexperiments according to the Feldman–Cousins prescription [55] for each \(q^2\) interval and for all intervals combined. The vertical dashed line shows the central value from the fit to data

Upper limits on the branching fraction ratios are obtained following the Feldman–Cousins prescription [55]: pseudoexperiments are generated for various values of \(R_{{{\pi } ^+} {\mu ^+\mu ^-}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}\) and the resulting distribution of measured \(R_{{{\pi } ^+} {\mu ^+\mu ^-}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}\) is used to form confidence belts. Figure 5 shows confidence belts at \(90\%\) and \(95\%\) confidence level (CL). Table 6 gives the results for \(R_{{{\pi } ^+} {\mu ^+\mu ^-}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}\) and the obtained limits. To assess the impact of the systematic uncertainties, the fits are repeated fixing the nuisance parameters to their central values. Figure 6 summarises the obtained limits on the normalised differential branching fraction. As further checks, the procedure is repeated restricting the signal yield to positive values, or performing nonextended maximum-likelihood fits. No significant changes in the obtained upper limits are found.
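
The construction of a confidence belt from pseudoexperiments can be illustrated with a toy model. The sketch below is not the analysis procedure: it assumes a single Gaussian-distributed measurement `x` of a parameter `mu` bounded at zero, with unit resolution, in place of the full fits, and all names and settings are hypothetical. It shows only the likelihood-ratio ordering and the inversion of the belt.

```python
import numpy as np

def ordering_ratio(x, mu, sigma=1.0):
    """Likelihood ratio L(x | mu) / L(x | mu_best), with mu_best = max(x, 0)."""
    mu_best = np.maximum(x, 0.0)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2
                  + 0.5 * ((x - mu_best) / sigma) ** 2)

def confidence_belt(mu_grid, cl=0.90, n_toys=20000, sigma=1.0, seed=1):
    """For each true mu, find the range of x covering a fraction cl of toys."""
    rng = np.random.default_rng(seed)
    lo, hi = [], []
    n_keep = int(np.ceil(cl * n_toys))
    for mu in mu_grid:
        x = rng.normal(mu, sigma, n_toys)           # pseudoexperiments
        ratio = ordering_ratio(x, mu, sigma)
        keep = np.argsort(ratio)[::-1][:n_keep]      # highest ratio first
        lo.append(x[keep].min())
        hi.append(x[keep].max())
    return np.array(lo), np.array(hi)

def upper_limit(x_obs, mu_grid, lo, hi):
    """Largest mu on the grid whose acceptance region contains x_obs."""
    inside = (lo <= x_obs) & (x_obs <= hi)
    return float(mu_grid[inside].max())

mu_grid = np.arange(0.0, 4.0001, 0.05)
belt_lo, belt_hi = confidence_belt(mu_grid)
ul_90 = upper_limit(0.0, mu_grid, belt_lo, belt_hi)
print(f"90% CL upper limit for x_obs = 0: {ul_90:.2f}")
```

For this toy model the acceptance region at large `mu` approaches the central interval, so an observation at zero gives a limit near 1.64 in units of the resolution, up to the grid spacing and the statistical precision of the toys.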

Table 6 Results for \(R_{{{\pi } ^+} {\mu ^+\mu ^-}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}\), where the first uncertainties are statistical and the second are systematic, together with upper limits (ULs) at \(90\%\) and \(95\%\) CL

For the resonant modes, the yields obtained from the fits and the efficiency ratio are used to calculate the fraction

$$\begin{aligned} F_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}} = \frac{N_{{\psi {(2S)}} {{\pi } ^+}}}{N_{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}} \cdot \frac{\varepsilon _{{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+}}}{\varepsilon _{{\psi {(2S)}} {{\pi } ^+}}}\,\text {,} \end{aligned}$$

which is translated into the ratio of branching fractions

$$\begin{aligned} R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\equiv & {} \frac{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} )}{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} )} \\= & {} F_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}} \cdot \frac{{\mathcal {B}} ({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {\mu ^+\mu ^-} )}{{\mathcal {B}} ({\psi {(2S)}} \!\rightarrow {\mu ^+\mu ^-} )} \,\text {.} \end{aligned}$$

The branching fractions of the \({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {e ^+} {e ^-} \) and \({\psi {(2S)}} \!\rightarrow {e ^+} {e ^-} \) decays are used in place of \({\mathcal {B}} ({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \!\rightarrow {\mu ^+\mu ^-} )\) and \({\mathcal {B}} ({\psi {(2S)}} \!\rightarrow {\mu ^+\mu ^-} )\) because the branching fraction of electronic \(\psi {(2S)}\) decays is more precisely measured than its muonic counterpart and the decay rates into \({e ^+} {e ^-} \) and \(\mu ^+\mu ^-\) states are expected to be identical, up to negligible mass-dependent corrections, due to the universality of the electroweak couplings to charged leptons. The results are

$$\begin{aligned} F_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}= & {} 0.0337\pm 0.0024\,\text {(stat)} \pm 0.0004\,\text {(syst)} \text {,} \\ R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}= & {} 0.254\pm 0.018 \,\text {(stat)} \pm 0.003 \,\text {(syst)} \\{} & {} \pm 0.005\,(\textrm{BF}), \end{aligned}$$

where for \(R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\) the third uncertainty is due to the limited precision of the known branching fractions for \({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \) and \({\psi {(2S)}} \) leptonic decays [21]. The results are consistent with, but more precise than, those obtained in previous analyses [19, 20]. As a cross-check, the values of \(F_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\) and \(R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\) are also determined at the \({{\pi } ^+} {\mu ^+\mu ^-} \) WP, and consistent results are obtained.
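
The relation between the quoted values of \(F_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\) and \(R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\) can be checked numerically. In the sketch below, the ratio of leptonic branching fractions is inferred from the two central values quoted above rather than taken from Ref. [21], which is an assumption made for illustration; the variable names are hypothetical.

```python
# Central values and uncertainties quoted in the text.
F, F_STAT, F_SYST = 0.0337, 0.0024, 0.0004
R = 0.254

# Implied ratio BF(J/psi -> l+ l-) / BF(psi(2S) -> l+ l-), inferred from R / F.
bf_ratio = R / F

# The statistical and systematic uncertainties on F scale by the same factor.
r_stat = F_STAT * bf_ratio
r_syst = F_SYST * bf_ratio

print(f"implied BF ratio = {bf_ratio:.2f}")
print(f"R = {R:.3f} +- {r_stat:.3f} (stat) +- {r_syst:.3f} (syst)")
```

The scaled uncertainties round to the quoted statistical and systematic uncertainties on \(R_{{\psi {(2S)}}/{{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }}}\), while the third uncertainty arises separately from the branching fraction inputs.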

7 Summary

A search for nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays is performed together with an updated measurement of the ratio of the \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) and \({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} \) branching fractions. The analysis uses proton–proton collision data collected with the LHCb detector between 2011 and 2018, corresponding to an integrated luminosity of 9\(\,\text {fb} ^{-1}\). No evidence for an excess of signal events over background is observed for nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays and an upper limit is set on the branching fraction ratio

$$\begin{aligned} \frac{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} )}{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} )} < 2.1\times 10^{-4}\text {,} \end{aligned}$$

at \(90\%\) confidence level. This is the first limit on \({B} _{c} ^+\)  decays mediated only by annihilation diagrams into a semileptonic final state. For the resonant \({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} \) mode, the branching fraction ratio is measured to be

$$\begin{aligned} \frac{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {\psi {(2S)}} {{\pi } ^+} )}{{\mathcal {B}} ({{B} _{c} ^+} \!\rightarrow {{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} {{\pi } ^+} )}= & {} 0.254\pm 0.018\,\text {(stat)} \pm 0.003\,\text {(syst)} \\{} & {} \pm 0.005\,(\textrm{BF})\text {,} \end{aligned}$$

where the third uncertainty is due to limited precision of the known branching fractions for \({{J \hspace{-1.66656pt}/\hspace{-1.111pt}\psi }} \) and \({\psi {(2S)}} \) leptonic decays [21]. This measurement is consistent with, and supersedes, previous LHCb results on the same quantity [19, 20] and is the most precise to date.

Fig. 6 Upper limits on the normalised differential branching fraction for nonresonant \({{B} _{c} ^+} \!\rightarrow {{\pi } ^+} {\mu ^+\mu ^-} \) decays as a function of \(q^2\). The solid lines show the results for each \(q^2\) bin, while the dashed lines show the results for all bins combined