1 Introduction

To test the Cabibbo–Kobayashi–Maskawa (CKM) sector of the Standard Model (SM), it is crucial to perform accurate measurements of the quark-mixing matrix elements. Any discrepancy among the numerous measurements of CKM matrix elements could reveal effects from new particles or forces beyond the SM. The knowledge of the magnitude of the matrix element \({V_{{u} {b}}} \) governing the strength of \({b} {\rightarrow }{u} \) transitions is key in the consistency checks of the SM and its naturally motivated extensions [1, 2].

The hadronic \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay proceeds in the SM through the \({b} {\rightarrow }{u} \) transition as shown in Fig. 1. Its branching fraction is proportional to \(|{V_{{u} {b}}} |^{2}\),

$$\begin{aligned} {\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} ) = \Phi |{V_{{u} {b}}} |^{2} |V_{cs}|^{2}|F({{B} ^0} {\rightarrow }{{\pi } ^-})|^{2} f_{{{D} ^+_{s}}}^{2} |a_{\text {NF}}|^{2}, \end{aligned}$$

where \(\Phi \) is a phase-space factor, \(F({{B} ^0} {\rightarrow } {{\pi } ^-})\) is a form factor, \(f_{{{D} ^+_{s}}}\) is the \({D} ^+_{s} \) decay constant, \(V_{cs}\) is the CKM matrix element representing \({c} {\rightarrow } {s} \) transitions, and \(|a_{\text {NF}}|\) encapsulates non-factorisable effects. The form factor and the decay constant can be obtained from light-cone sum rules [3, 4] and lattice QCD calculations [5, 6], and since \(|V_{cs}|\) is known to be close to unity, the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction can be used to probe the product \(|{V_{{u} {b}}} ||a_{\text {NF}}|\). The assumption of factorisation is expected to hold, i.e. \(|a_{\text {NF}}|\) is close to unity, for B meson decays into a heavy and a light meson, where the W emission of the decay corresponds to the light meson and the spectator quark forms part of the heavy meson. This is not the case for the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay, as shown in Fig. 1, and consequently \(|a_{\text {NF}}|\) may differ significantly from unity [7].

Fig. 1

Tree diagram of the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay, in which a \({B} ^0\) meson decays through the weak interaction to a \({D} ^+_{s} \) meson and a charged pion. This diagram represents the only (leading order) process contributing to this decay. Strong interactions between the \({D} ^+_{s} \) meson and the pion lead to a non-factorisable contribution to the decay amplitude

The measurement of the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction can also be used to estimate the ratio of the amplitudes of the Cabibbo-suppressed \({{B} ^0} {\rightarrow }{{D} ^+} {{\pi } ^-} \) and the Cabibbo-favoured \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) decays,

$$\begin{aligned} r_{D\pi } = \left| \frac{A({{B} ^0} {\rightarrow }{{D} ^+} {{\pi } ^-} )}{A({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} )}\right| , \end{aligned}$$

which is necessary for the measurement of charge-parity (\(C\!P\)) asymmetries in \({{B} ^0} {\rightarrow }{{D} ^\mp } {{\pi } ^\pm } \) decays [8,9,10,11,12,13]. Assuming \(\mathrm {SU}(3)\) flavour symmetry, Eq. (2) can be written as [14, 15]

$$\begin{aligned} r_{D\pi } = \tan {\theta _{c}}\frac{f_{{{D} ^+}}}{f_{{{D} ^+_{s}}}}\sqrt{\frac{{\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )}{{\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} )}}, \end{aligned}$$

where \(\theta _{c}\) is the Cabibbo angle and \(f_{{{D} ^+}}\) is the decay constant of the \({{D} ^+} \) meson. \(\mathrm {SU}(3)\) symmetry breaking is caused by different non-factorisable effects in \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^+} {{\pi } ^-} \) decays.
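As a minimal numerical sketch of Eq. (3), the Python function below evaluates \(r_{D\pi }\) from its inputs. The function name, argument names and the input values in the usage line are illustrative placeholders chosen for easy arithmetic; they are not values used or obtained in this measurement.

```python
import math


def r_dpi(tan_theta_c, f_d, f_ds, bf_dspi, bf_dpi):
    """Evaluate Eq. (3): tan(theta_c) * (f_D / f_Ds) * sqrt(B(Ds pi) / B(D pi))."""
    if f_ds <= 0 or bf_dpi <= 0 or bf_dspi < 0:
        raise ValueError("decay constant and branching fractions must be positive")
    return tan_theta_c * (f_d / f_ds) * math.sqrt(bf_dspi / bf_dpi)


# Placeholder inputs (hypothetical, not physical results).
example = r_dpi(tan_theta_c=0.25, f_d=200.0, f_ds=250.0,
                bf_dspi=1.0e-5, bf_dpi=2.5e-3)
```

Because the branching fractions enter under a square root, a given relative uncertainty on their ratio propagates to \(r_{D\pi }\) with half its size.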

This article presents measurements of \({\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )\) and \(r_{D\pi }\) using proton–proton (pp) collision data collected with the LHCb detector at centre-of-mass energies of 7, 8 and 13 \(\,\text {TeV}\) corresponding to an integrated luminosity of \(5 \,\text {fb} ^{-1} \). The data samples recorded in the years 2011 and 2012 (2015 and 2016) at 7 and 8 (13) \(\,\text {TeV}\) will be referred to as Run 1 (Run 2). The \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching ratio is measured relative to the \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) normalisation channel, which is well measured and experimentally similar to the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay. The \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) (\({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \)) candidates are reconstructed via the \({{D} ^+_{s}} {\rightarrow }{{K} ^+} {{K} ^-} {{\pi } ^+} \) (\({{D} ^-} {\rightarrow }{{K} ^+} {{\pi } ^-} {{\pi } ^-} \)) decay. The branching fraction of the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay is determined by

$$\begin{aligned} {\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )= & {} {\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} ) \dfrac{N_{{{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} }}{N_{{{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} }}\dfrac{\epsilon _{{{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} }}{\epsilon _{{{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} }}\nonumber \\&\times \dfrac{{\mathcal {B}} ({{D} ^-} {\rightarrow }{{K} ^+} {{\pi } ^-} {{\pi } ^-} )}{{\mathcal {B}} ({{D} ^+_{s}} {\rightarrow }{{K} ^+} {{K} ^-} {{\pi } ^+} )}, \end{aligned}$$

where \(N_\mathrm{{X}}\) denotes the selected candidate yield and \(\epsilon _\mathrm{{X}}\) the related efficiency for the decay mode X. In this measurement, extended maximum-likelihood fits to unbinned invariant mass distributions are performed in order to obtain the yields, while the efficiencies are obtained from simulated events and using calibration data samples.
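The arithmetic of Eq. (4) can be sketched in a few lines of Python. This is only an illustration of how the yields, efficiencies and external branching fractions combine; the function and argument names are hypothetical and the numbers in the usage line are placeholders, not inputs of this analysis.

```python
def branching_fraction(bf_norm, n_sig, n_norm, eff_sig, eff_norm, bf_d, bf_ds):
    """Evaluate Eq. (4).

    bf_norm  : B(B0 -> D- pi+), the normalisation branching fraction
    n_sig    : signal yield, n_norm: normalisation yield
    eff_sig  : signal efficiency, eff_norm: normalisation efficiency
    bf_d     : B(D- -> K+ pi- pi-), bf_ds: B(Ds+ -> K+ K- pi+)
    """
    if min(n_norm, eff_sig, bf_ds) <= 0:
        raise ValueError("denominators must be positive")
    return bf_norm * (n_sig / n_norm) * (eff_norm / eff_sig) * (bf_d / bf_ds)


# Placeholder inputs (hypothetical): yield ratio 0.01, efficiency ratio 2,
# charm branching-fraction ratio 2.
example = branching_fraction(bf_norm=2.0e-3, n_sig=100.0, n_norm=10000.0,
                             eff_sig=0.01, eff_norm=0.02,
                             bf_d=0.09, bf_ds=0.045)
```

Only ratios of yields and efficiencies enter, which is why many experimental effects common to the two channels cancel.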

The relative production of \({B} ^0_{s} \) and \({B} ^0\) mesons, described by the ratio \(f_s/f_d\) where \(f_s\) and \(f_d\) are the \({{B} ^0_{s}} \) and \({{B} ^0} \) hadronisation fractions, has been shown to depend slightly on the pp collision energy [16]. The efficiency-corrected yield ratio \(\mathcal R\),

$$\begin{aligned} {\mathcal {R}} \equiv \dfrac{N_{{{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} }}{N_{{{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} }}\dfrac{\epsilon _{{{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} }}{\epsilon _{{{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} }} \propto \dfrac{f_s}{f_d}, \end{aligned}$$

is proportional to the relative production ratio and its dependence on the centre-of-mass energy is also reported here. The ratio is measured using \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) decays. Accurate knowledge of \(f_{s}/f_{d}\) is a crucial input for every \({{B} ^0_{s}} \) branching fraction measurement, e.g. \({\mathcal {B}} ({{B} ^0_{s}} {\rightarrow } \mu ^{+} \mu ^{-})\), since in most cases it dominates the systematic uncertainty [17]. Following the method described in Ref. [18], the value of \(f_{s}/f_{d}\) can be calculated as

$$\begin{aligned} \dfrac{f_s}{f_d} = 0.982 \dfrac{\tau _{B_{d}}}{\tau _{B_s}} \dfrac{{\mathcal {R}}}{{\mathcal {N}}_{a}{\mathcal {N}}_{F}{\mathcal {N}}_{E}} \dfrac{{\mathcal {B}}({{D} ^-} {\rightarrow } {{K} ^+} {{\pi } ^-} {{\pi } ^-})}{{\mathcal {B}}({{D} ^+_{s}} {\rightarrow } {{K} ^+} {{K} ^-} {{\pi } ^+})}, \end{aligned}$$

where \({\mathcal {R}}\) is defined in Eq. (5), the numerical factor takes phase-space effects into account, \({\mathcal {N}}_{a}\) describes non-factorisable SU(3) breaking effects, \({\mathcal {N}}_{F}\) is the ratio of the form factors, \({\mathcal {N}}_{E}\) takes into account the contribution of the W-exchange diagram in the \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) decay, and \(\tau _{B_d}\ (\tau _{B_s})\) is the \({B} ^0\) (\({B} ^0_{s} \)) lifetime.
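The chain from the efficiency-corrected yield ratio of Eq. (5) to \(f_s/f_d\) in Eq. (6) can be sketched as follows. The function and argument names are hypothetical, and the numerical inputs in the usage lines are placeholders chosen for easy arithmetic rather than measured or theoretical values.

```python
def efficiency_corrected_ratio(n_bs, n_b0, eff_bs, eff_b0):
    """Eq. (5): R = (N_Bs / N_B0) * (eff_B0 / eff_Bs)."""
    return (n_bs / n_b0) * (eff_b0 / eff_bs)


def fs_over_fd(r, tau_bd, tau_bs, n_a, n_f, n_e, bf_d, bf_ds, phase_space=0.982):
    """Eq. (6): convert R into f_s/f_d.

    phase_space is the numerical factor quoted in Eq. (6); n_a, n_f and n_e
    are the theory inputs N_a, N_F and N_E defined in the text.
    """
    return phase_space * (tau_bd / tau_bs) * r / (n_a * n_f * n_e) * (bf_d / bf_ds)


# Placeholder inputs (hypothetical, not results of this analysis).
r_example = efficiency_corrected_ratio(n_bs=1000.0, n_b0=10000.0,
                                       eff_bs=0.02, eff_b0=0.02)
fsfd_example = fs_over_fd(r_example, tau_bd=1.5, tau_bs=1.5,
                          n_a=1.0, n_f=1.0, n_e=1.0, bf_d=0.09, bf_ds=0.045)
```

Since \(f_s/f_d\) is linear in \({\mathcal {R}}\), the ratio of \({\mathcal {R}}\) values at two collision energies equals the ratio of the corresponding \(f_s/f_d\) values, provided the remaining factors do not depend on energy.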

2 Detector and simulation

The LHCb detector [19, 20] is a single-arm forward spectrometer covering the pseudorapidity range \(2<\eta <5\), designed for the study of particles containing \(b \) or \(c \) quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region [21], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about \(4{\mathrm {\,Tm}}\), and three stations of silicon-strip detectors and straw drift tubes [22, 23] placed downstream of the magnet. The tracking system provides a measurement of the momentum, \(p\), of charged particles with a relative uncertainty that varies from about 0.5% below 20\(\,\text {Ge}\!{\,\text {V}}\!/c\) to 1.0% at 200\(\,\text {Ge}\!{\,\text {V}}\!/c\). The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of \((15+29/p_{\mathrm {T}})\,\upmu \text {m} \), where \(p_{\mathrm {T}}\) is the component of the momentum transverse to the beam, in \(\,\text {Ge}\!{\,\text {V}}\!/c\). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov (RICH) detectors [24]. Hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [25].
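For orientation, the quoted impact-parameter resolution can be evaluated with a short Python helper. The function name is hypothetical; the formula is the one given above, with \(p_{\mathrm {T}}\) in \(\,\text {Ge}\!{\,\text {V}}\!/c\) and the result in micrometres.

```python
def ip_resolution_um(pt_gev):
    """Impact-parameter resolution (15 + 29/pT) in micrometres, pT in GeV/c."""
    if pt_gev <= 0:
        raise ValueError("pT must be positive")
    return 15.0 + 29.0 / pt_gev


# The resolution improves with increasing transverse momentum.
low_pt = ip_resolution_um(1.0)    # 44.0 micrometres
high_pt = ip_resolution_um(2.0)   # 29.5 micrometres
```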

The online event selection is performed by a trigger [26], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.

Simulation is required to calculate geometrical, reconstruction and selection efficiencies, and to determine shapes of invariant mass distributions. In the simulation, pp collisions are generated using Pythia [27] with a specific LHCb configuration [28]. Decays of unstable particles are described by EvtGen [29], in which final-state radiation is generated using Photos [30]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [31, 32] as described in Ref. [33].

3 Selection

The \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) (\({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \)) decays are reconstructed by forming a \({{D} ^+_{s}} {\rightarrow }{{K} ^+} {{K} ^-} {{\pi } ^+} \) (\({{D} ^-} {\rightarrow }{{K} ^+} {{\pi } ^-} {{\pi } ^-} \)) candidate and combining it with an additional pion of opposite charge, referred to as the companion. The same reconstruction and selection procedure is applied to the \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay. For the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay, the invariant mass of the \({{K} ^+} {{K} ^-} \) pair is required to be within \(20\,\text {Me}\!{\,\text {V}}\!/c^2 \) of the \(\phi (1020)\) mass to select only the \({{D} ^+_{s}} {\rightarrow }\phi (1020){{\pi } ^+} \) decays, which significantly improves the signal-to-background ratio compared to other decays with a \({{K} ^+} {{K} ^-} {{\pi } ^+} \) combination in the final state. Selecting \({{D} ^+_{s}} {\rightarrow }\phi (1020){{\pi } ^+} \) decays has an efficiency of about \(40\%\).

At the hardware trigger stage, events are required to have a muon with high \(p_{\mathrm {T}}\) or a hadron, photon or electron with high transverse energy in the calorimeters. For hadrons, the transverse-energy threshold varied between 3 and 4\(\,\text {Ge}\!{\,\text {V}}\) over the 2011 to 2016 data-taking period. The software trigger requires a two-, three- or four-track secondary vertex with significant displacement from any primary pp interaction vertex (PV). At least one charged particle must have transverse momentum \(p_{\mathrm {T}} > 1.6\,\text {Ge}\!{\,\text {V}}\!/c \) and be inconsistent with originating from a PV. A multivariate algorithm [34] is used for the identification of secondary vertices consistent with the decay of a \(b \) hadron.

After the trigger selection, a preselection is applied to the reconstructed candidates to ensure good vertex quality for the \(b \)-hadron and \(c \)-hadron candidates, which are composed of tracks with large total and transverse momentum. Combinatorial background is suppressed using a gradient boosted decision tree (BDTG) algorithm [35, 36], trained on Run 1 \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) data. A set of 15 variables is used to train the BDTG classifier, the ones with highest importance in the training being the transverse momentum of the companion pion, the radial flight distance of the \({{\overline{B}} {}^0_{s}} \) and of the \({{D} ^+_{s}} \) candidates, the minimum transverse momentum of the \({{D} ^+_{s}} \) decay products and the minimum \(\chi ^2_{\text {IP}}\) of the companion and the \({{\overline{B}} {}^0_{s}} \) candidates, where \(\chi ^2_{\text {IP}}\) is defined as the difference in the vertex-fit \(\chi ^2\) of a given PV reconstructed with and without the particle under consideration. The correlation among the input variables has been studied and found to be small. The BDTG classifier used in this measurement is described in Ref. [37].

To improve the \({B} ^0\) and \({\overline{B}} {}^0_{s} \) invariant mass resolutions, the \({D} ^+_{s} \) and \({D} ^-\) invariant masses are constrained to their known values [38]. All \({{D} ^+_{s}} {{\pi } ^-} \) (\({{D} ^-} {{\pi } ^+} \)) candidates are required to have their invariant masses, \(m({{D} ^+_{s}} {{\pi } ^-})\) (\(m({{D} ^-} {{\pi } ^+})\)), within the range \(5150\text {--}5800\) \((5000\text {--}5800) \,\text {Me}\!{\,\text {V}}\!/c^2 \) and the \({{K} ^+} {{K} ^-} {{\pi } ^+} \) (\({{K} ^+} {{\pi } ^-} {{\pi } ^-} \)) invariant mass within \(1930\text {--}2065\) \((1830\text {--}1920) \,\text {Me}\!{\,\text {V}}\!/c^2 \). The range of the \({{K} ^+} {{K} ^-} {{\pi } ^+} \) invariant mass includes a large upper sideband to model properly the combinatorial background shape, as described in Sect. 4.
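The mass requirements quoted in this section can be summarised as a simple candidate filter. The sketch below is illustrative only: the function names are hypothetical, the treatment of the window boundaries as inclusive is an assumption, and the \(\phi (1020)\) mass value is an assumed world-average number rather than one taken from this article.

```python
PHI_MASS = 1019.461  # MeV/c^2, assumed world-average phi(1020) mass


def passes_signal_mass_windows(m_dspi, m_kkpi, m_kk):
    """Mass requirements for Ds+ pi- candidates (all masses in MeV/c^2)."""
    return (5150.0 <= m_dspi <= 5800.0        # m(Ds+ pi-) fit range
            and 1930.0 <= m_kkpi <= 2065.0    # m(K+ K- pi+) incl. upper sideband
            and abs(m_kk - PHI_MASS) < 20.0)  # phi(1020) requirement


def passes_norm_mass_windows(m_dpi, m_kpipi):
    """Mass requirements for D- pi+ normalisation candidates (MeV/c^2)."""
    return 5000.0 <= m_dpi <= 5800.0 and 1830.0 <= m_kpipi <= 1920.0
```

For example, a candidate with \(m({{K} ^+} {{K} ^-}) = 1045\,\text {Me}\!{\,\text {V}}\!/c^2 \) fails the \(\phi (1020)\) requirement even if both other masses lie inside their windows.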

To reduce the background due to misidentified final-state particles, particle identification (PID) information from the RICH detectors is used. The companion pion is required to pass a strict PID requirement to reduce the number of (\({{B} ^0} {\rightarrow }{{D} ^-} {{K} ^+} \)) decays where the kaon companion is misidentified as a pion. For \({{D} ^+_{s}} {\rightarrow }\phi (1020){{\pi } ^+} \) candidates, loose PID requirements are applied to both kaons and the pion, which imply a signal efficiency of about \(96\%\). In the case of the pion, the PID requirement is used primarily to remove protons originating from the \({{\varLambda } ^+_{c}} {\rightarrow } \phi p\) decay. Further PID requirements are applied to veto \({{\varLambda } ^0_{b}} {\rightarrow }{{\varLambda } ^+_{c}} ({\rightarrow }p{{K} ^-} {{\pi } ^+}) {{\pi } ^-} \) and \({{\overline{B}} {}^0} {\rightarrow }{{D} ^+} ({\rightarrow }{{K} ^-} {{\pi } ^+} {{\pi } ^+}){{\pi } ^-} \) and \({{\overline{\varLambda }} {}^0_{b}} {\rightarrow }{\!}{{{\overline{\varLambda }} {}^-_{c}} ({\rightarrow }{{\overline{{p}}}}{{{K} ^+}}{{{\pi } ^-}})}{{\pi } ^+} \) and \({{B} ^0_{s}} {\rightarrow }{\!} {{{D} ^-_{s}} ({\rightarrow } {{{K} ^-}}{{{K} ^+}}{{{\pi } ^-}}){{\pi } ^+}}\) events, which are misidentified as the final-state particles of \({{D} ^+_{s}} ({\rightarrow }{{K} ^+} {{K} ^-} {{\pi } ^+}){{\pi } ^-} \) and \({{D} ^-} ({\rightarrow }{{K} ^+} {{\pi } ^-} {{\pi } ^-}){{\pi } ^+} \) decays, respectively. These vetoes are applied if candidates are consistent with the above mentioned decays when a mass hypothesis is changed. The PID requirements result in \(75\%\) efficiency for \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal decays, which is dominated by the strict PID requirement on the companion pion, while the retention is about \(9\%\) for the misidentified background contribution.

The event selection efficiencies are calculated from simulation with the exception of the efficiency of the PID requirements which is determined using calibration data samples.

4 Signal and background parametrisation

After the full event selection, unbinned maximum-likelihood fits are performed to obtain the yields of the signal \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and the normalisation \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) candidates. A two-dimensional fit to the \({{D} ^+_{s}} {{\pi } ^-} \) and the \({{K} ^+} {{K} ^-} {{\pi } ^+} \) invariant mass distributions is performed to determine the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal yield, while the yield of the normalisation channel is obtained from a fit to the \({{D} ^-} {{\pi } ^+} \) invariant mass distribution. Due to the \({{D} ^+_{s}} \) mass constraint, the correlation between \(m({{D} ^+_{s}} {{\pi } ^-})\) and \(m({{K} ^+} {{K} ^-} {{\pi } ^+})\) is found to be small, thus the two variables are factorised in the fit model [39]. The two-dimensional fit is performed in order to constrain the combinatorial background (see further in this Section for details).

The \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay is Cabibbo-suppressed and is therefore considerably less abundant than the Cabibbo-favoured \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay, which produces the same final-state particles. The \(m({{D} ^+_{s}} {{\pi } ^-})\) and \(m({{D} ^-} {{\pi } ^+})\) shapes for \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) candidates, respectively, are described by the sum of a double-sided Hypatia function [40] and a Johnson \(S_U\) function [41]. The left tail of the \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) invariant mass distribution overlaps with the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal peak and therefore special attention is given to the description of the lower mass range of the \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) peak, shaped by the combination of detector resolution and radiative effects. The \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal is described with the same model as the \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) decay, shifted by the known \({{B} ^0} \)–\({{B} ^0_{s}} \) mass difference [38]. The left tail of this distribution is described by two parameters, \(a_{1}\) and \(n_{1}\), which are found to be correlated and therefore the parameter \(n_{1}\) is fixed to the value obtained from simulation, whereas \(a_{1}\) is obtained from simulated \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) events, as well as from \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) data. 
In the invariant mass fit to \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) candidates the common mean of the double-sided Hypatia and the Johnson \(S_U\) functions, the widths and the left-tail parameter \(a_{1}\) are left free in the fit, while this parameter is constrained in the \({{D} ^+_{s}} {{\pi } ^-} \) invariant mass distribution, as the background does not allow the shape of the radiative tail to be determined reliably. All other parameters are fixed from simulation. In the \({{K} ^+} {{K} ^-} {{\pi } ^+} \) invariant mass fit a sum of two Crystal Ball functions with a common mean is used. The common mean and a scale factor for the widths are left free, while the other shape parameters are fixed from simulation.
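To make one ingredient of the signal model concrete, the sketch below implements the Johnson \(S_U\) density in its standard textbook parametrisation and checks its normalisation numerically. The parameter names (`xi`, `lam`, `gamma`, `delta`) and the numerical values are assumptions for illustration; the parametrisation used in the fit of this analysis may differ, and the Hypatia component is not reproduced here.

```python
import math


def johnson_su_pdf(x, xi, lam, gamma, delta):
    """Johnson S_U density: location xi, scale lam > 0, shapes gamma and delta > 0."""
    if lam <= 0 or delta <= 0:
        raise ValueError("lam and delta must be positive")
    z = (x - xi) / lam
    norm = delta / (lam * math.sqrt(2.0 * math.pi) * math.sqrt(1.0 + z * z))
    return norm * math.exp(-0.5 * (gamma + delta * math.asinh(z)) ** 2)


def integrate(func, lo, hi, steps=40000):
    """Midpoint-rule integral, used here only to check the normalisation."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


# Placeholder shape parameters (hypothetical, not fit results).
total = integrate(lambda m: johnson_su_pdf(m, 5367.0, 20.0, 0.0, 1.5),
                  1367.0, 9367.0)
```

With `gamma = 0` the density is symmetric about `xi`; a non-zero `gamma` skews it, which is what allows such a function to accommodate asymmetric tails.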

The combinatorial background in \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) candidates is split into two components, referred to as random-\({D} ^+_{s} \) and true-\({D} ^+_{s} \). The random-\({D} ^+_{s} \) combinatorial background consists of random combinations of tracks that do not peak in the \({{K} ^+} {{K} ^-} {{\pi } ^+} \) invariant mass, while the true-\({D} ^+_{s} \) combinatorial background consists of events with a true \({D} ^+_{s} \) meson, combined with a random companion track. The upper mass range of the \({{K} ^+} {{K} ^-} {{\pi } ^+} \) candidate sample is used to account accurately for the random-\({{D} ^+_{s}} \) component, modelled with a single exponential distribution, while the true-\({{D} ^+_{s}} \) background is described by the signal shape. In the \({D} ^+_{s} \) \({\pi } ^-\) invariant mass fit, the random-\({{D} ^+_{s}} \) background is described by an exponential distribution and the true-\({{D} ^+_{s}} \) background is described by the sum of an exponential and a constant function. The exponential parameters are left free in both invariant mass fits.

The combinatorial background in the \(m({{D} ^-} {{\pi } ^+})\) fit of the normalisation channel is described by the sum of an exponential and a constant function, with the relative weight of the two functions and exponential parameter left free.
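A combinatorial background model of this kind, the sum of an exponential and a constant normalised over the fit range, can be sketched as below. The function name, the parameter names `slope` and `frac_exp`, and the default range (taken from the \(m({{D} ^-} {{\pi } ^+})\) window quoted in Sect. 3) are illustrative assumptions; the actual fit implementation is not described in the text.

```python
import math


def exp_plus_const_pdf(m, slope, frac_exp, lo=5000.0, hi=5800.0):
    """Normalised sum of an exponential and a constant on [lo, hi].

    slope    : exponential parameter in (MeV/c^2)^-1, usually negative
    frac_exp : relative weight of the exponential component, in [0, 1]
    """
    if not lo <= m <= hi:
        return 0.0
    width = hi - lo
    if slope == 0.0:
        exp_part = 1.0 / width  # the exponential degenerates to a constant
    else:
        # slope * exp(slope*(m-lo)) / (exp(slope*width) - 1) integrates to 1
        exp_part = slope * math.exp(slope * (m - lo)) / math.expm1(slope * width)
    return frac_exp * exp_part + (1.0 - frac_exp) / width
```

Normalising each component separately keeps `frac_exp` interpretable as the fraction of background candidates described by the exponential term.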

Decays where one or more final-state particles are not reconstructed are referred to as partially reconstructed backgrounds. In the \({{D} ^+_{s}} {{\pi } ^-} \) and \({{D} ^-} {{\pi } ^+} \) invariant mass fits these background contributions are described by an upward-open parabola or a parabola exhibiting a maximum, whose ranges are defined by the kinematic endpoints of the decay, which are convolved with Gaussian resolution functions, and which are known to describe decays involving a missing neutral pion or a missing photon, as defined in Ref. [42]. In the fit to the \({{K} ^+} {{K} ^-} {{\pi } ^+} \) invariant mass, the partially reconstructed background contributions are described by the signal mass shape.

The \(m({{D} ^+_{s}} {{\pi } ^-})\) fit requires two partially reconstructed background components from \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^{*+}_{s}} ({\rightarrow } {{D} ^+_{s}} \gamma /{{\pi } ^0}){{\pi } ^-} \) and \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\rho } ^-} ({\rightarrow }{{\pi } ^-} {{\pi } ^0}) \) decays. The fit model describing the \({{D} ^-} {{\pi } ^+} \) invariant mass accounts analogously for two partially reconstructed background contributions: \({{B} ^0} {\rightarrow }{{D} ^{*-}} ({\rightarrow } {{D} ^-} {{\pi } ^0}){{\pi } ^+} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\rho } ^+} ({{\rightarrow }} {{\pi } ^+} {{\pi } ^0}) \). In the case of the \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^{*+}_{s}} {{\pi } ^-} \) background the previously mentioned upward-open parabola together with a parabola exhibiting a maximum is used to parameterise the components with \({{D} ^{*+}_{s}} {{\rightarrow }}{{D} ^+_{s}} \gamma \) and \({{D} ^{*+}_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^0} \) decays, respectively. The \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\rho } ^-} \) background is described by the upward-open parabola, to take into account the missing neutral pion. The \({{B} ^0} {\rightarrow }{{D} ^{*-}} {{\pi } ^+} \) decay uses an upward-open parabola function and exhibits a double-peaked shape. Most parameters are obtained from simulated events and fixed, aside from the relevant invariant mass shifts and widths. For the \({{B} ^0} {\rightarrow }{{D} ^-} {{\rho } ^+} \) background a single upward-open parabola function is taken, with a floating width and a floating mass shift parameter that is shared with the \({{B} ^0} {\rightarrow }{{D} ^{*-}} {{\pi } ^+} \) contribution. 
The widths of the partially reconstructed background contributions in the \(m({{D} ^+_{s}} {{\pi } ^-})\) fits are fixed to the values obtained from \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) candidates in data, corrected for differences between the \(m({{D} ^+_{s}} {{\pi } ^-})\) and \(m({{D} ^-} {{\pi } ^+})\) distributions, as obtained from simulation.

Fig. 2

The invariant mass distributions of normalisation \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) candidates, for (left) Run 1 and (right) Run 2 data samples. Overlaid are the fit projections along with the signal and background contributions

The \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) candidate sample is contaminated by the \({{B} ^0_{s}} {\rightarrow } {{D} ^-_{s}} {{\pi } ^+} \), \({{\overline{\varLambda }} {}^0_{b}} {\rightarrow } {{\overline{\varLambda }} {}^-_{c}} {{\pi } ^+} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{K} ^+} \) decays, resulting from the misidentification of one or two of the final-state particles. Analogously, the , \({{\varLambda } ^0_{b}} {\rightarrow }{{\varLambda } ^+_{c}} {{\pi } ^-} \) and \({{\overline{B}} {}^0} {\rightarrow } {{D} ^+} {{\pi } ^-} \) decays are misidentified background contributions of the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) candidate sample. Their shapes are determined from simulation using a non-parametric kernel estimation method [43]. The yields of the misidentified background contributions are estimated by using known branching fractions [38] and efficiencies that are determined from simulated background decays. Each yield of a misidentified background in the fit model is constrained to be close to its estimated value and is allowed to vary within the corresponding uncertainty.
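One common way to implement such a yield constraint is to add a Gaussian penalty term to the negative log-likelihood for each misidentified background. The sketch below assumes Gaussian constraints, which the text does not state explicitly for the yields, and uses hypothetical function names and placeholder numbers.

```python
def gaussian_constraint_nll(value, expected, sigma):
    """Negative log of a Gaussian constraint, up to an additive constant."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    pull = (value - expected) / sigma
    return 0.5 * pull * pull


def total_constraint_nll(constraints):
    """Sum the penalty terms for (fitted, expected, uncertainty) triples."""
    return sum(gaussian_constraint_nll(v, e, s) for v, e, s in constraints)


# Placeholder yields (hypothetical): each fitted value sits one standard
# deviation away from its estimate, so each contributes 0.5.
penalty = total_constraint_nll([(110.0, 100.0, 10.0), (95.0, 100.0, 5.0)])
```

The penalty is added to the negative log-likelihood of the mass fit, so a yield can move away from its estimate only if the data favour it by more than the cost of the constraint.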

5 Signal yields

The \(m({{D} ^-} {{\pi } ^+})\) data distributions, with overlaid fit projections for the total, the \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) signal and the background components, are shown in Fig. 2. The resulting signal yields are \((4.971\pm 0.013)\times 10^5\) and \((6.294\pm 0.016)\times 10^5\) for Run 1 and Run 2 samples, respectively. The fit results are also used to constrain the left tail of the signal shape and the widths of the partially reconstructed backgrounds to the invariant mass distribution of \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) candidates.

Fig. 3

The (top) \({{D} ^+_{s}} {{\pi } ^-} \) and (bottom) \({{K} ^+} {{K} ^-} {{\pi } ^+} \) invariant mass distributions of signal \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) candidates, for (left) Run 1 and (right) Run 2 data samples. Overlaid are the fit projections along with the signal and background contributions

The two-dimensional fit to \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) candidates is performed in the \({{D} ^+_{s}} {{\pi } ^-} \) and \({{K} ^+} {{K} ^-} {{\pi } ^+} \) invariant mass distributions. The \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction is determined using the yields of the signal and normalisation modes, their selection efficiencies and the known \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \), \({{D} ^-} {\rightarrow }{{K} ^+} {{\pi } ^-} {{\pi } ^-} \) and \({{D} ^+_{s}} {\rightarrow }{{K} ^+} {{K} ^-} {{\pi } ^+} \) branching fractions [38]. The two-dimensional fit is performed simultaneously for Run 1 and Run 2 data samples in which the \({\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )\) and left-tail parameter are shared. The fit results in \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal yields of \((8.9 \pm 0.8)\times 10^2\) and \((1.12 \pm 0.11)\times 10^3\) and \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) yields of \((3.370 \pm 0.023)\times 10^4\) and \((4.647 \pm 0.027)\times 10^4\) for Run 1 and Run 2 samples, respectively. Figure 3 shows the \({{D} ^+_{s}} {{\pi } ^-} \) invariant mass distributions together with the fit projections and background contributions overlaid. Additionally, the invariant mass fits to \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) and \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) candidates are performed simultaneously to 2011, 2012 and Run 2 data in order to study the collision energy dependence of \(f_{s}/f_{d}\), as is described in Sect. 7.

6 Systematic uncertainties

Systematic uncertainties on the \({\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )\) measurement arise from choices in the fit model and the determination of trigger, BDT and PID efficiencies. Many possible sources of systematic uncertainty cancel in the ratio of either the yields or the efficiencies of \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) events. A summary of all the systematic uncertainties is shown in Table 1. The precision of the measurement relies mostly on the accurate modelling of the signal shape and of the partially reconstructed backgrounds.

The most critical aspect of the signal shape is the description of the left tail of the \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal, affecting the composition of signal and background around the \({B} ^0\) mass. The shape of the left tail was determined from \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) candidates, taking into account differences between the final states, as obtained from simulation, and was Gaussian constrained in the fit. A systematic uncertainty is assigned for the assumption of the signal shape. This is done by repeating the signal fit with a different parametrisation, i.e. the sum of a double-sided Hypatia function and a Gaussian function, which leads to a systematic uncertainty of \(5.1\%\). This parametrisation was found to be the only alternative parametrisation that satisfactorily described simulated signal candidates. Furthermore, a systematic uncertainty is assigned by fixing the mean of the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal shape to the result of the \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) fit, rather than shifting by the known \({{B} ^0} \)–\({{B} ^0_{s}} \) mass difference. Moreover, the width of the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal shape is scaled by the ratio of the known \({{B} ^0} \) and \({{B} ^0_{s}} \) masses. The widths of the partially reconstructed backgrounds are varied by \(\pm 1 \,\text {Me}\!{\,\text {V}}\!/c^2 \), in order to cover the differences between data and simulation as well as the differences between the \({{D} ^+_{s}} {{\pi } ^-} \) and \({{D} ^-} {{\pi } ^+} \) invariant mass distributions. The resulting difference between the signal yields is assigned as a systematic uncertainty.
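The procedure of assigning a systematic uncertainty from the change in yield between the nominal and an alternative fit can be illustrated with the short sketch below. The function names and the numbers are hypothetical, and the combination of independent sources in quadrature is an assumption of this sketch rather than something stated in the text.

```python
import math


def relative_shift(nominal_yield, alternative_yield):
    """Relative change of the yield when the fit model is varied."""
    if nominal_yield <= 0:
        raise ValueError("nominal yield must be positive")
    return abs(alternative_yield - nominal_yield) / nominal_yield


def quadrature_sum(relative_uncertainties):
    """Combine independent relative uncertainties in quadrature (assumed)."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


# Placeholder yields (hypothetical): a change from 1000 to 1051 candidates
# corresponds to a 5.1% relative shift.
shift = relative_shift(1000.0, 1051.0)
combined = quadrature_sum([0.03, 0.04])
```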

The simulated samples are corrected for an imperfect modelling of the response of the particle identification algorithms as a function of the kinematical properties of the particle, using samples of \({{D} ^{*+}} \) calibration data. A systematic uncertainty associated with the PID efficiency evaluation is assigned by varying the corrections within their uncertainties. Proton misidentification is the most difficult to control accurately from data calibration samples, as relatively little calibration data is available in the kinematic region that overlaps with the B decay products. In addition, the Cherenkov angles of photons emitted by protons and kaons are more similar than those of kaons and pions. Thus, a systematic uncertainty is estimated from the difference between the nominal signal yields and a fit where the misidentified background \({{\varLambda } ^0_{b}} {\rightarrow }{{\varLambda } ^+_{c}} {{\pi } ^-} \) decay yield is left free to vary.

The systematic uncertainty assigned to the hardware trigger efficiency takes into account a difference in detection efficiency between kaons and pions. This mostly cancels in the ratio of \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) and \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) efficiencies, but the difference of one final-state particle is sensitive to this detection asymmetry. Moreover, an uncertainty related to the reconstruction efficiency of charged particles is taken into account, which mainly arises from the uncertainty on the LHCb material and the different interaction cross-section of pions and kaons with the material [44]. Additionally, a systematic uncertainty is determined on the BDT efficiency due to the difference between simulation and data. This is determined by weighting all the BDT input variables in the simulated signal sample to the signal distributions in data, which are obtained using signal weights for each candidate using the sPlot technique [45].

Table 1 Relative systematic uncertainty \(\sigma \) on the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction measurement

The systematic uncertainties on the collision energy dependence of the efficiency-corrected \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) yield ratios are shown in Table 2. The sources of these systematic uncertainties are the same as for the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction. Exceptions are the uncertainties on the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) signal and the partially reconstructed backgrounds, which are found to be negligible, and the uncertainty on the charged-particle reconstruction efficiency, which cancels out in the double ratio of efficiencies.

Table 2 Relative systematic uncertainty \(\sigma \) on the ratio of the efficiency-corrected \({{\overline{B}} {}^0_{s}} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) yield ratios. The ratios \({\mathcal {R}}_{13\,\text {TeV}}/{\mathcal {R}}_{7\,\text {TeV}}\) and \({\mathcal {R}}_{13\,\text {TeV}}/{\mathcal {R}}_{8\,\text {TeV}}\) are reported together as the difference of the systematic uncertainty for 7 and 8 \(\,\text {TeV}\) is negligible

7 Results

Table 3 Results of \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) signal efficiencies and yields, as well as the branching fractions used as input for this measurement [38]

Table 3 gathers all measurements and inputs to determine the branching fraction according to Eq. (4). The branching fraction ratio of \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) decays is found to be

$$\begin{aligned} \frac{{\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )}{{\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} )} = (7.7 \pm 0.7 \pm 0.5 \pm 0.3)\times 10^{-3} , \end{aligned}$$

where the first uncertainty is statistical, the second systematic and the third stems from knowledge of the \({{D} ^-} {\rightarrow }{{K} ^+} {{\pi } ^-} {{\pi } ^-} \) and \({{D} ^-_{s}} {\rightarrow }{{K} ^-} {{K} ^+} {{\pi } ^-} \) branching fractions.

Using the known value of \({\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} )\) [38], the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction is found to be

$$\begin{aligned} {\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} ) = (19.4 \pm 1.8\pm 1.3 \pm 1.2)\times 10^{-6} , \end{aligned}$$

where the first uncertainty is statistical, the second systematic and the third refers to the uncertainty due to the branching fractions listed in Table 3. This result represents the most precise single measurement of \({\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )\) to date.
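
As a rough cross-check of the arithmetic, the branching fraction is the measured ratio multiplied by the normalisation branching fraction. The sketch below infers the normalisation value implied by the two quoted central values rather than taking it from Table 3, and combines the three quoted uncertainties in quadrature. The quadrature sum assumes the components are uncorrelated, and all variable and function names are illustrative.

```python
import math

def quadrature_sum(*uncertainties):
    """Combine uncertainties that are assumed to be uncorrelated."""
    return math.sqrt(sum(u * u for u in uncertainties))

# Central values and uncertainties quoted in the text.
ratio = 7.7e-3                             # B(B0 -> Ds+ pi-) / B(B0 -> D- pi+)
bf_signal = 19.4e-6                        # B(B0 -> Ds+ pi-)
bf_signal_unc = (1.8e-6, 1.3e-6, 1.2e-6)   # statistical, systematic, external inputs

# Normalisation branching fraction implied by the two quoted central values.
implied_bf_norm = bf_signal / ratio

# Total uncertainty under the no-correlation assumption.
total_unc = quadrature_sum(*bf_signal_unc)
relative_unc = total_unc / bf_signal

print(f"implied B(B0 -> D- pi+) = {implied_bf_norm:.2e}")
print(f"total uncertainty       = {total_unc:.2e} ({100 * relative_unc:.0f}%)")
```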

The \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction depends on both \(|a_{\text {NF}}|\) and \(|{V_{{u} {b}}} |\). Using the measurement of \({\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )\), the product

$$\begin{aligned} |V_{ub}| |a_{\text {NF}}| = (3.14 \pm 0.20 \pm 0.25)\times 10^{-3}\end{aligned}$$

is obtained, where the first uncertainty is from the \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) branching fraction measurement and the second from the CKM and QCD parameters. The form factor \(F({{B} ^0} {\rightarrow }{{\pi } ^-})|_{q^2=m^2_{{{D} ^+_{s}}}} = 0.327\pm 0.025\) is obtained using light-cone sum rules [3, 4], and lattice QCD calculations are used for the decay constant \(f_{{{D} ^+_{s}}} = 0.2499 \pm 0.0005 \,\text {GeV} \) [5, 6]. A phase-space factor \(\Phi = 296.2\pm 0.8\,\text {GeV} ^{-2}\) is used in order to relate the branching fraction to \(|{V_{{u} {b}}} ||a_{\text {NF}}|\). Additionally, the CKM matrix element \(|V_{cs}|\) is well measured and used as an input [38]. The determination of \(|{V_{{u} {b}}} ||a_{\text {NF}}|\) can be compared to the known inclusive and exclusive determinations of \(|{V_{{u} {b}}} |\) to provide a constraint on the \(|a_{\text {NF}}|\) parameter as displayed in Fig. 4.
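
The extraction of the product can be sketched by inverting Eq. (1) for \(|{V_{{u} {b}}} ||a_{\text {NF}}|\). The snippet below is illustrative only. It uses the central values quoted in the text and sets \(|V_{cs}| = 1\) as a stand-in for the measured value from [38], which is an assumption because that value is not quoted here. Relative uncertainties are propagated to first order, assuming they are uncorrelated and omitting the \(|V_{cs}|\) uncertainty, so the output is expected to reproduce the quoted result only approximately.

```python
import math

def vub_times_anf(bf, phase_space, form_factor, decay_const, vcs):
    """Invert B = Phi |Vub|^2 |Vcs|^2 |F|^2 f^2 |a_NF|^2 for |Vub||a_NF|."""
    return math.sqrt(bf / phase_space) / (vcs * form_factor * decay_const)

# Inputs quoted in the text (central value, uncertainty).
bf = 19.4e-6
bf_unc = math.sqrt(1.8e-6**2 + 1.3e-6**2 + 1.2e-6**2)  # quadrature sum
phi, phi_unc = 296.2, 0.8           # phase-space factor, GeV^-2
ff, ff_unc = 0.327, 0.025           # form factor at q^2 = m(Ds)^2
f_ds, f_ds_unc = 0.2499, 0.0005     # Ds decay constant, GeV

# ASSUMPTION: |Vcs| set to unity; the measured input value is not quoted here.
vcs = 1.0

product = vub_times_anf(bf, phi, ff, f_ds, vcs)

# B and Phi enter under a square root, so their relative uncertainties are halved.
unc_from_bf = product * 0.5 * bf_unc / bf
unc_from_inputs = product * math.sqrt(
    (0.5 * phi_unc / phi) ** 2 + (ff_unc / ff) ** 2 + (f_ds_unc / f_ds) ** 2
)

print(f"|Vub||a_NF| ~ {product:.2e}")
print(f"uncertainty from branching fraction ~ {unc_from_bf:.1e}")
print(f"uncertainty from QCD inputs (no |Vcs| term) ~ {unc_from_inputs:.1e}")
```

In this sketch the form-factor uncertainty dominates the second term, since the decay constant and phase-space factor are known at the sub-percent level.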

The branching fraction ratio of \({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} \) and \({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} \) decays can be used to determine the parameter \(r_{D\pi }\), as shown in Eq. (3). Inserting the measured branching fraction ratio \({\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^+_{s}} {{\pi } ^-} )/{\mathcal {B}} ({{B} ^0} {\rightarrow }{{D} ^-} {{\pi } ^+} )\), the tangent of \(\theta _{c}\) [38] and the fraction between the decay constants \(f_{{{D} ^+_{s}}}\) and \(f_{{{D} ^+}}\) [5, 6] into Eq. (