Introduction

The Belle II experiment is located at the SuperKEKB electron–positron collider in Tsukuba, Japan, and was commissioned in 2018. The experiment is designed to perform a wide range of high-precision measurements in all fields of heavy flavor physics; in particular, it will investigate the decay of B mesons [1]. For this purpose, the experiment is expected to record about 40 billion collision events each containing an \(\Upsilon (4\hbox {S})\) resonance, which at least 96% of the time decays into exactly two B mesons (a \(B \bar{B}\) pair). Each B meson decays via various intermediate states into a set of final-state particles, which are considered stable in the Belle II detector. In general, charged final-state particles are reconstructed as tracks in the central drift chamber and in the inner silicon-based vertex detectors, whereas neutral final-state particles are reconstructed as energy depositions (called clusters) in the electromagnetic calorimeter. The entire experimental setup of the detector and the collider is described in more detail in Doležal and Uno [1].

The measurement of the branching fraction of rare decays like \(B \rightarrow \tau \nu_\tau\), \(B \rightarrow K \nu \nu\) or \(B \rightarrow l \nu \gamma\), with undetectable neutrinos in their final states, is challenging. However, the second B meson in each event can be used to constrain the allowed decay chains. This general idea is known as tagging. Conceptually, each \(\Upsilon (4\hbox {S})\) event is divided into two sides: the signal-side containing the tracks and clusters compatible with the assumed signal \(\hbox {B}_{\mathrm{sig}}\) decay the physicist is interested in, e.g., a rare decay like \(B \rightarrow \tau \nu\); and the tag-side containing the remaining tracks and clusters compatible with an arbitrary \(B_{\mathrm{tag}}\) meson decay. Figure 1 depicts this situation.

Fig. 1

Schematic overview of a \(\Upsilon (4\hbox {S})\) decay: (Left) a common tag-side decay \(B_{\mathrm{tag}}^- \rightarrow D^{0} (\rightarrow K_{\mathrm{s}}^{0} (\rightarrow \pi ^{-} \pi ^{+}) \pi ^{-} \pi ^{+}) \pi ^{-}\) and (right) a typical signal-side decay \(\hbox {B}_{\mathrm{sig}}^+ \rightarrow \tau ^{+} (\rightarrow \mu ^{+} \nu _{\mu } \bar{\nu }_{\tau }) \nu _{\tau }\). The two sides overlap spatially in the detector, therefore the assignment of a measured track to one of the sides is not known a priori

The initial four momentum of the produced \(\Upsilon (4\hbox {S})\) resonance is precisely known and no additional particles are produced in this primary interaction. Therefore, because the relevant quantum numbers are conserved, knowledge about the properties of the tag-side \(B_{\mathrm{tag}}\) meson allows one to recover information about the signal-side \(\hbox {B}_{\mathrm{sig}}\) meson which would otherwise be inaccessible. Most importantly, all reconstructed tracks and clusters which are not assigned to the \(B_{\mathrm{tag}}\) meson must be compatible with the signal decay of interest.

Ideally, a full reconstruction of the entire event has to take all reconstructed tracks and clusters into account to attain a correct interpretation of the measured data. The Full Event Interpretation (FEI) algorithm presented in this article is a new exclusive tagging algorithm developed for the Belle II experiment, embedded in the Belle II Analysis Software Framework (basf2) [2]. The FEI automatically constructs plausible \(B_{\mathrm{tag}}\) meson decay chains compatible with the observed tracks and clusters, and calculates for each decay chain the probability of it correctly describing the true process using gradient-boosted decision trees. “Exclusive” refers to the reconstruction of a particle (here the \(B_{\mathrm{tag}}\)) assuming an explicit decay channel.

Consequently, exclusive tagging reconstructs the \(B_{\mathrm{tag}}\) independently of the \(\hbox {B}_{\mathrm{sig}}\) using either hadronic or semileptonic B meson decay channels. The decay chain of the \(B_{\mathrm{tag}}\) is explicitly reconstructed and therefore the assignment of tracks and clusters to the tag-side and signal-side is known.

In the case of a measurement of an exclusive branching fraction like \(\hbox {B}_{\mathrm{sig}}\rightarrow \tau \nu _{\tau }\), the entire decay chain of the \(\Upsilon (4\hbox {S})\) is known. As a consequence, all tracks and clusters measured by the detector should be already accounted for. In particular, the requirement of no additional tracks, besides the ones used for the reconstruction of the \(\Upsilon (4\hbox {S})\), is an extremely powerful and efficient way to remove most reducible backgrounds. This requirement is called the completeness constraint throughout this text.

In the case of a measurement of an inclusive branching fraction like \(\hbox {B}_{\mathrm{sig}}\rightarrow X_{\mathrm{u}} l \nu\), all remaining tracks and clusters, besides the ones used for the lepton l and the \(B_{\mathrm{tag}}\) meson, are identified with the \(X_{\mathrm{u}}\) system. Hence, the branching fraction can be determined without explicitly assuming a decay chain for the \(X_{\mathrm{u}}\) system.

The performance of an exclusive tagging algorithm depends on the tagging efficiency (i.e., the fraction of \(\Upsilon (4\hbox {S})\) events which can be tagged), the tag-side efficiency (i.e., the fraction of \(\Upsilon (4\hbox {S})\) events with a correct tag) and on the quality of the recovered information, which determines the tag-side purity (i.e., the fraction of the tagged \(\Upsilon (4\hbox {S})\) events with a correct tag) of the tagged events.
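The three figures of merit defined above can be made concrete with a small calculation. The following is a minimal sketch; the function name `tagging_metrics` and the event counts are illustrative assumptions and are not taken from the FEI software.

```python
def tagging_metrics(n_events, n_tagged, n_correct):
    """Return (tagging efficiency, tag-side efficiency, tag-side purity).

    n_events  : number of Upsilon(4S) events considered
    n_tagged  : number of events in which a tag was found
    n_correct : number of events in which the tag is correct
    """
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    if not 0 <= n_correct <= n_tagged <= n_events:
        raise ValueError("expected 0 <= n_correct <= n_tagged <= n_events")
    tagging_efficiency = n_tagged / n_events
    tag_side_efficiency = n_correct / n_events
    purity = n_correct / n_tagged if n_tagged else 0.0
    return tagging_efficiency, tag_side_efficiency, purity


# Hypothetical counts, chosen only for illustration.
tagging_eff, tag_eff, purity = tagging_metrics(1_000_000, 20_000, 5_000)
```

With these definitions, the tag-side efficiency is the product of the tagging efficiency and the tag-side purity.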

The exclusive tag typically provides a pure sample (i.e., purities up to 90% are possible). However, this approach suffers from a low tag-side efficiency of just a few percent, since only a tiny fraction of the B decays can be explicitly reconstructed due to the large number of possible decay channels and their high multiplicity. The imperfect reconstruction efficiency of tracks and clusters further degrades the efficiency.

Both the quality of the recovered information and the systematic uncertainties depend on the decay channel of the \(B_{\mathrm{tag}}\); therefore, we distinguish further between hadronic and semileptonic exclusive tagging.

Hadronic tagging considers only hadronic B decay chains for the tag-side [3, Section 7.4.1]. Hence, the four momentum of the \(B_{\mathrm{tag}}\) is well-known and the tagged sample is very pure. A typical hadronic B decay has a branching fraction of \(\mathcal {O}(10^{-3})\). As a consequence, hadronic tagging suffers from a low tag-side efficiency and can only be applied to a tiny fraction of the recorded events. Large combinatorics of high-multiplicity decay channels further complicate the reconstruction and require tight selection criteria.

Semileptonic tagging considers only semileptonic \(B \rightarrow D l \nu\) and \(B \rightarrow D^* l \nu\) decay channels [3, Section 7.4.2]. Because of the presence of a high-momentum lepton, these decay channels can be easily identified, and semileptonic tagging usually yields a higher tag-side efficiency than hadronic tagging owing to the large semileptonic branching fractions. On the other hand, the semileptonic tag misses kinematic information because of the neutrino in the final state of the decay. Hence, the sample is not as pure as in the hadronic case.

To conclude, the FEI provides a hadronic and semileptonic tag for \(B^{\pm }\) and \(B^{0}\) mesons. This enables the measurement of exclusive decays with several neutrinos and inclusive decays. In both cases, the FEI provides an explicit tag-side decay chain with an associated probability.

Methods

The FEI algorithm follows a hierarchical approach with six stages, visualized in Fig. 2. Final-state particle candidates are constructed using the reconstructed tracks and clusters, and combined to intermediate particles until the final B candidates are formed. The probability of each candidate to be correct is estimated by a multivariate classifier. A multivariate classifier maps a set of input features (e.g., the four momentum or the vertex position) to a real-valued output, which can be interpreted as a probability estimate. The multivariate classifiers are constructed by optimizing a loss function (e.g., the misclassification rate) on Monte Carlo simulated \(\Upsilon (4\hbox {S})\) events and are described later in detail.

All steps in the algorithm are configurable. Therefore, the decay channels used, the cuts employed, the choice of the input features, and hyper-parameters of the multivariate classifiers depend on the configuration. A more detailed description of the algorithm and the default configuration can be found in Keck [4]; in the following, we give a brief overview of the key aspects of the algorithm.

Fig. 2

Schematic overview of the FEI. The algorithm operates on objects identified by the reconstruction software of the Belle II detectors: charged tracks, neutral clusters and displaced vertices. In six distinct stages, these basic objects are interpreted as final-state particles (\(\hbox {e}^{+}\), \(\mu ^{+}\), \(K^{+}\), \(\pi ^{+}\), \(K_{\mathrm{L}}^{0}\), \(\gamma\)), combined to form intermediate particles (\(J/\psi\), \(\pi ^{0}\), \(K_{\mathrm{s}}^{0}\), D, \(D^*\)) and finally form the tag-side B mesons

Combination of Candidates

Charged final-state particle candidates are created from tracks assuming different particle hypotheses. Neutral final-state particle candidates are created from clusters and displaced vertices constructed by oppositely charged tracks. Each candidate can be correct (signal) or wrong (background). For instance, a track used to create a \(\pi ^{+}\) candidate can originate from a pion traversing the detector (signal), from a kaon traversing the detector (background), or from a random combination of hits from beam background (also background).

All candidates available at this stage are combined to intermediate particle candidates in the subsequent stages, until candidates for the desired B mesons are created. Each intermediate particle has multiple possible decay channels, which can be used to create valid candidates. For instance, a \(B^{-}\) candidate can be created by combining a \(D^{0}\) and a \(\pi ^{-}\) candidate, or by combining a \(D^{0}\), a \(\pi ^{-}\) and a \(\pi ^{0}\) candidate. The \(D^{0}\) candidate could be created from a \(K^{-}\) and a \(\pi ^{+}\), or from a \(K_{\mathrm{s}}^{0}\) and a \(\pi ^{0}\).
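The combination step described above can be sketched in a few lines. This is an illustrative toy, not the basf2 implementation: the function `combine`, the track labels and the representation of a candidate as a name plus the set of detector objects it uses are all assumptions made for this example.

```python
from itertools import product


def combine(mother, *daughter_lists):
    """Build mother candidates from one candidate per daughter list.

    A candidate is a (name, frozenset of track/cluster labels) pair.
    Combinations that would use the same detector object twice are skipped.
    """
    mothers = []
    for combination in product(*daughter_lists):
        used = [obj for _, objects in combination for obj in objects]
        if len(used) == len(set(used)):  # every object used at most once
            mothers.append((mother, frozenset(used)))
    return mothers


# Hypothetical event with two negative tracks and one positive track.
negative_tracks = ["t1", "t3"]
positive_tracks = ["t2"]
kaons_minus = [("K-", frozenset({t})) for t in negative_tracks]
pions_minus = [("pi-", frozenset({t})) for t in negative_tracks]
pions_plus = [("pi+", frozenset({t})) for t in positive_tracks]

# D0 -> K- pi+ followed by B- -> D0 pi-
d0_candidates = combine("D0", kaons_minus, pions_plus)
b_candidates = combine("B-", d0_candidates, pions_minus)
```

In this toy event, two \(D^{0}\) candidates are formed, and each of them can be paired with only one \(\pi ^{-}\) candidate because the other one reuses the kaon track.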

The FEI reconstructs more than 100 explicit decay channels, leading to \(\mathcal {O}(10000)\) distinct decay chains.

Multivariate Classification

The FEI employs multivariate classifiers to estimate the probability of each candidate to be correct, which can be used to discriminate correctly identified candidates from background. For each final-state particle and for each decay channel of an intermediate particle, a multivariate classifier is trained which estimates the signal probability that the candidate is correct. To use all available information at each stage, a network of multivariate classifiers is built, following the hierarchical structure in Fig. 2.

For instance, the classifier for the decay of \(B^{-} \rightarrow D^{0} \pi ^{-}\) would use the signal probability of the \(D^{0}\) and \(\pi ^{-}\) candidates as input features to estimate the signal probability of the \(B^{-}\) candidate created by combining the aforementioned \(D^{0}\) and \(\pi ^{-}\) candidates.
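The way information from earlier stages enters a later classifier can be illustrated as follows. This is a minimal sketch: the function `classifier_inputs`, the feature names and all numerical values are hypothetical and only show how daughter signal probabilities are appended to the candidate's own features.

```python
def classifier_inputs(own_features, daughter_probabilities):
    """Build the input feature dictionary for one combined candidate.

    own_features           : dict of the candidate's own quantities
    daughter_probabilities : dict mapping daughter name -> signal probability
                             estimated by the classifier of the previous stage
    """
    features = dict(own_features)
    for name, probability in daughter_probabilities.items():
        features[f"sigprob_{name}"] = probability
    return features


# Hypothetical values for a B- -> D0 pi- candidate.
b_features = classifier_inputs(
    {"momentum": 0.32, "vertex_fit_p_value": 0.41},
    {"D0": 0.80, "pi-": 0.95},
)
```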

Additional input features of the classifiers are the kinematic and vertex fit information of the candidate and its daughters. The multivariate classifiers used by the FEI are trained on Monte Carlo simulated events. The training is fully automated and distributed using a map-reduce approach [5]. Monte Carlo simulated data used to train the FEI is partitioned. At each reconstruction stage, the partitioned data is distributed to nodes where the reconstruction is performed and training datasets are produced (the mapping stage). The reduction stage consists of merging the training datasets and training multivariate classifiers with these training datasets.
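The mapping and reduction stages can be sketched with plain Python data structures. The example below is only a schematic under stated assumptions: events are represented as lists of hypothetical (channel, features, is_signal) rows standing in for the output of the reconstruction, and the actual classifier training that would follow the merge is omitted.

```python
from collections import defaultdict


def map_stage(partition):
    """Turn one data partition into per-channel training rows."""
    rows = defaultdict(list)
    for event in partition:
        for channel, features, is_signal in event:
            rows[channel].append((features, is_signal))
    return rows


def reduce_stage(mapped):
    """Merge the per-partition training rows into one dataset per channel."""
    merged = defaultdict(list)
    for rows in mapped:
        for channel, samples in rows.items():
            merged[channel].extend(samples)
    return dict(merged)


# Two hypothetical partitions that would be processed on different nodes.
partitions = [
    [[("D0 -> K- pi+", [1.86], True)], [("D0 -> K- pi+", [1.70], False)]],
    [[("B- -> D0 pi-", [5.28], True), ("D0 -> K- pi+", [1.87], True)]],
]
datasets = reduce_stage(map_stage(p) for p in partitions)
```

Each entry of `datasets` would then serve as the training sample for the classifier of the corresponding decay channel.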

The available information flows from the data provided by the detector through the intermediate candidates into the final B meson candidates, yielding a single number which can be used to distinguish correctly from incorrectly identified \(B_{\mathrm{tag}}\) mesons. The process is visualized in Fig. 2. This allows one to tune the trade-off between tag-side efficiency and tag-side purity of the algorithm by requiring a minimal signal probability. By contrast, most exclusive measurements by Belle, which used the previous FR algorithm, chose a working point near the maximum tag-side efficiency as described in “Previous work” section.

Combinatorics

It is not feasible to consider all possible B meson candidates created by all possible combinations. The number of possible combinations scales factorially with the number of tracks and clusters. This problem is known as combinatorics in high-energy physics. Furthermore, it is not worthwhile to consider all possible B meson candidates, because all of them are wrong except for two in the best-case scenario.
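The growth of the number of combinations can be illustrated with a simple count. The sketch below assumes, as a deliberate simplification, that every detector object may fill every daughter slot of a decay with distinct daughters; charge and particle-hypothesis requirements reduce the real number, so the result should be read as an upper bound. The function name is hypothetical.

```python
from math import perm


def ordered_assignments(n_objects, n_daughters):
    """Number of ways to fill n_daughters distinct daughter slots with
    different detector objects out of n_objects, i.e. n! / (n - k)!."""
    return perm(n_objects, n_daughters)


# Upper bounds for a five-body decay in events with 5, 10 and 15 objects.
counts = {n: ordered_assignments(n, 5) for n in (5, 10, 15)}
```

Tripling the number of detector objects from 5 to 15 increases this bound from 120 to 360360 for a single five-body decay channel.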

The FEI uses two sets of so-called cuts. A cut is a criterion that a candidate has to fulfill to be considered further. For instance, one could demand that the beam-constrained mass of the B meson candidate is near the nominal mass \(5.28\,\hbox {GeV}\) of a B meson particle, or that a \(\mu ^{+}\) candidate has a high muon particle identification likelihood, which combines sub-detector information to identify muons.

Directly after the creation of the candidate (either from a track/cluster, or by combining other candidates), but before the application of the multivariate classifier, the FEI uses loose and fast pre-cuts to remove wrongly identified candidates (background), without losing signal. The main purpose of these cuts is to save computing time and to reduce the memory consumption. These pre-cuts are applied separately for each decay channel.

At first, a very loose fixed cut is applied on a quantity which is fast to calculate, e.g., the energy for photons, the invariant mass for D mesons, the energy released in the decay for \(D^*\) mesons, or the beam-constrained mass for hadronic B mesons. Second, the remaining candidates are ranked according to a quantity, which is fast to calculate (usually the same quantity as above is used here). Only the n (usually between 10 and 20) best candidates in each decay channel are further considered, the others are discarded. This best candidate selection ensures that each decay channel and each event receives roughly the same amount of computing time.
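The two-step pre-cut can be sketched as a filter followed by a ranking. The example is a simplified illustration: the function `pre_cut`, the mass window, the ranking by distance to a target value and the candidate values are assumptions for a hypothetical \(D^{0}\) decay channel and do not reproduce the default FEI configuration.

```python
def pre_cut(candidates, quantity, lower, upper, target, n_best):
    """Apply a loose fixed window on `quantity`, then keep the n_best
    candidates whose value lies closest to `target`."""
    passed = [c for c in candidates if lower <= c[quantity] <= upper]
    passed.sort(key=lambda c: abs(c[quantity] - target))
    return passed[:n_best]


# Hypothetical D0 candidates of one decay channel, invariant mass in GeV.
d0_channel = [{"mass": m} for m in (1.60, 1.80, 1.86, 1.90, 2.10)]
kept = pre_cut(d0_channel, "mass", lower=1.70, upper=1.95,
               target=1.865, n_best=2)
kept_masses = [c["mass"] for c in kept]
```

Because at most `n_best` candidates survive per decay channel, the cost of the subsequent expensive steps is bounded independently of the event multiplicity.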

Next, the computationally expensive parts of the reconstruction are performed on each candidate: the matching of the reconstructed candidates to the generated particles (in case of simulated events), the vertex fitting, and the multivariate classification.

After the multivariate classifiers have estimated the signal probability of each candidate, the candidates of different decay channels can be compared. Here the FEI uses tighter post-cuts to aggressively remove incorrectly reconstructed candidates using all the available information. The main purpose of these cuts is to restrict the number of candidates per particle to a manageable number.

First, there is a loose fixed cut on the signal probability, to remove unreasonable candidates. Second, the remaining candidates are ranked according to their signal probability. Only the m (usually between 10 and 20) best candidates of the particle (i.e., over all decay channels) are further considered, the others are discarded. This best candidate selection ensures that the number of candidates produced in the next stage is reasonably low and can be handled by the computing system.
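In contrast to the pre-cuts, the post-cut ranks the candidates of all decay channels of a particle together. A minimal sketch, with a hypothetical function name, threshold and signal probabilities:

```python
def post_cut(candidates_by_channel, min_probability, m_best):
    """Merge all decay channels of one particle, drop candidates below a
    loose signal-probability threshold and keep the m_best most probable."""
    merged = [
        (probability, channel)
        for channel, probabilities in candidates_by_channel.items()
        for probability in probabilities
        if probability >= min_probability
    ]
    merged.sort(reverse=True)  # highest signal probability first
    return merged[:m_best]


# Hypothetical signal probabilities of D0 candidates in two decay channels.
d0_probabilities = {
    "D0 -> K- pi+": [0.9, 0.2, 0.005],
    "D0 -> K_S0 pi0": [0.6, 0.4],
}
best_d0 = post_cut(d0_probabilities, min_probability=0.01, m_best=3)
```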

Performance

Applying the FEI to \(\mathcal {O}(1 \, \text {billion})\) events is a CPU-intensive task. An optimized runtime and a small memory footprint are key for a practical application and to save computing resources. The FEI spends most CPU time on vertex fitting (38%), particle combination (27%), and classifier inference (15%). All three tasks have been carefully optimized.

The FEI uses only a fast and simple unconstrained vertex fit during the reconstruction, and feeds the calculated information into its multivariate classifiers. The user can refit the whole decay chain of the final B candidates, including mass and/or interaction point profile constraints if desired. A dedicated fitter (called FastFit) based on a Kalman Filter [6] was implemented for the FEI, which requires drastically less computing time than the default implementation used by Belle II and yields very similar results. Due to this fitter, an overall speedup of the FEI of 2.74 was observed. The FastFit code is licensed under GPLv3 and available on GitHub [7].

As explained in “Combinatorics” section, the number of candidates which have to be processed scales as the factorial of the multiplicity of the channel. In previous approaches, the runtime and the maximum memory consumption were dominated by a few high-multiplicity events and tight cuts had to be applied to high-multiplicity channels. By contrast, the FEI addresses the combinatorics problem by performing best candidate selections during the reconstruction of the decay chain instead of fixed cuts. As a consequence, for each event and each decay channel, the FEI processes the same number of candidates in vertex fitting and classifier inference, i.e., consumes similar amounts of CPU time. Moreover, the maximum memory consumption is limited due to the fixed number of best candidates per event, which is a key requirement for using the computing infrastructure.

Finally, the FEI uses FastBDT [8], a gradient-boosted decision tree (BDT) implementation, as its default multivariate classification algorithm. The algorithm was originally designed for the FEI to speed up the training and application phase. Compared to other popular BDT implementations such as those provided by TMVA [9], SKLearn [10] and XGBoost [11], it originally improved the execution time by more than one order of magnitude, both in training and application. In addition, an improved classification quality was observed. Most of the time when using FastBDT is spent during the extraction of the necessary features, therefore no further significant speedups can be achieved by employing a different method.

Automatic Reporting

The FEI includes an automatic reporting system called Full Event Interpretation Report (FEIR).

The FEIR contains efficiencies and purities for all particles and decay channels at different points during the reconstruction. Individual reports containing control plots for each multivariate classifier and input variables are also automatically created. Control plots include receiver operating characteristic (ROC) curves, which show the tag-side efficiency against purity. Additionally, for each classifier, the purity is plotted as a function of classifier output, to check for a linear relationship as this confirms the classifier output can be treated as a probability. This built-in monitoring capability upgrades the FEI from a black-box to a white-box algorithm, which the user can understand and inspect on all levels of reconstruction.
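The check of purity against classifier output can be sketched as a simple binned calculation. The function below and its toy inputs are illustrative assumptions and are not part of the FEIR code; for an output that behaves like a probability, the purity in each bin is expected to be close to the bin centre.

```python
def purity_per_bin(outputs, labels, n_bins):
    """Return the signal fraction in equal-width bins of classifier output.

    Outputs are expected in [0, 1]; bins without entries give None.
    """
    signal = [0] * n_bins
    total = [0] * n_bins
    for output, is_signal in zip(outputs, labels):
        index = min(int(output * n_bins), n_bins - 1)  # 1.0 goes to last bin
        total[index] += 1
        signal[index] += int(is_signal)
    return [s / t if t else None for s, t in zip(signal, total)]


# Hypothetical classifier outputs and truth labels from simulated events.
outputs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [False, False, False, True, True, True, False, True]
purities = purity_per_bin(outputs, labels, n_bins=2)
```

In this toy sample the two bins, centred at 0.25 and 0.75, have purities of 0.25 and 0.75, which is the linear behaviour the control plot is meant to confirm.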

Previous Work

Previous experiments have already developed and successfully employed tagging algorithms. To compare the algorithms, the maximal achievable tag-side efficiency is of particular interest, because it is directly related to the signal selection efficiency of the measurement. On the other hand, the achievable tag-side purity is only of limited use, because the achievable final purity of the final selection used for the measurement is dominated by the completeness constraint. Hence, most of the incorrect tags can be easily discarded and the final purity depends strongly on the considered signal decay channel. Moreover, signal-side independent ROC curves are not available for most of the previously employed algorithms. The area under the ROC curve allows one to compare the performance of the tagging algorithms.

The BaBar experiment [12] used the Semi-Exclusive B reconstruction (SER) algorithm for hadronic tagging [3, Section 7.4.1.1]. The algorithm used exclusive D and \(D^*\) meson candidates as a seed, and combined those with up to five charmless hadrons to form a \(B_{\mathrm{tag}}\) without assuming an exclusive B decay mode. The tag-side efficiency and tag-side purity of each B decay chain were extracted by fitting the beam-constrained mass [3, Section 7.1.1.2] spectrum of the constructed \(B_{\mathrm{tag}}\) meson candidates. The beam-constrained mass is defined as \(M_{\mathrm{bc}} = \sqrt{E_{\mathrm{beam}}^2/c^4 - p_{B}^2/c^2}\) where \(p_{B}\) denotes the three momentum of the reconstructed B meson candidate and \(E_{\mathrm{beam}}\) denotes half of the centre-of-mass energy of the colliding electron–positron pair. The maximum hadronic tag-side efficiency achieved by this algorithm was 0.2% for \(B^{0} \bar{B}^{0}\) and 0.4% for \(B^{+} B^{-}\), with a tag-side purity around 30%. The tag-side purity could be further increased by rejecting B meson candidates from low-purity decay chains. The semileptonic tag was usually constructed by combining an exclusive D or \(D^*\) meson with a lepton. The maximum semileptonic tag-side efficiency was typically 0.3% for \(B^{0} \bar{B}^{0}\) and 0.6% for \(B^{+} B^{-}\) with an unknown tag-side purity.
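The definition of the beam-constrained mass translates directly into code. The sketch below uses natural units (\(c = 1\), energies and momenta in GeV) and assumes that the momentum is given in the centre-of-mass frame; the function name and the numerical inputs are hypothetical.

```python
from math import sqrt


def beam_constrained_mass(e_beam, p_b):
    """M_bc in GeV from the beam energy (half the centre-of-mass energy)
    and the three-momentum components of the B candidate, with c = 1."""
    p_squared = sum(component ** 2 for component in p_b)
    if p_squared > e_beam ** 2:
        raise ValueError("momentum exceeds beam energy; M_bc is undefined")
    return sqrt(e_beam ** 2 - p_squared)


# Hypothetical candidate: beam energy 5.29 GeV, momentum 0.3 GeV.
m_bc = beam_constrained_mass(5.29, (0.2, 0.1, 0.2))
```

Because the measured energy of the candidate is replaced by the precisely known beam energy, the resolution of \(M_{\mathrm{bc}}\) depends on the momentum measurement only.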

The Belle experiment [13] used the so-called Full Reconstruction (FR) algorithm [14] for hadronic tagging [3, Section 7.4.1.2]. The FR introduced a hierarchical approach, which is still used by its successor and is presented in this article (see “Methods” section). The tag-side efficiency and tag-side purity were extracted by fitting the beam-constrained mass spectrum of the constructed \(B_{\mathrm{tag}}\) meson candidates. The maximum hadronic tag-side efficiency achieved by this algorithm was 0.18% for \(B^{0} \bar{B}^{0}\) and 0.28% for \(B^{+} B^{-}\), with a tag-side purity around 10%. Multivariate classifiers [15] were used to estimate the signal probability of each candidate. The tag-side purity could be further increased by requiring a minimal signal probability. Variants of the FR were used for semileptonic tagging (see [16, 17]). The maximum semileptonic tag-side efficiency was 0.31% for \(B^{0} \bar{B}^{0}\) and 0.34% for \(B^{+} B^{-}\), with a typical tag-side purity of 5%.

Compared to the previously employed algorithms, the FEI provides a greater tagging and tag-side efficiency, with an equal or better tag-side purity. The improvements with respect to the FR can be attributed equally to the additional decay channels and the new candidate selection criteria. The reported maximum tag-side efficiencies for the previously used exclusive tagging algorithms are summarized in “Results” section, Table 1. The stated efficiencies are not directly comparable due to different selection criteria, e.g., a threshold on the beam-constrained mass or the deviation of the nominal energy from the reconstructed energy \(\varDelta E = E_{\mathrm{beam}} - E_{B}\) with \(E_{B}\) denoting the energy of the B candidate, best candidate selections, or cuts on the event shape used to suppress background from non-\(\Upsilon (4\hbox {S})\) events.

Table 1 Summary of the maximum tag-side efficiency of the Full Event Interpretation and for the previously used exclusive tagging algorithms

Results

The FEI algorithm was developed for the Belle II experiment. To quantify the improvements with respect to the previously used FR algorithm, the FEI is applied to data recorded by the Belle experiment. Simulated events and recorded data from the Belle experiment are converted into the new Belle II data format [4, Chapter 2]. This conversion tool was used to validate the entire Belle II analysis software and will be described in a separate publication [18]. The remainder of this article focuses on the results obtained for the hadronic tag on data recorded by the Belle experiment. The results for the semileptonic tag and for Belle II are based on simulated events and are only summarized briefly. A detailed validation of the entire algorithm can be found in Keck [4, Chapter 4].

Hadronic Tag

The performance of the hadronic tag provided by the FEI using simulated and recorded Belle events is studied and compared to the previously used FR algorithm.

First, the considered decay channels of the FEI are restricted to the set of hadronic decay channels used by the FR. The performance of the FEI and the FR is compared using the same hardware and the same simulated charged (neutral) \(B \bar{B}\) Belle events. The FEI required 33% less computing time and achieved a maximum tag-side efficiency of 0.53% (0.33%) on simulated events, which is significantly higher than the previously reported tag-side efficiencies (see “Previous work” section). The increase in the maximum tag-side efficiency is due to the improved candidate selection criteria, in particular the best candidate selections.

Second, all decay channels of the FEI are used, including the 38 additional hadronic decay channels. The performance of the FEI and the FR is then compared using the same hardware and the same simulated charged (neutral) Belle events. The FEI required 48% more computing time and achieved a maximum tag-side efficiency of 0.76% (0.46%) on simulated events. The further increase in the maximum tag-side efficiency is due to the additional decay channels.

As mentioned before, the maximum tag-side efficiency is an important performance indicator for exclusive measurements, which can employ the completeness constraint to achieve a high final purity. The achieved maximum tag-side efficiencies are summarized in Table 1.

To validate the results for the hadronic tag obtained from the simulation study, we conducted exclusive measurements of ten different semileptonic B decay channels using the full \(\Upsilon (4\hbox {S})\) dataset recorded by Belle. The branching fractions of the considered semileptonic decay channels are well-known from independent untagged measurements. The branching fraction of those well-known decay channels is measured using the hadronic tag, taking into account all known disagreements between simulation and data, e.g., in the particle identification performance and the track reconstruction efficiency. We assume that the remaining disagreement between simulation and data is caused by the tag-side. Therefore, the ratio \(\varepsilon\) of the measured and the expected branching fraction is proportional to the ratio of the tag-side efficiency on recorded data and simulated events. Our assumption is supported by the compatibility of the extracted ratios within their uncertainties. Figure 3 summarizes the results for the ten decay channels. The ratios averaged over all control channels for the charged and neutral \(B_{\mathrm{tag}}\) mesons are

$$\begin{aligned} \varepsilon _{\mathrm{charged}}&= 0.74^{+0.014}_{-0.013} \pm 0.050 \\ \varepsilon _{\mathrm{neutral}}&= 0.86^{+0.045}_{-0.050} \pm 0.054, \end{aligned}$$

where the first uncertainty is statistical and the second systematic. The systematic uncertainties arise from the signal-side, e.g., through uncertainties on the particle identification performance or the track reconstruction efficiency.

A detailed description of the control measurements, including results for each tag and control channel, can be found in Schwab [19]. A similar study was conducted in the past for the FR by Sibidanov et al. [20], yielding a similar overall ratio of \(\varepsilon _{\mathrm{comb.}} = 0.75 \pm 0.03\). The rather large discrepancy between simulated events and recorded data is caused by the uncertainty on the branching fractions and decay models of the simulated B decay channels used for the tag-side and the large number of multivariate classifiers involved in the process.

The uncertainty on the tag-side efficiency of the FEI is one of the most important systematic uncertainties in the measurement of branching fractions of rare decays. The tag-side efficiency can be corrected using the extracted ratios. It is possible to apply this correction as a function of the tag-side decay channel and signal probability. A measurement which uses the ratios to correct the tag-side efficiency is performed relative to the considered calibration decay channels. The systematic uncertainty of the correction is given by the uncertainty of the ratios.
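The correction of a simulated tag-side efficiency with an extracted ratio can be sketched as follows. The example uses the averaged ratio for charged \(B_{\mathrm{tag}}\) mesons quoted above; the simulated efficiency of 0.5%, the function name and the choice to add the statistical and systematic uncertainties of the ratio in quadrature are assumptions made for illustration.

```python
from math import hypot


def corrected_efficiency(simulated_efficiency, ratio, stat_up, stat_down, syst):
    """Scale a simulated tag-side efficiency by the data/simulation ratio.

    Returns (value, upward uncertainty, downward uncertainty). Only the
    uncertainty of the ratio is propagated; its statistical and systematic
    parts are added in quadrature (an assumption of this sketch).
    """
    value = simulated_efficiency * ratio
    up = simulated_efficiency * hypot(stat_up, syst)
    down = simulated_efficiency * hypot(stat_down, syst)
    return value, up, down


# Ratio for charged tags from the text, hypothetical simulated efficiency.
value, up, down = corrected_efficiency(0.005, 0.74, 0.014, 0.013, 0.050)
```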

Fig. 3

The ratios calculated by measuring ten semileptonic decay channels on converted Belle data using the hadronic tag. The procedure is described in Schwab [19]

To compare the hadronic tag provided by the FEI and the FR in a well-defined manner, which is independent of the signal-side, both algorithms are applied to the same set of ten million events. These events are randomly sampled from the full \(\Upsilon (4\hbox {S})\) dataset of 772 million events recorded by the Belle experiment. After the tag-side reconstruction, only B meson candidates are kept which fulfill cuts on the beam-constrained mass of \(M_{\mathrm{bc}} > 5.24\,\hbox {GeV}\) and on the deviation of the reconstructed energy from the nominal energy of \(-0.15\,\hbox {GeV}< \varDelta E < 0.1\,\hbox {GeV}\) calculated on the candidate. In addition, a best candidate selection is performed, taking the B meson candidate with the highest signal probability in each event.

The same cuts on the beam-constrained mass \(M_{\mathrm{bc}} > 5.24\,\hbox {GeV}\) and the deviation of the reconstructed energy from the nominal energy \(-0.15\,\hbox {GeV}< \varDelta E < 0.1\,\hbox {GeV}\) were applied and only the best (i.e., the highest signal probability) B meson candidate in each event was used.

From this dataset, we determined the tag-side efficiency and tag-side purity for different cuts on the signal probability. We followed the procedure established in previous publications [3, Chapter 7.1]. For different cuts on the signal probability, extended unbinned maximum likelihood fits of the beam-constrained mass spectrum are performed. The signal peak consisting of correct \(B_{\mathrm{tag}}\) mesons is modeled with a Crystal Ball function [21], whereas the background is described using an ARGUS function [22]. The Gaussian mean of the Crystal Ball function was fixed to the B meson mass and its power law exponent was fixed to \(m = 4\) based on the expected shape obtained from Monte Carlo simulations. The location and the width of the ARGUS were fixed using the known kinematic end point of the spectrum. All other parameters: the normalization of both functions, the width of the Crystal Ball, and the remaining shape parameters of both functions were adjusted by the fit. The tag-side efficiency and tag-side purity are determined in a window of \(5.27\,\hbox {GeV}< M_{\mathrm{bc}} < 5.29\,\hbox {GeV}\) using the fitted yields of the signal and background component.
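For orientation, the two shapes used in the fit model can be written down explicitly. The functions below are unnormalised sketches in a commonly used parametrisation, which is assumed here and may differ in detail from the one used for the fits in this article; the parameter names and the example values are hypothetical, and the extended unbinned maximum likelihood fit itself is not reproduced.

```python
from math import exp, sqrt


def crystal_ball(x, mean, sigma, alpha, power):
    """Unnormalised Crystal Ball shape: Gaussian core with a power-law
    tail on the low side for (x - mean) / sigma < -alpha (alpha > 0)."""
    t = (x - mean) / sigma
    if t > -alpha:
        return exp(-0.5 * t * t)
    a = (power / alpha) ** power * exp(-0.5 * alpha * alpha)
    b = power / alpha - alpha
    return a * (b - t) ** (-power)


def argus(x, endpoint, slope):
    """Unnormalised ARGUS shape; zero at and above the kinematic end point.
    The slope parameter is typically negative."""
    if x <= 0.0 or x >= endpoint:
        return 0.0
    u = 1.0 - (x / endpoint) ** 2
    return x * sqrt(u) * exp(slope * u)


# Hypothetical shape parameters, M_bc in GeV.
peak_value = crystal_ball(5.279, mean=5.279, sigma=0.003, alpha=1.0, power=4)
tail_value = crystal_ball(5.270, mean=5.279, sigma=0.003, alpha=1.0, power=4)
background_value = argus(5.25, endpoint=5.29, slope=-30.0)
```

In a fit, each shape would be normalised over the fit range and multiplied by its yield, and the yields inside the signal window would then enter the tag-side efficiency and tag-side purity.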

In addition, we checked for a potential peaking combinatorial background component, which would bias the results. This test was done using ten million events recorded \(60\,\hbox {MeV}\) below the \(\Upsilon (4\hbox {S})\) resonance. This dataset does not contain B mesons, hence no signal is expected. The fitted signal yields were compatible with zero.

The resulting ROC curves are shown in Figs. 4 and 5 for charged and neutral \(B_{\mathrm{tag}}\) mesons, respectively. The FEI exhibits a larger overall tag-side efficiency compared to the FR. We observe a slightly better performance for the FR than reported in Feindt et al. [14]. Both algorithms perform equally well when requiring a high tag-side purity. We suspect this is because there are only a finite number of cleanly identifiable \(B_{\mathrm{tag}}\) meson candidates and both algorithms identify them with similar performance. The results for tag-side purities above 70% cannot be extracted reliably and depend strongly on the chosen signal or background fit model. For practical applications, the low tag-side purity region is of particular interest for exclusive measurements. The beam-constrained mass distributions corresponding to the low-purity region with about 15% tag-side purity and the high-purity region with approximately 80% tag-side purity are shown in Figs. 6 and 7, respectively, for the charged \(B_{\mathrm{tag}}\).

The maximum tag-side efficiency on recorded data cannot be determined with this method, as the fits are restricted to the best \(B_{\mathrm{tag}}\) candidates. However, a significant contribution to the improvement of the FEI compared to the FR is the increased number of candidates provided per event. A physics measurement will benefit from these additional tag-side candidates by first combining them with potential signal-side candidates, then applying the completeness constraint (i.e., requiring no additional tracks in the event), and performing the best \(B_{\mathrm{tag}}\) candidate selection as the final step of the selection procedure. This procedure was successfully used by several measurements to validate the expected improvements on recorded data [4, 19, 23].
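The order of these selection steps can be sketched as follows. This is a simplified illustration, not the Belle II software interface: the `Candidate` class, the `select_event` function, and the representation of candidates as sets of track identifiers are assumptions made for this example, and clusters are ignored.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

@dataclass(frozen=True)
class Candidate:
    tracks: FrozenSet[int]            # identifiers of the tracks used
    signal_probability: float = 0.0   # classifier output (tag side only)

def select_event(
    tag_candidates: List[Candidate],
    sig_candidates: List[Candidate],
    event_tracks: FrozenSet[int],
) -> Optional[Tuple[Candidate, Candidate]]:
    """Combine, apply the completeness constraint, then pick the best tag."""
    best = None
    for tag in tag_candidates:
        for sig in sig_candidates:
            if tag.tracks & sig.tracks:
                continue  # the two sides must not share a track
            if tag.tracks | sig.tracks != event_tracks:
                continue  # completeness: no additional tracks in the event
            if best is None or tag.signal_probability > best[0].signal_probability:
                best = (tag, sig)
    return best

# Toy event with five tracks; all values are hypothetical.
event = frozenset({1, 2, 3, 4, 5})
signal = Candidate(frozenset({4, 5}))
tags = [
    Candidate(frozenset({1, 2}), 0.99),        # leaves track 3 unused
    Candidate(frozenset({1, 2, 3, 4}), 0.95),  # overlaps with the signal side
    Candidate(frozenset({1, 2, 3}), 0.90),     # complete and disjoint
]
chosen = select_event(tags, [signal], event)
```

In this toy event, choosing the tag candidate with the highest signal probability before combining would pick the first candidate and the event would then fail the completeness constraint. Deferring the best-candidate selection to the last step keeps the event through the third candidate.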

Fig. 4
figure 4

Receiver operating characteristic of charged \(B_{\mathrm{tag}}\) mesons extracted from a fit of the beam-constrained mass on converted Belle data. The FEI outperforms the FR algorithm at low and high purity

Fig. 5
figure 5

Receiver operating characteristic of neutral \(B_{\mathrm{tag}}\) mesons extracted from a fit of the beam-constrained mass on converted Belle data. The FEI outperforms the FR algorithm at low and intermediate purity. At high purity, the tag-side efficiency cannot be extracted reliably

Fig. 6
figure 6

Beam-constrained mass distribution of charged \(B_{\mathrm{tag}}\) mesons in the low tag-side purity region on converted Belle data

Fig. 7
figure 7

Beam-constrained mass distribution of charged \(B_{\mathrm{tag}}\) mesons in the high tag-side purity region on converted Belle data

Semileptonic Tag

The performance of the semileptonic tag provided by the FEI is studied using simulated Belle events. The maximum tag-side efficiencies are summarized in Table 1. Receiver operating characteristics extracted from simulated events can be found in Keck [4]. The results obtained from simulated events, and the fact that the hadronic and semileptonic tags only share five out of six reconstruction stages, indicate a significant increase in the maximum tag-side efficiency. The semileptonic tag was successfully used by Keck [4] to determine the branching fraction of \(B \rightarrow \tau \nu _{\tau }\) on the full \(\Upsilon (4\hbox {S})\) dataset recorded by the Belle experiment, with a smaller relative statistical uncertainty than obtained previously. However, neither studies with well-known calibration channels, as described in Kronenbitter [24], nor a signal-side independent determination of the ROCs, as described in Kirchgessner [16], are available yet.

Outlook for Belle II

As the Belle II reconstruction software is still being optimized and no large recorded experimental data set was available at the time of writing, the final tag-side efficiency cannot be determined reliably for Belle II at this point. Preliminary results in Keck [4] indicate a worse overall performance. This is likely due to the increased beam background caused by the higher luminosity of the collider, which leads to additional tracks and neutral energy depositions. This additional detector activity is not yet fully rejected by the Belle II reconstruction algorithms [4], and future improvements are likely possible.

Discussion

The multivariate classifiers used by the FEI are trained on Monte Carlo simulated events. Depending on the training procedure and the type of events provided to the training, the multivariate classifiers of the FEI are optimized for different objectives.

In this article, we presented the so-called generic adaptation of the FEI. Here, "generic" means that the FEI was trained independently of any specific signal side, using 180 million simulated \(\Upsilon (4\hbox {S})\) events. This setup optimizes the tag-side efficiency for a "generic" \(\Upsilon (4\hbox {S})\) event.

Other versions of the FEI exist which optimize the tag-side efficiency of specific signal events like \(B \rightarrow \tau \nu_\tau\). The so-called specific FEI is trained on the remaining tracks and clusters after a potential signal B meson was already identified. The training uses simulated \(\Upsilon (4\hbox {S})\) events and simulated signal events. As a consequence, the classifiers can be specifically trained to identify correctly reconstructed \(B_{\mathrm{tag}}\) mesons for signal events and can focus on reducing non-trivial background which is not discarded by the completeness constraint. The specific FEI was first introduced as a proof of concept by Keck [25] and used in Metzner [23].

Roughly half of the improvements with respect to the previous algorithm can be attributed to the additionally considered decay channels. Future extensions that use semileptonic D meson decays, baryonic decays and decays including \(\hbox {K}_{\mathrm{L}}^{0}\) particles are currently being investigated.

It should also be noted that the FEI algorithm can be applied, with little modification, to the \(\Upsilon (5\hbox {S})\) resonance. This resonance decays into pairs of \(B^{(*)} B^{(*)}\) or \({B_{\mathrm{s}}^{0}}^{(*)} {B_{\mathrm{s}}^{0}}^{(*)}\) mesons. The powerful completeness constraint can still be applied in this situation.

Conclusion

The Full Event Interpretation is a new exclusive tagging algorithm developed for the Belle II experiment that will be used to measure a wide range of decays with a minimum of detectable information. The algorithm exploits the unique setup of B factories and significantly improves the tag-side efficiency compared to its predecessor algorithms.

The tag-side efficiency for hadronically tagged B mesons was validated and calibrated using Belle data. Furthermore, the hadronic and the semileptonic tag provided by the FEI have already been used in several validation measurements [4, 19, 26] using the full \(\Upsilon (4\hbox {S})\) dataset recorded by the Belle experiment. Similar studies and measurements for Belle II are anticipated as soon as the experiment records a sufficient number of collision events.

The FEI algorithm could be further refined in several ways and applied to use cases that have not yet been explored. These will provide an exciting and fruitful area of future research.