First search for the ηc2(1D) in B decays at Belle

The first dedicated search for the ηc2(1D) is carried out using the decays B+→ ηc2(1D)K+, B0 → ηc2(1D)KS0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ {K}_S^0 $$\end{document}, B0→ ηc2(1D)π−K+, and B+→ ηc2(1D)π+KS0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ {K}_S^0 $$\end{document} with ηc2(1D) → hcγ. No significant signal is found. For the ηc2(1D) mass range between 3795 and 3845 MeV/c2, the branching-fraction upper limits are determined to be ℬ(B+→ ηc2(1D)K+) × ℬ(ηc2(1D) → hcγ) < 3.7 × 10−5, ℬ(B0→ ηc2(1D)K0) × ℬ(ηc2(1D) → hcγ) < 3.5 × 10−5, ℬ(B0→ ηc2(1D)π−K+) × ℬ(ηc2(1D) → hcγ) < 1.0 × 10−4, and ℬ(B+→ ηc2(1D)π+KS0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ {K}_S^0 $$\end{document}) × ℬ(ηc2(1D) → hcγ) < 1.1 × 10−4 at 90% C.L. The analysis is based on the 711 fb−1 data sample collected on the ϒ(4S) resonance by the Belle detector, which operated at the KEKB asymmetric-energy e+e− collider.


The Belle detector
The Belle detector is a large-solid-angle magnetic spectrometer that consists of a silicon vertex detector (SVD), a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter (ECL) comprised of CsI(Tl) crystals located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron fluxreturn located outside of the coil is instrumented to detect K 0 L mesons and to identify muons. The detector is described in detail elsewhere [19,20]. Two inner detector configurations were used. A 2.0 cm radius beampipe and a 3-layer silicon vertex detector were used for the first sample of 140 fb −1 , while a 1.5 cm radius beampipe, a 4-layer silicon detector and a small-cell inner drift chamber were used to record the remaining data [21].
We use a geant-based Monte Carlo (MC) simulation [22] to model the response of the detector, identify potential backgrounds and determine the acceptance. The MC simulation includes run-dependent detector performance variations and background conditions. Signal MC events are generated with EvtGen [23] in proportion to the relative luminosities of the different running periods.

JHEP05(2020)034
3 Event selection The η c candidates are reconstructed in ten different decay channels as described below. Inclusion of charge-conjugate modes is implied hereinafter. The reconstruction uses the Belle II analysis framework [24] with a conversion from Belle to Belle II data format [25].
All tracks are required to originate from the interaction-point region: we require dr < 0.2 cm and |dz| < 2 cm, where dr and dz are the cylindrical coordinates of the point of the closest approach of the track to the beam axis (the z axis of the laboratory reference frame coincides with the positron-beam axis).
Charged π, K mesons and protons are identified using likelihood ratios where h 1 and h 2 are the particle-identification hypotheses (π, K, or p) and L h i (i = 1, 2) are their corresponding likelihoods. The likelihoods are calculated by combining time-of-flight information from the TOF, the number of photoelectrons from the ACC and dE/dx measurements in the CDC. We require R K/π > 0.6 and R p/K < 0.9 for K candidates, R π/K > 0.1 and R p/π < 0.9 for π candidates, and R p/π > 0.6, R p/K > 0.6 for p candidates. The identification efficiency of the above requirements varies in the ranges (95.0-99.7)%, (86.9-94.6)%, and (90.2-98.3)% for π, K, and p, respectively, depending on the η c decay channel. The misidentification probability for background particles that are not a π, K, and p, varies in the ranges (30-48)%, (2.1-4.5)%, and (0.6-2.0)%, respectively.
Electron candidates are identified as CDC charged tracks that are matched to electromagnetic showers in the ECL. The track and ECL cluster matching quality, the ratio of the electromagnetic shower energy to the track momentum, the transverse shape of the shower, the ACC light yield, and the track dE/dx ionization are used in our electron identification criteria. A similar likelihood ratio is constructed: R e = L e /(L e + L h ), where L e and L h are the likelihoods for electrons and charged hadrons (π, K and p), respectively [26]. An electron veto (R e < 0.9) is imposed on π, K, and p candidates. This veto is not applied to the K 0 S and Λ daughter tracks, because they have independent selection criteria. For η c decay channels other than η c → K 0 S K 0 S π 0 and η c → ΛΛ, the electron veto rejects (2.3-17)% of the background events, while its signal efficiency is 97.5% or greater.
Photons are identified as ECL electromagnetic showers that have no associated charged tracks detected in the CDC. The shower shape is required to be consistent with that of a photon. The photon energies in the laboratory frame are required to be greater than 30 MeV. The photon energies in MC simulation are corrected to take into account the difference of resolution in data and MC. Correction factors are based on analysis of mass resolutions in the channels π 0 → γγ, η → γγ, and D * 0 → D 0 γ [27].
The π 0 and η candidates are reconstructed via their decays to two photons. The π 0 invariant mass is required to satisfy |M π 0 − m π 0 | < 15 MeV/c 2 ; the η mass is |M η − m η | < 30 MeV/c 2 . Here and elsewhere, M particle denotes the reconstructed invariant mass of the specified particle and m particle stands for its nominal mass [28]. The requirements correspond approximately to 3σ and 2.5σ mass windows around the nominal mass for the π 0 and η, respectively. The η decay to π + π − π 0 was also used in ref. [16]. Initially, this JHEP05(2020)034 channel was also reconstructed here, but it was found that including it does not improve the expected sensitivity.
Candidate V 0 particles (K 0 S and Λ) are reconstructed from pairs of oppositely charged tracks that are assumed to be π + π − and pπ − for K 0 S and Λ, respectively. We require MeV/c 2 and |M Λ − m Λ | < 10 MeV/c 2 , corresponding approximately to 5.5σ mass windows in both cases. The V 0 candidates are selected by a neural network using the following input variables: the V 0 candidate momentum, the decay angle (the angle between the momentum of a daughter track in the V 0 rest frame and the direction of the boost from the laboratory frame to the V 0 rest frame), the flight distance in the xy plane, the angle between the V 0 momentum and the direction from the interaction point to the V 0 vertex, the shortest z distance between the two daughter tracks, their radial impact parameters, and numbers of hits in the SVD and CDC. Another neural network is used to separate K 0 S and Λ candidates. The input variables for this network are the momenta and polar angles of the daughter tracks in the laboratory frame, their likelihood ratios R π/p , and the V 0 candidate invariant mass for the Λ hypothesis. The V 0 identification efficiency varies in the ranges (82.2-91.9)% and (85.1-86.0)% for K 0 S and Λ, respectively, depending on the η c and B decay channels. The misidentification probability for fake V 0 candidates is (0.5-0.8)% and (1.7-2.4)% for K 0 S 's and Λ's, respectively. The η candidates are reconstructed in the ηπ + π − decay mode. The invariant mass is chosen in the range |M η − m η | < 15 MeV/c 2 , which corresponds to a 4σ mass window.
The η c candidates are reconstructed in ten decay channels: pp, ppπ 0 , ppπ + π − , and ΛΛ. The selected η c candidates are required to satisfy |M ηc − m ηc | < 50 MeV/c 2 ; this mass-window width is about 1.6 times the intrinsic width of the η c [28].
The h c candidates are reconstructed in the channel h c → η c γ; the invariant-mass requirement is |M hc − m hc | < 50 MeV/c 2 . The channel h c → ppπ + π − was used in ref. [16], but it cannot be used here because the mass resolution is not sufficient to fully distinguish the χ c1 and h c peaks in the ppπ + π − mass spectrum; thus, this channel may contain a peaking background from the process B → ψ 2 (1D)(→ χ c1 (→ ppπ + π − )γ)K. The η c2 (1D) candidates are reconstructed in the channel η c2 (1D) → h c γ; their invariant mass is not restricted.
The B-meson candidates are reconstructed via the decay modes B + → η c2 (1D)K + , candidates are selected by their energy and the beam-energy-constrained mass. The difference of the B-meson and beam energies is defined as ∆E = i E i − E beam , where E i are the energies of the B decay products in the center-of-mass frame and E beam is the beam energy in the same frame. The beam-energy-constrained mass is defined as M bc = E 2 beam − ( i p i ) 2 , where p i are the momenta of the B decay products in the center-of-mass frame. We retain B candidates satisfying the conditions 5.2 < M bc < 5.3 GeV/c 2 and |∆E| < 0.2 GeV.
To improve the resolution on the invariant masses of mother particles and ∆E, a massconstrained fit is applied to the π 0 , η, η , h c , and B candidates. A fit with mass and vertex constraints is applied to the K 0 S and Λ candidates.

JHEP05(2020)034
In addition, the η c2 (1D) daughter γ energy is required to be greater than 120 MeV in the B rest frame. This requirement removes background from low-energy photons. The signal efficiency of this requirement is 100%, because the h c γ invariant mass is less than 3.7 GeV/c 2 for all excluded events.
To reduce continuum backgrounds, the ratio of the Fox-Wolfram moments [29] F 2 /F 0 is required to be less than 0.3. For the two-body decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S , this requirement rejects between 10% to 44% of background, depending on the η c decay channel, while the signal efficiency ranges from 94.4% to 97.2%. For the three-body decays B 0 → η c2 (1D)π − K + and B + → η c2 (1D)π + K 0 S , the signal efficiency is in the range from 95.1% to 97.3% and the fraction of the background rejected is (6-28)%.

General analysis strategy and data samples
To improve the separation between the signal and background, we perform a multivariate analysis followed by a global optimization of the selection requirements. The first stages of the analysis are performed individually for each η c decay channel. They include the determination of two-dimensional (∆E, M bc ) resolution, fit to the (∆E, M bc ) distribution, and the multivariate-analysis stage. The global optimization of the selection requirements uses the results of all initial stages as its input. The resolution is used to determine the expected number of the signal events and the distribution of the background in (∆E, M bc ) is used to determine the expected number of the background events in the signal region. The data selected using the resulting channel-dependent criteria are merged into a single sample.
The experimental data are used for determination of the (∆E, M bc ) distribution, selection of the background samples for the neural network, and final fit to the selected events. During the development of the analysis procedure, the η c2 (1D) region was excluded to avoid bias of the η c2 (1D) significance. The final fit described in section 5 was performed on MC pseudoexperiments generated in accordance with the fit result without the η c2 (1D) mixed with the signal MC. The excluded region is defined by where σ ∆E = 15 MeV and σ M bc = 2.5 MeV/c 2 are the approximate resolutions in ∆E and M bc , respectively, and The requirement given by eq. (4.2) corresponds to the η c2 (1D) search region, chosen to be within 25 MeV/c 2 from the central value of 3820 MeV/c 2 . The central value is chosen taking into account the prediction that the difference of the η c2 (1D) and ψ 2 (1D) masses is small. After completion of the analysis procedure development, this requirement is no longer used for the event selection. The signal MC is used for the determination of the resolution and the selection of the signal samples for the neural network. The signal MC is generated using known information about the angular or invariant-mass distributions of the decay products if this JHEP05(2020)034 is possible; otherwise, uniform distributions are assumed. The multidimensional angular distribution is calculated for the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S using the helicity formalism assuming a pure E1 transition between the η c2 (1D) and h c . The η c decay resonant structure is also taken into account if it is known. The distributions for the channels are based on the results of a Dalitz plot analysis performed in ref. [30]. The contributions of intermediate φ resonances are taken into account for the channel K + K − K + K − based on the world-average branching fractions from ref. [28].

Resolution
The first stages of the analysis procedure are the determination of the (∆E, M bc ) resolution and fit to the (∆E, M bc ) distribution in data. These two stages are performed exactly in the same way as in ref. [16]. The resolution is parameterized by the function where F CB is an asymmetric Crystal Ball function [31], G (ij) a are asymmetric Gaussian functions, N CB , N G1 and N G2 are normalizations and x i and y i (i = 1, 2, 3) are rotated variables that are given by Here, ((∆E) 0 , (M bc ) 0 ) is the central point and α i is the rotation angle. The central point is the same for all three terms in eq. (4.3). The rotation eliminates the correlation of ∆E and M bc , allowing the use of a Crystal Ball function for the uncorrelated variable x 1 . The resolution is determined from a binned maximum likelihood fit to signal MC events. Example resolution fit results (for the channel B + → η c2 (1D)K + with η c → K + K 0 S π − ) are shown in figure 1.

Fit to the (∆E, M bc ) distribution
The (∆E, M bc ) distribution is fitted in order to determine the expected number of the background events in the signal region. The distribution is fitted using the function where N S is the number of signal events and B is the background density function that is given by where m 0 is the threshold mass, a is a rate parameter and P 3 is a two-dimensional thirdorder polynomial. Example (∆E, M bc ) fit results (for the channel The points with error bars are truth-matched signal MC, the red solid curve is the fit result, the green dashed curve is the Crystal Ball component, the blue dotted curve is the first Gaussian component, and the brown dash-dotted curve is the second Gaussian component.
The points with error bars are data, the red solid curve is the fit result, and the blue dotted curve is the background. Since there is no significant signal before the optimization of the selection requirements and for the entire η c γ mass range, the two curves almost coincide.

Multivariate analysis
To improve the separation of signal and background events, a multivariate analysis is performed for each individual η c decay channel. As in ref. [16], the multivariate analysis is performed using the multilayer perceptron (MLP) neural network implemented in the tmva library [32]. However, the details of the procedure depend on the particular B decay mode and, consequently, differ from ref. [16]. The following variables are always included in the neural network: the angle between the thrust axes of the B candidate and the remaining particles in the event, the angle between the thrust axes of all tracks and all photons in the event, the ratio of the Fox-Wolfram moments F 2 /F 0 , the B production angle (Υ(4S) helicity angle), the quality of the vertex fit of all B daughter tracks, K 0 S , and Λ to a common vertex, the h c and η c masses, and the maximal π 0 likelihoods for the η c2 (1D) and h c daughter photons combined with any other photon in the event. The π 0 likelihood is based on the energy and the polar angle of the transition-photon candidate in the laboratory frame and the π 0 invariant mass. Note that the ratio of the Fox-Wolfram moments is already required to be less than 0.3 at the event-selection stage. This requirement does not significantly modify the final selection results, because the rejection of background with F 2 /F 0 > 0.3 can also be performed by the MLP. However, it lowers the fraction of the background that needs to be rejected by the MLP, which is helpful to reduce overtraining.

JHEP05(2020)034
For the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S , the MLP also includes the η c2 (1D) helicity angle and the h c daughter-photon azimuthal angle for every η c channel. The η c2 (1D) helicity angle is defined as the angle between − p K and p hc , where p K and p hc are the momenta of K and h c in the η c2 (1D) rest frame, respectively. The azimuthal angle of the h c daughter photon is defined as the angle between the planes of the h c and η c2 (1D) daughter-photon momenta and the momenta of the η c2 (1D) daughter photon and K + or K 0 S in the h c rest frame. The definition is shown in figure 3.
For the channels η c → K + K 0 S π − , η c → K + K − π 0 , and η c → K 0 S K 0 S π 0 , two invariant masses of the η c daughter particle pairs (both Kπ combinations) are included in the neural network.
The following particle identification variables are included into the neural network if there are corresponding charged particles in the final state: the minimum likelihood ratio R K/π of the η c daughter kaons, the minimum of the two likelihood ratios R p/K , R p/π of the η c daughter protons, and R K/π for the daughter K + of the B meson (for the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)π − K + ).

JHEP05(2020)034
If there is a π 0 or η in the final state, four additional variables are included: the π 0 (η) mass, the minimal energy of the π 0 (η) daughter photons in the laboratory frame, and the number of π 0 candidates that include a π 0 (η) daughter photon as one of their daughters (for each of the π 0 (η) daughter photons). If the η c has an η daughter, then the mass of the η candidate is also included in the neural network.
The training and testing signal samples are taken from the signal MC. The background sample is taken from a two-dimensional (∆E, M bc ) sideband defined as all selected events outside the signal region defined by eq. (4.1). The background sample is divided into training and testing samples.

Global optimization of the selection requirements
The global selection-requirement optimization is performed by maximizing the value where i is the channel index, N (i) sig and N (i) bg are the expected numbers of the signal and background events for the i-th channel in the signal region, respectively, and a = 3 is the target significance. This optimization method is based on ref. [33].
The signal region is defined as where R The expected number of signal events for B + → η c2 (1D)K + is calculated as where N Υ(4S) is the number of Υ(4S) events, B(η c → i) is the branching fraction of the η c decay to its i-th decay channel, SR is the reconstruction efficiency for the specific signal region SR before the requirement (v > v   [28]. The signal-regiondependent reconstruction efficiency is calculated as    where (i) R is the reconstruction efficiency, and S i is the signal probability density function for i-th η c decay channel (the integral of S i over the signal region is the efficiency of the signal region selection). The unknown branching-fraction product B(B + → η c2 (1D)K + ) × B(η c2 (1D) → h c γ) can be set to an arbitrary value because the maximum of F opt does not depend on it. The expected number of signal events for B 0 → η c2 (1D)K 0 S is calculated in a similar manner.
The expected number of background events is calculated as where (i) is the efficiency of the MLP output requirement for the background events, N η c2 (1D) region is the number of background events in the η c2 (1D) region defined by eq. (4.2), N full is the full number of background events, and B i is the background density function.
The results are shown in table 1 for the decay B + → η c2 (1D)K + , in table 2 for the decay B 0 → η c2 (1D)K 0 S , in table 3 for the combination of B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S , in table 4 for the decay B 0 → η c2 (1D)π − K + , and in table 5 for the decay B + → η c2 (1D)π + K 0 S . To combine the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S , a separate optimization that includes all combinations of B and η c channels is performed. To estimate the expected number of signal events in the combined sample, the partial widths of the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 are assumed to be the same.

JHEP05(2020)034
Channel Parameters Efficiency Events  Table 2. Results of the optimization of the selection requirements for the channel B 0 → η c2 (1D)K 0 S . The expected number of signal events is calculated assuming B(B + → η c2 (1D)K + ) × B(η c2 (1D) → h c γ) = 1.0 × 10 −5 and equal partial widths of the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 . The efficiencies and expected numbers of signal and background events are calculated for the training sample. The signal-region half-axes R

Fit results in the default model
After the global optimization of the selection requirements, the selected events are merged into a single data sample. The best-candidate selection is performed for each η c channel separately by using the maximal MLP output value; the selection procedure is the same as in ref. [16]. The fraction of removed candidates is 10% to 23%, depending on the η c channel, for the two-body decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S ; for the three-body decays B 0 → η c2 (1D)π − K + and B + → η c2 (1D)π + K 0 S , the fraction is (21-42)%. Multiple candidates originating from different η c decay channels are allowed, however, no events with multiple candidates are observed in the signal region for any of the signal B decays.
We perform an extended unbinned maximum likelihood fit to the data in the signal region. The η c2 (1D) is represented by the Breit-Wigner amplitude: where Γ η c2 (1D) is the η c2 (1D) width. Since the η c2 (1D) is expected to be narrower than the resolution, it is sufficient to use the constant-width parameterization given by eq. (5.1). The M hcγ distribution is fitted to the function where N η c2 (1D) is the number of signal events, R η c2 (1D) (M ) is the η c2 (1D) mass resolution that is determined from MC and parameterized as a sum of two asymmetric Gaussians, and P 2 is a second-order polynomial representing the background shape. For the channel

JHEP05(2020)034
Channel Parameters Efficiency Events  Table 3. Results of the optimization of the selection requirements for the combination of the channels B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S . The expected number of signal events is calculated assuming B(B + → η c2 (1D)K + )×B(η c2 (1D) → h c γ) = 1.0×10 −5 and equal partial widths of the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 . The efficiencies and expected numbers of signal and background events are calculated for the training sample. The signal-region half-axes R S , that has the lowest number of events, the default order of the background polynomial is reduced to 1. The η c2 (1D) width is fixed at 500 keV. The chosen default width value is approximately equal to the sum of individual partial widths predicted in ref. [12]: Γ(η c2 (1D) → h c γ) + Γ(η c2 (1D) → gg) + Γ(η c2 (1D) → η c ππ) = 458 keV. Another prediction of the η c2 (1D) width was made in ref. [14]; it is estimated to be within the range from 660 to 810 keV. The variation of the η c2 (1D) width is considered as a source of systematic uncertainty.
The fit results corresponding to the most significant peaks within the η c2 (1D) search region are shown in figure 4 for the decays B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S , in figure 5 for their combination, and in figure 6 for the decays B 0 → η c2 (1D)π − K + and B + → η c2 (1D)π + K 0 S . The masses, yields, and local significances of the most significant
Results of the optimization of the selection requirements for the channel The expected number of signal events is calculated assuming

Channel
Parameters Efficiency Events  Table 5.
Results of the optimization of the selection requirements for the channel B + → η c2 (1D)π + K 0 S . The expected number of signal events is calculated assuming B(B 0 → η c2 (1D)π − K + )×B(η c2 (1D) → h c γ) = 1.0×10 −5 and equal partial widths of the decays B 0 → η c2 (1D)π − K + and B + → η c2 (1D)π + K 0 . The efficiencies and expected numbers of signal and background events are calculated for the training sample. The signal-region half-axes R   peaks within the search region are listed in table 6. The local significances are calculated from the difference of (−2 ln L) assuming that the mass is known (with one degree of freedom). There is no significant signal in any channel; since even the local significance does not exceed 3σ, the global significance is not calculated.

Systematic uncertainties
The systematic errors in the branching-fraction products can be subdivided into three categories: branching-fraction scale errors, resolution errors, and model errors.
The sources of the systematic uncertainty in the branching-fraction scale include overtraining (the difference between the efficiency in the training and testing samples), the error on the difference of the particle-identification requirement efficiency between the data and MC, the tracking efficiency error, the difference between the MLP efficiency for data and MC, the unknown amplitude of the B 0 → η c2 (1D)π − K + and B + → η c2 (1D)π + K 0 S decays, the number of Υ(4S) events, and the η c , h c , and Υ(4S) branching fractions. Events / 10 MeV/c Figure 6. Fit results for the channels B 0 → η c2 (1D)π − K + (left) and B + → η c2 (1D)π + K 0 S (right).

Channel
Mass, MeV/c 2 Yield Local significance 1.0σ Table 6. The masses, yields, and local significances of the most significant peaks within the search region.
The uncertainty due to the difference in the MLP efficiency between the data and MC is estimated using the decay mode B 0 → η c π − K + . This decay is reconstructed using selection criteria that are as similar as possible to the signal mode B + → η c2 (1D)K + . The MLP optimized for B + → η c2 (1D)K + is applied to the control channel. Some MLP input variables used for the signal channel are undefined for B 0 → η c π − K + , for example, the π 0 likelihood for photons. Such variables are set to constants. The ratio of the number of signal candidates in all channels after the application of MLP selection and in the channel η c → K + K 0 S π − before the application of MLP selection is extracted from a simultaneous fit to the η c mass distributions and compared to its value in MC. The control-channel events are weighted to reproduce the selection efficiencies in the signal channel. The resulting ratio of the number of η c candidates is 0.82 ± 0.10, while the ratio in MC is 0.98. The relative difference between the data and MC efficiency is 16.8% and the statistical error of its determination is 9.9%. The statistical error is also added in quadrature to the systematic uncertainty. The resulting uncertainty from the MLP selection efficiency difference in data JHEP05(2020)034 and MC is 19.5%. Since only the channel η c → K + K 0 S π − is used without the MLP selection, this error includes also the error of the branching fractions of other η c channels relative to the channel η c → K + K 0 S π − . The MLP efficiency uncertainty does not include the uncertainty caused by the difference between the data and MC in the distributions of the variables that are not defined for the channel B 0 → η c π − K + . There are six such variables: the η c2 (1D) helicity angle, the h c daughter-photon azimuthal angle, the h c and η c masses, and the π 0 likelihoods of the η c2 (1D) and h c daughter photons. The distributions of the angular variables are known assuming negligible contribution of higher multipole amplitudes. No additional systematic uncertainty for the difference of their distributions in data and MC is assigned.
The error due to the η c mass distribution uncertainty is estimated by varying the η c mass and width by ±1σ and reweighting the selected MC events; the largest resulting efficiency difference is treated as the systematic uncertainty from the η c mass distribution. The η c width uncertainty is increased up to the difference of the η c input and measured widths in the channel B 0 → η c π − K + to take into account the possible difference of the resolution.
Since the h c has a daughter photon that is not included into any kinematic fits before the calculation of the h c mass, the uncertainty in the h c mass distribution is caused mostly by the difference of the resolution in the photon energy in data and MC. This uncertainty is estimated by varying the photon energy correction [27] by ±1σ, reconstructing the MC events again using the new correction, and calculating the difference between the resulting efficiencies.
The uncertainty associated with the photon π 0 likelihoods is estimated using the decay B + → ψ(2S)(→ χ c1 (→ J/ψγ)γ)K + . A neural network consisting of only two likelihoods is used to select the events in data and MC. The number of ψ(2S) events in the data is calculated from a fit to the (χ c1 γ) invariant mass both before and after the application of the MLP selection. The difference of the efficiencies in data and MC is found to be 4.6%.
All systematic errors related to the branching-fraction scale are listed in table 7. The errors of the tracking efficiency and the difference of the particle-identification requirement efficiency depend on the h c decay channel; the values presented in table 7 are weighted averages.  Table 7. The systematic uncertainties of the branching-fraction scale.

Branching fraction
Since no significant signal is observed, a mass scan is performed over the η c2 (1D) search region with a step size of 0.5 MeV/c 2 . The confidence intervals for the branching-fraction products are calculated at each point. The resolution scaling coefficient S is measured by modifying the resolution function R η c2 (1D) : and similarly for other processes. The difference of the resolution in the data and MC is estimated using the decay B → ψ(2S)(→ χ c1 (→ J/ψγ)γ)K. This decay has two radiative transitions similar to the signal processes. The selection of the control channel is performed using a similar neural network, which has the same photon-related variables as in the signal process. After the photon resolution correction, no significant difference is observed between the resolution in the χ c1 mass in data and MC. The χ c1 mass resolution scaling coefficient is found to be (0.99±0.18) from a fit to the χ c1 mass. However, the resolution in the ψ(2S) mass in data is found to be worse than the resolution in MC. The resolution scaling coefficient determined from a fit to the χ c1 γ invariant mass distribution is (1.31 ± 0.12). Four resolution scaling coefficients are selected for analysis: the nominal resolution (S = 1.00), the scaling coefficient determined from B → ψ(2S)(→ χ c1 (→ J/ψγ)γ)K (S = 1.31), and the same result varied by ±1σ (S = 1.19 and S = 1.43).
For each of the selected scaling coefficients, several signal models are considered. They include the default model, the model without the signal, the model with a higher-order (2 for B 0 → η c2 (1D)K 0 S and 3 for other decays) background polynomial, a model with variations of the M hcγ fitting region, and a model with alternative values of the η c2 (1D) width (Γ η c2 (1D) = 0 MeV and 1 MeV).

JHEP05(2020)034
The confidence intervals are calculated for each model taking the branching-fraction scale error into account. For the channel B 0 → η c2 (1D)π − K + , the yield and its error are determined from the fit. To take the systematic error into account, the Feldman-Cousins unified confidence intervals [34] for the branching-fraction distribution are used: where B and σ B are the mean and variance of the branching-fraction distribution, respectively, k is the ratio of the branching fraction and observed number of events, σ N η c2 (1D) is the uncertainty in the η c2 (1D) yield, and σ (scale) B is the relative branching-fraction scale error determined in section 5.2 (the total error from table 7). Since the η c2 (1D) yield is determined from the fit, the model without the signal is excluded from the list of alternative models for B 0 → η c2 (1D)π − K + .
For other decays: B + → η c2 (1D)K + , B 0 → η c2 (1D)K 0 S , and B + → η c2 (1D)π + K 0 S , it is not possible to determine the yield at all masses from the fit, because the number of events is too small. There are gaps without any events, where the likelihood is a continuously increasing function of the signal yield for all allowed values of the yield (such values that the overall fitting function is positive). Thus, it is necessary to switch to event counting. The counting region is chosen to be within ±1.5(σ 1 + σ 2 )/2 from the current value of mass, where σ 1 and σ 2 are the parameters of the narrow asymmetric Gaussian used in the resolution fit. The expected number of background events is determined from the fit. The profile-likelihood-based intervals described in ref. [35] are used. The confidence-interval calculation takes into account the branching-fraction scale error.
The results of the confidence-interval determination for all resolution scaling coefficients and signal models are merged. For a specific M hcγ , the minimal lower limit and the maximal upper limit are selected. The resulting confidence intervals are shown in figure 7 for the channels B + → η c2 (1D)K + and B 0 → η c2 (1D)K 0 S and in figure 8 for the channels B 0 → η c2 (1D)π − K + and B + → η c2 (1D)π + K 0 S .