Reconstruction of semileptonically decaying beauty hadrons produced in high energy pp collisions

It is well known that in $b$ hadron decays with a single unreconstructible final state particle, the decay kinematics can be solved up to a quadratic ambiguity, without any knowledge of the $b$ hadron momentum. We present a method to infer the momenta of $b$ hadrons produced in hadron collider experiments using information from their reconstructed flight vectors. Our method is strictly agnostic to the decay itself, which implies that it can be validated with control samples of topologically similar decays to fully reconstructible final states. A multivariate regression algorithm based on the flight information provides a $b$ hadron momentum estimate with a resolution of around 60% which is sufficient to select the correct solution to the quadratic equation in around 70% of cases. This will improve the ability of hadron collider experiments to make differential decay rate measurements with semileptonic $b$ hadron decays.


Introduction
The study of semileptonic decays of beauty hadrons, with transitions b → νq u (q u = c, u), is of great interest since these decays offer a theoretically clean determination of the magnitudes of the CKM matrix elements V ub and V cb . The majority of studies have been restricted to B + and B 0 mesons produced by e + e − colliders operating at the Υ(4S) resonance. The presence of an unreconstructible neutrino in the final state poses an experimental challenge. However, in exclusive production of BB meson pairs in e + e − collisions at the Υ(4S) resonance, the decay kinematics of the B can be resolved by balancing against theB decay or vice versa.
Hadron collider experiments offer an enticing opportunity to make complementary studies with other b-hadron species. The busy hadronic environment and inclusive production mechanism make these studies challenging. However, there is an important advantage, which is that the b-hadrons tend to have a large Lorentz boost, especially at the forward rapidities that are covered by the LHCb experiment [1]. The measured flight vector joining the primary pp interaction vertex and the b-hadron decay vertex can be exploited to constrain the decay kinematics [2]. Decays with a single missing particle obviously have an unknown 3-momentum if an assumption is made about the mass of this particle. Two independent constraints are provided by momentum conservation transverse to the flight vector. A third is provided by the assumption of the parent b-hadron mass, though this constraint is quadratic and therefore presents two solutions. One possibility [3] is to consider b-hadrons that originate from decays of narrow excited b-hadron states. If the other decay products from these decays are reconstructed then the mass of the excited state provides a further constraint on the kinematics of the child b-hadron.
Recently, LHCb made the first observation of the decay Λ 0 b → pµ −ν µ [4], and subsequently measured the ratio |V ub |/|V cb |, exploiting lattice QCD calculations of the form factors [5]. A similar measurement with B 0 s → K − µ + ν µ is highly anticipated, given the precise lattice calculations [6], which could permit the single most precise determination of |V ub |/|V cb |. One of the challenges for a hadron collider experiment is to determine the invariant mass squared of the ν system, denoted q 2 , to permit a measurement of the differential decay rate as a function of this quantity. As a result, the LHCb measurement [4] of the Λ 0 b → pµ −ν µ decay rate is restricted to a single high q 2 region 1 . In order to suppress the contamination from decays originating outside this q 2 region, it was required that both solutions of the quadratic equation described above fall into the desired window. The work presented here should permit similar studies with improved efficiency and granularity in q 2 .
The idea is to identify variables that are correlated with the b-hadron momentum, but that are independent of the manner in which the b-hadron decays. This implies that the method can be accurately validated with fully reconstructible decays that have a similar topology to the signal. A regression based estimate of the b-hadron momentum, using these variables as input, can then be used to lift the quadratic ambiguity. The studies presented here use the example of the LHCb experiment, but the ideas should be applicable to any other current or future hadron collider experiment and several centre-of-mass energies are therefore considered.

Simulation of inclusive beauty production
The Pythia [7] event generator is used to simulate inclusive b-hadron pair production in pp collisions at three centre-of-mass energies, √ s = 7, 13, 100 TeV. Unless explicitly stated otherwise, the studies that follow are based on the 13 TeV sample. A right-handed coordinate system is defined with z along the beam axis into the detector, y vertical and x horizontal. The magnitude of the momentum of a particle is denoted P , and the component transverse to the z axis is defined as p T = P sin θ. The component of the momentum along the z axis is denoted p z . A particle has a pseudorapidity defined as η = − ln (tan(θ/2)). A particle of energy E is defined to have a rapidity of y = 1 2 ln ((E + p z )/(E − p z )). Signal b-hadron candidates are required to be produced within the range 2 < η < 5, which corresponds to the approximate kinematic acceptance of the LHCb detector [8]. Fig. 1 shows, for each of the three centre-of-mass energies under consideration, the P , p T and η distributions of the b-hadrons in the event sample. One of the first things to notice is that the p T distribution has a smaller tail than the momentum distribution. This is a feature that is exploited in this work. With increasing centre-of-mass energy, the b-hadron production tends to be at larger pseudorapidities and larger momenta, but the p T spectrum is less strongly affected.
Since the main features that we try to utilise in this study are related to the line of flight between the b-hadron production and decay vertices, which is denoted F , it is crucial  that we model the resolution in the associated variables. The x and y co-ordinates of the bhadron decay vertices are smeared by ±20 µm according to a Gaussian distribution. In the z direction a larger resolution of ±200 µm is assumed. For the production vertices we assume resolutions of ±13 µm in x and y, and ±70 µm in z. These assumptions approximately reflect the reported peformance of the LHCb VELO detector [9]. In all subsequent studies it is required that the smeared flight length is larger than 3 mm, which approximates the effect of typical trigger and analysis selections of b-hadron decays by LHCb.
In order to study possible physics analysis applications, several b-hadron decays are simulated, with a focus on B 0 s mesons which are copiously produced in hadron colliders. In order to efficiently utilise the simulated b-hadron samples, all species are considered to be B 0 s mesons for the purpose of studying their decays. This is justified by the fact that the fragmentation fractions have been measured to exhibit modest kinematic dependencies in the LHCb acceptance [10]. Exclusive decays of B 0 s mesons to D − s µ + ν µ and K − µ + ν µ are simulated with a simple phase space description, which is considered to be sufficiently accurate for the present study. The D − s mesons subsequently decay to the K + K − π − final state. In all studies involving the B 0 s decay products it is required that the charged final state particles satisfy p T > 250 MeV, P > 5 GeV, 1.9 < η < 4.9. Natural units with c = 1 are used throughout this document. For the K − µ + ν µ decay mode, tighter selection requirements are imposed. The muon (kaon) must satisfy p T > 1(0.5) GeV. As a background in the study of B 0 s → K − µ + ν µ , we simulate the decay B 0 s → (K * − → K − π 0 )µ + ν µ , in which the π 0 isn't reconstructed.

Variables that are correlated to the b momentum
We attempt to identify variables that are correlated to the b-hadron momentum, but strictly restrict to those that are independent of the b-hadron decay properties. The single most important feature that we try to exploit is apparent in Fig. 2 (left) which shows the distribution of p T versus η. The (anti-)correlation between the two variables is weak, with a coefficient of around 30%, as indicated on the figure. It is therefore possible to estimate the momentum of the b-hadron as, where θ flight is the polar angle of the flight vector, and it can be seen in Fig. 1 that p T ≈ 5 GeV in our simulated samples. This approximation should return a momentum estimate with a resolution function that resembles the p T distribution in Fig. 1. In Fig. 3 (left) the distribution of 1/sinθ flight is shown. Fig. 3 (right) shows that this variable has a near linear relation to the b-hadron momentum with a correlation coefficient of around 65%. The approximation above is degraded once it is appreciated that the charged decay products from the b-hadron must be within the acceptance of the detector. Fig. 2 (right) shows the distribution of p T versus η for simulated B 0 s → K − µ + ν µ decays that satisfy the selection cuts. This has the effect of suppressing the region of low p T and low η, thus increasing the magnitude of the correlation between these two variables by around 10%.
The flight length, | F |, of a b-hadron of mass M and decay time t can be directly related to the momentum according to, shows the distribution of | F |, in which our requirement of at least 3 mm is clearly visible. Fig. 4 (right) shows that this variable is correlated with the momentum with a coefficient of around 50%.
We consider the use of information from other reconstructed particles in the event. It is obvious that in the hypothetical case of a detector with 4π angular coverage and perfect efficiency and resolution, the b-hadron p T could be inferred from the transverse momentum   balance. Considering the LHCb detector, and the most optimistic use of all kinematic information from the reconstructible particles, we can only achieve a correlation of around 20% between the missing p T and the p T of the signal b. As an alternative, we consider the possibility to reconstruct theb-hadron that is produced in association with the signal b. Even at b(b)-quark level the naive p T balance between the b andb is spoilt by the broad bb p T spectrum. Various combinations of reconstructing the signal b and associatedb at hadron or jet level are considered. Even before considering the inefficiency of reconstructing the associatedb this approach does not seem promising.
We are left with the conclusion that there are only two pieces of information related to the b-hadron flight vector, namely 1/ sin θ flight and | F |, which are of value in an estimator of the b-hadron momentum. In the following section we utilise them in a regression algorithm.

Multivariate regression analysis
The two flight variables described in the previous section, 1/ sin θ flight and | F |, are considered in a multivariate regression analysis in order to infer the momenta of the b-hadrons. A simple least squares linear regression algorithm, as implemented in the sklearn package [11], is used. This algorithm is trained on a randomly selected subset of the simulated event sample. The independent data are used to evaluate the performance of the algorithm in estimating the b-hadron momentum from the values of the two flight variables. Fig. 5 shows the distribution of the inferred b-hadron momentum, P inf , versus the true b-hadron momentum. The correlation coefficient is around 70%. In Tab. 1, the correlation coefficients between P true and the two flight variables are listed for the three centre of mass energies and various selection requirements on the simulated B 0 s → K − µ + ν µ decays. Also listed are the correlations between P true and the inferred momentum that would be returned by the regression using only 1/ sin θ flight , which is denoted P θ inf . It can be seen that as expected these values are close to the corresponding correlations with the raw flight angle variable itself. The final column of Tab. 1 lists the correlations between P true and P inf . It can be seen that the combination of the two variables in the regression algorithm increases the correlation by around 10% compared to the more powerful angular variable alone. Hardly any dependence on the centre-of-mass energy is seen. There is a degradation of the correlations of up to 10% when applying the acceptance and selection requirements on the charged decay products of the simulated B 0 s → K − µ + ν µ decays. Fig. 6 (left) shows the distribution of (P inf −P true )/P true and the corresponding distribution for P θ inf instead of P inf . As expected the shapes of these distributions roughly resemble the underlying b-hadron p T spectrum shown in Fig. 1. In Fig. 6 (right) the corresponding profiles of the mean |P inf − P true |/P true are shown as a function of η. The resolution of P inf is around 60% and exhibits some dependence on η. It is about 10-20% improved compared to that of P θ inf which neglects the decay length information.

Physics applications
In this section, several physics applications are considered. Sect. 5.1 describes an application to the study the decay B 0 s → K − µ + ν µ . The b-hadron momentum estimate is used to resolve the quadratic ambiguity and enhance the resolution in the kinematic quantities describing the b-hadron decay. Sect. 5.2 describes an attempt to use the momentum estimate directly to define variables that distinguish between different classes of decays. As an example, we consider separating B 0  Figure 6. The left-hand figure shows the distribution of (P inf − P true )/P true . The right-hand figure shows how the profile of |P inf − P true |/P true varies with η. Both figures include the corresponding entries for P θ inf . Table 1. The coefficients of correlation between the true b-hadron momentum, and the raw flight variables and the inferred momentum from the regression. For each centre-of-mass energy, as indicated in the first column, the first row corresponds to only the basic flight length and acceptance requirements on the b-hadron. The second and third rows sequentially apply P, p T and η requirements on the charged decay products in the simulated B 0 s → K − µ + ν µ decays.

Application to the study of semileptonic b decays
As mentioned in the introduction, the kinematic properties of a decay with a single unreconstructed particle of known mass can be solved up to a quadratic ambiguity by imposing momentum balance against the visible system with respect to the flight vector, and assuming the mass of the b [2]. The vector sum of the momenta of the reconstructed decay products is referred to as the visible momentum, and is denoted P vis . The corresponding missing momentum associated with the unreconstructed decay products is denoted P miss . The visible momentum is decomposed into components transverse to and along the flight vector, The transverse component of the missing momentum, P ⊥ miss , is fixed to be equal to P ⊥ vis . Assuming that the b-hadron has a mass m, and that there is a single massless unreconstructed particle in the final state, one can derive a quadratic equation in P miss , where E vis and M vis are the visible energy and mass, respectively. This yields two solutions for the b-hadron momentum, The vertex resolution renders a fraction of decays with nonphysical negative values of r. These are excluded in the following analysis. We now consider B 0 s → K − µ + ν µ decays with the already described selection requirements. Fig. 7 shows the distribution of (P + − P true )/P true versus (P − − P true )/P true . A horizontal (vertical) band can be clearly seen for the cases in which P + (P − ) is the correct solution. The effect of the vertex smearing is clearly visible but the two bands are nevertheless well separated. One can see that an independent estimate of the momentum can help to select the correct solution even if it has a modest resolution.
We can now test how often our regression based estimate of the b-hadron momentum is closer to the correct solution. Fig. 8 shows the rate of correct choices as a function of η and q 2 . The average rate of correct solutions is around 70%. Fig. 9 shows that despite the apparent reduction in correlation between the true and inferred momentum when applying acceptance cuts (see Tab. 1), the rate of correct solutions is not strongly affected. It can also be seen that the rate of correct solutions exhibits a dependence on q 2 since the two solutions become more distinct at higher q 2 . There is a change of roughly 15% in the absolute rate over the full q 2 range.
It would be desirable to measure the differential decay rate as a function of q 2 , but this requires unfolding for the finite q 2 resolution, which will inevitably introduce associated  Figure 7. The distribution of (P + −P true )/P true versus (P − −P true )/P true in the subset of simulated B 0 s → K − µ + ν µ decays that satisfy the selection requirements as described in the text.   uncertainty. Fig. 10 compares the q 2 resolution that is obtained with a random choice of solutions versus a choice based on P inf . A useful figure of merit in unfolding problems is the bin purity. For a given bin in the true quantity, we define its purity as the fraction of entries for which the reconstructed quantity also falls into the same bin. Fig. 11 compares the q 2 bin purities for the random quadratic solution versus the best solution with our method.  In Fig. 11 (left) seven equal width bins are used over the full q 2 range, and it can be seen that our method achieves a 10-20% increase in purity. Fig. 11 (right) shows that twelve appropriately defined bins could yield the same purity as for seven bins with the random approach. Particularly narrow bins can be used in the high q 2 region.

Discrimination between different classes of semileptonic decays
In this section we consider the use of P inf to define an optimal variable for discriminating between decays with differing quantities of missing mass. We take as an example the separation of B 0 isn't reconstructed. The corrected mass variable is defined with respect to the flight vector as [12], It is heavily used in the LHCb trigger [13] to inclusively select b-hadron decays, and was the main discriminating variable that was used to extract the yield of Λ 0 b → pµ −ν µ decays in the LHCb analysis of that mode [4]. The upper row of Fig. 12 shows the M corr and M vis distributions for the two B 0 s decay modes. The lower row shows two new variables that can be computed with the help of the regression based b-hadron momentum estimate. In the first case it is assumed that the missing system has zero mass, which permits a computation of the parent b-hadron mass, denoted M inf . The distribution of this variable is shown in Fig. 12 (lower left). Alternatively, the mass of the decaying b-hadron can be assumed, and a squared missing mass estimate can be made. The distribution of this variable, denoted M 2 miss,inf , is shown in Fig 12 (lower right). The two new variables provide clear discrimination but their performance should be compared to the established M corr variable. Fig. 13 shows the efficiency of B 0 s → K * − µ + ν µ versus the efficiency B 0 s → K − µ + ν µ for a range of cuts on M corr , M vis and M inf 2 . Interestingly, M inf performs slightly better than M corr in the region of signal efficiencies in excess of 90%. However, M corr performs better in all other regions.

Application to the study of b meson mixing and production
Despite the obvious drawback of unreconstructible particles, semileptonic b-hadron decays have several notable advantages for certain production and mixing studies. Some of these benefit from the large signal yields in Cabibbo favoured semileptonic decays and/or from the fact that these decays are dominated by a single tree-level amplitude which limits direct CP violation to a negligible level. A recent review of CP violation studies with B 0 s mesons is provided by Ref. [14].
In Sect. 5.1 it is noted that the correct solution rate increases with q 2 in B 0 s → K − µ + ν µ decays because the two solutions become more distinct. For the purpose of studying mixing and production it may be possible to build upon and exploit this feature. We define the following asymmetry between the two solutions, In Fig. 14 the distribution of this variable in simulated B 0 s → D − s µ + ν µ decays is shown on the left, while on the right it can clearly be seen how the rate of correct solutions increases as a function of A ± .
In order to demonstrate the potential of our method we consider the example of making a differential measurement of the B 0 s −B 0 s production asymmetry using B 0 s → D − s µ + ν µ decays. This is highly challenging since it requires that the very fast B 0 s −B 0 s oscillations can be resolved. Semileptonic decays were recently used to make the single most precise measurement to date of the B 0 d −B 0 d mixing frequency [15], but B 0 s −B 0 s oscillations are far more susceptible to the effects of missing momentum on the decay time resolution.
Nevertheless the first observation of B 0 s −B 0 s oscillations by the CDF experiment was based on a combination of hadronic and semileptonic B 0 s decays [16], and LHCb made an observation of this phenomenon using semileptonic B 0 s decays alone [17]. In these analyses, the effect of the mis-measured momentum was corrected for on a statistical basis using a Monte-Carlo correction that scales the b-hadron momentum by a factor that depends on its visible mass. The asymmetry between the number of B 0 s → D − s µ + ν µ andB 0 s → D + s µ −ν µ decays, as a function of decay time, can be written, where ∆Γ s is the decay width difference between the eigenstates of the B 0 s −B 0 s system, a s sl is a CP violating asymmetry and A P is the asymmetry in the rate of B 0 s versusB s production. LHCb has measured [17,18] a s sl using a decay time integrated version of Eq. 5.10, in which the second term can be shown to vanish [18]. It was recently suggested to study the decay time dependence in order to simultaneously measure a s sl and A P [19] 3 . The production asymmetry is of interest in its own right and a differential study as a function of rapidity (y) would be desirable [20][21][22], but subject to the resolution in the energy and momentum of the B 0 s which enter the definition of y. It should also be noted that it would be experimentally challenging to control the feed-down from B 0 s → D * − s µ + ν µ decays for which Eq. 5.6 is spoilt by the continuous spectrum of missing mass. A statistical separation of these two components should be possible, for example using the corrected mass variable, though a fully realistic analysis is beyond the scope of the present study.
In order to study our ability to resolve the fast B 0 s oscillations, we consider the decay time dependent mixing asymmetry defined as, 11) where N mixed and N unmixed are the numbers of reconstructed decays that are tagged as mixed and unmixed, respectively. LHCb currently achieves an effective tagging power of a few percent [23] but this hypothetical study assumes perfect tagging. Fig. 15 shows the mixed and unmixed decay time distributions and A mix for various different measures of the decay time. Using the uncorrected visible momentum, the first oscillation is well resolved, but subsequent oscillations are rapidly smeared out. This behaviour is not improved when scaling the momentum by a visible mass dependent factor. If a random solution to the quadratic equation is used, then the damping of the oscillations is slower. Applying our method to select the best solution the oscillations are resolved with an amplitude of around 20% with minimal degradation at higher decay times. Fig. 15 (lower right) shows how the sensitivity to the oscillations can be enhanced by restricting to a region of larger A ± (> 0.23). 4 The next challenge is to perform this analysis in bins of rapidity while unfolding for resolution. Fig. 16 (left) shows the distribution of the visible rapidity versus the true 3 The authors also consider the application to a measurement of the CP asymmetry in D ± s meson decays. 4 It should be noted however that in the high asymmetry region it may be more challenging to disentangle the B 0 s → D − s µ + νµ and B 0 s → D * − s µ + νµ components. On the lower right, the best quadratic solution is used, and A ± > 0.23 is required.
rapidity. It can be seen in Fig. 16 (right) that the corresponding distribution with the best quadratic rapidity shows a better resolution in this variable. Fig. 17 shows the bin purities in y for different choices of bin boundaries. A comparison is made for (open markers) y defined using the visible energy and momentum and (closed markers) y defined using the best quadratic solution with our method. It is clear that the later approach will permit a finer binning. While the mixing element of the example analysis with B 0 s mesons may prove to be very challenging, the improved rapidity definition is applicable to production studies with other b-hadron species for which a more straightforward decay time integrated study is sufficient.

Conclusions
Hadron collider experiments offer a unique opportunity to study semileptonic decays of bhadron species that cannot be produced at e + e − colliders operating at the Υ(4S) resonance. The presence of unreconstructible final state particles poses a challenge at hadron colliders since the momenta of the b-hadrons are poorly defined. It is well known that in decays with a single missing particle, the decay kinematics can be reconstructed up to a quadratic ambiguity, by assuming the mass of the decaying b-hadron and imposing momentum con-  Figure 16. The distributions of (left) the visible rapidity and (right) the best quadratic rapidity, versus the true rapidity. servation with respect to the reconstructed line of flight between the production and decay vertices. An independent estimate of the b-hadron momentum would be valuable since it could resolve the quadratic ambiguity. We propose a method with which to estimate the momenta of the b-hadrons using information that is totally independent of the b-hadron decays. This exploits the fact that the flight direction and decay length of a b-hadron are well measured and are both correlated to the b-hadron momentum. A simple regression analysis yields a resolution of around 60% in the b-hadron momentum. This seemingly modest momentum estimate is sufficient to select the correct solution to the quadratic equation with an average rate of around 70%. This means that differential measurements of decay rates as a function of the invariant mass squared of the lepton-neutrino system in semileptonic decays can be made with finer binning. The method can also be applied in studies of bhadron production. As a highly challenging test case we consider the example of measuring the B 0 s −B 0 s production asymmetry in bins of rapidity, which requires the resolution of the fast B 0 s −B 0 s oscillations. Our method permits a finer binning in rapidity and improves the resolution on the B 0 s −B 0 s oscillations. The general method can be perfectly validated using control samples of fully reconstructed b-hadron decays, and it should be easy to apply to analyses since it only requires knowledge of the position of the primary pp interaction vertex and the b-hadron decay vertex.