Semileptonic form factors for $B \to D^\ast\ell\nu$ at nonzero recoil from 2 + 1-flavor lattice QCD

We present the first unquenched lattice-QCD calculation of the form factors for the decay $B\rightarrow D^\ast\ell\nu$ at nonzero recoil. Our analysis includes 15 MILC ensembles with $N_f=2+1$ flavors of asqtad sea quarks, with a strange quark mass close to its physical mass. The lattice spacings range from $a\approx 0.15$ fm down to $0.045$ fm, while the ratio between the light- and the strange-quark masses ranges from 0.05 to 0.4. The valence $b$ and $c$ quarks are treated using the Wilson-clover action with the Fermilab interpretation, whereas the light sector employs asqtad staggered fermions. We extrapolate our results to the physical point in the continuum limit using rooted staggered heavy-light meson chiral perturbation theory. Then we apply a model-independent parametrization to extend the form factors to the full kinematic range. With this parametrization we perform a joint lattice-QCD/experiment fit using several experimental datasets to determine the CKM matrix element $|V_{cb}|$. We obtain $\left|V_{cb}\right| = (38.40 \pm 0.68_{\textrm{th}} \pm 0.34_{\textrm{exp}} \pm 0.18_{\textrm{EM}})\times 10^{-3}$. The first error is theoretical, the second comes from experiment and the last one includes electromagnetic and electroweak uncertainties, with an overall $\chi^2\text{/dof} = 126/84$, which illustrates the tensions between the experimental data sets, and between theory and experiment. This result is in agreement with previous exclusive determinations, but the tension with the inclusive determination remains. Finally, we integrate the differential decay rate obtained solely from lattice data to predict $R(D^\ast) = 0.265 \pm 0.013$, which confirms the current tension between theory and experiment.

Abstract We present the first unquenched lattice-QCD calculation of the form factors for the decay B → D * ν at nonzero recoil. Our analysis includes 15 MILC ensembles with N f = 2 + 1 flavors of asqtad sea quarks, with a strange quark mass close to its physical mass. The lattice spacings range from a ≈ 0.15 fm down to 0.045 fm, while the ratio between the light-and the strange-quark masses ranges from 0.05 to 0.4. The valence b and c quarks are treated using the Wilson-clover action with the Fermilab interpretation, whereas the light sector employs asqtad staggered fermions. We extrapolate our results to the physical point in the continuum limit using rooted staggered heavy-light meson chiral perturbation theory. Then we apply a modelindependent parametrization to extend the form factors to the full kinematic range. With this parametrization we perform a joint lattice-QCD/experiment fit using several experimental datasets to determine the CKM matrix element |V cb |. We obtain |V cb | = (38.40±0.68 th ± 0.34 exp ± 0.18 EM ) × 10 −3 . The first error is theoretical, the second comes from experiment and the last one includes electromagnetic and electroweak uncertainties, with an overall χ 2 /dof = 126/84, which illustrates the tensions between the experimental data a e-mail: alexv@unizar.es sets, and between theory and experiment. This result is in agreement with previous exclusive determinations, but the tension with the inclusive determination remains. Finally, we integrate the differential decay rate obtained solely from lattice data to predict R(D * ) = 0.265 ± 0.013, which confirms the current tension between theory and experiment.

Introduction
High precision tests of the standard model (SM) offer exciting possibilities for discovering new physics. In particular, the flavor sector of the SM is very rich in phenomena that can be used to explore physics beyond the standard model (BSM). Most flavor physics revolves around the Cabibbo-Kobayashi-Maskawa (CKM) matrix, which relates the mass and flavor eigenstates of the quarks. Since it is a basis transformation, the CKM matrix is constrained by unitarity, so violations of this rule could indicate the influence of new physics. Weak processes that are loop-suppressed in the SM may also expose new physics. To determine CKM matrix elements to high precision and to perform precision tests of the SM in measurements of rare decay processes, it is essential to know the strong-interaction environment in which these processes occur.
Among the CKM matrix elements, |V cb | has arguably been one of the most perplexing. There is a long-standing tension between the determination of this element via exclusive and inclusive decays. The operator product expansion (OPE) is used to analyze inclusive decay experiments measuring semileptonic decays B → X c ν, where X c represents any charmed hadron or combination of hadrons with a single c quark. On the other hand, exclusive decay experiments focus on decays with a specific charmed hadron in the final state, for example, B → D ν or B → D * ν. We expect both types of experiments to yield consistent results for |V cb |; however, there is a ∼ 3σ discrepancy between the inclusive and exclusive determinations [1,2]. Since contributions from new physics are unlikely to explain these differences [3,4], the disagreement presents an obstacle to higher precision tests of the SM.
We turn now to the determination of |V cb | from the exclusive decay B → D * ν, which has an interesting history. As is detailed below in Eq. (7), to determine |V cb |, a measurement of the differential decay rate dΓ/dw is needed and the form factors must be computed by theory. Using lattice QCD, it has been possible to determine the key form factor at zero recoil. However, the differential decay rate vanishes at that point because of kinematic factors, so it is necessary to use the differential decay rate at nonzero recoil to extrapolate the form factors to zero recoil. Since the 1990s, there have been two parametrizations of the form factors, one by Boyd, Grinstein, and Lebed (BGL) [5][6][7], and the other by Caprini, Lellouch, and Neubert (CLN) [8]. The CLEO and BaBar experiments [9][10][11], for example, relied on the CLN parametrization to analyze the dependence of the recoil parameter of their data. This was also true for earlier reviews from the Heavy Flavor Averaging Group (HFLAV) [12].
The situation changed in 2017 with the publication of unfolded data from a B → D * ν experiment by the Belle collaboration [13]. This data release was quickly followed by several theoretical analyses [14][15][16][17] comparing the effect of the choice of parametrization on |V cb |. They found that the CLN parametrization [8], at least as it is usually employed to extrapolate the experimental data to the zero-recoil point (see for instance, Sec.V.A of Ref. [18]), does not provide a good description of the experimental data, whereas the BGL parametrization [5][6][7] describes the data properly and yielded exclusive determinations of |V cb | that were compatible with the inclusive ones [14][15][16][17]. However, more recent analyses by the Belle collaboration using the much larger untagged dataset [18] and by the BaBar collaboration performing a new analysis of their old data [19] contradicted this picture and reinforced the long-standing tension between the inclusive and the exclusive determinations. Newer theoretical analyses using Belle's untagged dataset also found agreement between CLN and BGL results [20,21]. These analyses also conclude that CLN is still a useful parametrization given the current errors in the experimental measurements. Unfortunately, previous unquenched lattice-QCD calculations of the B → D * form factor [22,23] cannot provide constraints on the shape, because they are limited to zero recoil. Hence, a precise calculation from first principles of the form factors involved in the exclusive process performed for a range of nonzero-recoil momenta, could be extremely helpful.
Another motivation to study this process is the existing tension between experimental measurements and SM predictions of several lepton-flavor-universality-violating (LFUV) observables in B-meson semileptonic decays [24]. The ratios of branching fractions of the semitauonic and other semileptonic B → D ( * ) transitions disagree with the SM at the ∼ 3σ level when R(D) and R(D * ) are taken together [1]. Although the last HFLAV average [1] shows a large discrepancy between theory and experiment, the most recent measurements from the BaBar, Belle, and LHCb collaborations find R(D * ) to be closer to SM expectations [25][26][27][28][29]. However, a complete lattice-QCD calculation of R(D * ) is still lacking. In view of current tensions in these observables, the limitations of the available theoretical predictions, and the future improvements on the experimental side expected from LHCb and Belle II forthcoming data, an independent theoretical calculation with a tight control of systematic errors that could help to either confirm or reduce the tension is urgently needed.
In this work, we use lattice QCD to address these two points, i.e., tensions between exclusive and inclusive determinations of |V cb |, and between the SM theoretical prediction and experiment for R(D * ). Although lattice QCD has previously been used to extract |V cb | from experimental data for B → D * ν, the relevant decay amplitude has always been computed at zero recoil [22,[30][31][32], except for an early study in the quenched approximation [33]. Here we compute the form factors that contribute to the B → D * ν decay for nonzero values of the recoil parameter in full QCD with 2 + 1 flavors of dynamical sea quarks and extrapolate their behavior to the large recoil region. Instead of using the standard procedure of extrapolating experimental results to zero recoil and then extracting |V cb | using the calculated value of the form factor at zero recoil, we do a joint fit of lattice and experimental data where |V cb | is one of the free parameters. Once the decay amplitude is determined, we integrate over the whole kinematic space to find the branching ratios with and without a τ in the decay. We calculate R(D * ) and compare our results with existing experimental determinations. Preliminary reports of this analysis were presented in Refs. [34][35][36][37][38]. In keeping with previous work on the same ensembles [39][40][41][42], this analysis was blinded until a systematic error budget was finalized. The final results were then frozen, apart from unblinding.
This article is organized as follows. In Sec. 2, we introduce the formalism and present the form factors that take part in our calculation. Section 3 gives details on the ensembles available and also describes the analysis of the lattice data up to the chiral-continuum extrapolation. In Sec. 4, we discuss the systematic errors, and Sec. 5 shows the z expansion and the joint fit with experimental data that leads to our final results for |V cb | and R(D * ). Section 6 presents our conclusions. Appendix A includes details on the fit function, employed in the chiral-continuum extrapolation. Appendix B outlines the calculation of the matching factors and its errors. Appendix C explains in detail how the κ tuning correction for the heavy quarks was calculated. Finally, Appendix D provides a guide to ancillary files containing complete results for the form factors, with a full correlation matrix.

Form Factor Definitions
In this section, we set the definitions and notation used in the following sections for the form factors, ratio of correlators, currents, and renormalization factors, among others.

Form factors in the continuum
The B → D * ν process is mediated by the axial A µ = cγ µ γ 5 b and the vector V µ =cγ µ b electroweak currents. The transition matrix for this process is usually decomposed into form factors inspired by the heavy quark effective theory (HQET) [43]: where ε is the polarization vector of the D * meson, M Y is the mass of the Y = B, D * meson and p Y its respective momentum. From the four-velocities v Y = p Y /M Y , one can define the recoil parameter w = v B · v D * . To express the differential decay rate, it is convenient to introduce helicity amplitudes according to the polarization of the off-shell W boson [44]: H 0 (w) = y(w + 1) (w − r)h A1 (w) where r = M D * /M B and y 2 (1 − 2wr + r 2 ) = 1. The differential decay rate for B − → D 0 * −ν is then where η EW is a short-distance electroweak correction [45], G F is the Fermi constant determined from muon decay, and m is the charged lepton mass. Note that the scalar helicity amplitude's contribution is suppressed by (m /M B ) 2 . In practice, it is neglected for semielectronic and semimuonic decays. The processB 0 → D + * −ν needs an extra factor (1 + απ) on the right-hand side of Eq. (7) in order to account for the Coulomb attraction among the charged decay products [46][47][48]. Other electromagnetic effects, including structure-dependent corrections, are smaller, of order α/π instead of απ [46][47][48].
For the determination of |V cb |, the full angular information of the decay chain B → D * ν (D * → Dπ) is used, as discussed in Sec. 5.2. As noted in the introduction, experimental measurements of the ratio of branching fractions in Eq. (1) are in tension with the SM. To date, the B → D * form factors have been estimated with HQET, QCD sum rules, and input from experiment. With the results presented below, however, we can compute this ratio directly from (lattice) QCD: where w Max, = (1 + r 2 − m 2 /M 2 B )/2r. In R(D * ), the B-meson lifetime τ B drops out, so we form it from the ratio of partial widths.
Many papers in the literature introduce a decay amplitude F(w), defined by |F(w)| 2 = 1 − 2wr + r 2 w + 1 and refer to F as a "form factor." For example, experimental results are often reported as |V cb | 2 |η EW | 2 |F(w)| 2 . At zero recoil, F(1) = h A1 (1). Thus, previous work in lattice QCD on this decay has focused on this single number, rather than the four independent functions, h A1 (w), h V (w), h A2 (w), and h A3 (w), computed in this work.

Extracting the form factors from lattice matrix elements
Our heavy quarks (b, c) are simulated using the Fermilab action [49], as discussed in Sec. 3.1. In this framework, the lattice currents for the quark transition y → x are where x, y indicates the flavor c, b and Ψ is the Fermilabimproved field, Ψ = (1 + d 1 γ · D lat )ψ. (12) In this expression, the original heavy-quark field ψ is rotated in order to reduce the discretization errors. In particular, the coefficient d 1 must be calculated for each value of the quark mass in order to remove the O(a) terms. The lattice current J µ xy is related to its equivalent in the continuum J µ xy through the renormalization factors, where the= symbol means that both sides of the equation have the same matrix elements. In practice, the renormalization factors are only calculated approximately up to some order in a and α s . In this work, we use a technique called mostly nonperturbative renormalization [50,51] that eliminates most of the nonperturbative dependence of the renormalization factors by defining the factors, When taking appropriate ratios of three-point correlators, the dominant, nonperburbative contribution to the renormalization of the currents, collected in the flavor-diagonal renormalization factors Z V 4 xx , cancels. The remaining matching factors ρ J µ are amenable to a perturbative calculation [50]. We compute these matching factors to one-loop in perturbation theory, with the full m c a dependence at zero recoil, but m c a = 0 at nonzero recoil. Of course, these simplifications introduce a variety of errors that must be kept under control. The truncation in the perturbative expansion is expected to be small, because α s ranges in our case from 0.20 to 0.35. In addition, the coefficients of the expansion are small due to several cancelations [50]. The errors coming from the other two approximations are estimated in Appendix B and taken into account accordingly.
Not only does the use of ratios reduce the error in the calculation of the matching factors, but it also reduces the statistical fluctuations from the correlators. We set up the calculation in the rest frame of the B meson while the D * meson carries a momentum p, which determines the recoil w. The first ratio is the nonzerorecoil version of the double ratio [30], where the ⊥ symbol in the momentum p ⊥ indicates that the polarization of the D * is aligned with the current and perpendicular to the momentum (i.e., transverse polarization). A parallel symbol is used for longitudinal polarization. This double ratio yields h A1 , which is the only form factor that survives at zero recoil.
The following single ratios yield the remaining form factors. Last, the ratio [52,53] x yields the recoil parameter w and involves only the flavor-diagonal transition D * → D * . Here (α) = ⊥, is the polarization, and must be the same in numerator and denominator to achieve the right cancelation of form factors. The ratio in Eq. (19) yields the three-velocity, from which it is straightforward to calculate w. In the B-meson rest frame (v B = 0), The ratio X 1 defined in Eq. (18) can be used to extract h A3 (w) as The matching factors for these two ratios, x f and X 1 , are ρ = 1 + O(α 2 s ), and thus no renormalization is required at LO. In contrast, the remaining ratios require several nontrivial matching factors, From these equations, it is quite easy to extract all form factors as a function of the ratios defined in Eqs. (15)- (19), These expressions determine the four form factors up to discretization and matching errors.

Lattice setup
In this analysis, we use 15 ensembles of gauge-field configurations, generated by the MILC collaboration [54][55][56]. These ensembles include three flavors of asqtadimproved staggered sea quarks at five different lattice spacings, ranging from 0.15 fm in the coarsest case to 0.045 fm in the finest case. The mass of the strange sea quark is tuned to be close to its physical value, while the two light sea-quark masses are set equal, and cover a range of values that correspond to pion masses from M π ≈ 560 MeV to M π ≈ 180 MeV. The simulation parameters of all ensembles employed in this analysis are given in Table 1, while Fig. 1 provides a visual summary of the range of lattice spacings, sea-quark light-tostrange-mass ratios, and number of statistical samples.
In the light sector, we use the same value for the masses of the valence and sea quarks. The heavy quarks employ the clover action with the Fermilab interpretation, and since the regularization used for the light quarks has a different Dirac structure, we promote the staggered propagators to "naive" ones, so we can apply the standard Dirac spin algebra and combine them with the Wilson-like heavy quark propagators to construct heavy-light mesons [57]. The heavy quark masses are tuned so that the kinetic masses of the D s and the B s mesons are equal to their physical values (see Appendix C of Ref. [22]). In Table 2, we gather the parameters we used to calculate the heavy quark propagators for each ensemble. The simulation values chosen for the heavy quark masses to generate the meson correlators are close to, but not exactly the same as, our best-tuned values, which were determined a posteriori. Hence we apply a correction to the form factors to account for this slight mistuning which is described in Appendix C.

Correlation functions
The two-and three-point correlation functions are calculated using four sources, equally-spaced in time, except for the case of the coarsest ensemble that employs Table 1 List of ensembles used in this work. The columns, from left to right, list the approximate lattice spacing, the scalesetting parameter r 1 /a in lattice units, the ratio am l /ams between the light-and the strange-quark masses, the spatial length of the lattice in fm, the mass of the lightest pseudoscalar meson M P π in MeV, the dimensionless factor M P π L, the dimensions of the lattice in lattice units, the total sample size expressed as the number of sources × the number of configurations, and the tadpole-improvement factor u 0 obtained from the average plaquette. a (fm)  24 sources. The sources are randomly shifted in space and time from one configuration to another in order to reduce correlations between successive gauge-field configurations within the same ensemble. A standard blocking analysis of the correlator data, ranging from block size 1 to block size 8, reveals that the autocorrelations in our ensembles are negligible and that the errors in the correlator points stay approximately constant as we increase the block size, in line with our pre-vious analyses that employed the same gauge configurations, fermion formulations, and source set-up [22, 39-42, 53, 58, 59]. Therefore, we do not block the data in this work, and the correlators are processed through a single-elimination jackknife.
Two previous analyses with the asqtad ensembles [31,60] found that blocking the configurations by 4 or 8 was necessary in order to suppress autocorrelations. However, these analyses refer either to global observables (the topological susceptibility), or did not use the randomization procedure for the sources, which greatly reduces the autocorrelations in our data.
Given that one of our ensembles has a very fine lattice spacing a ≈ 0.045 fm, one might be worried about the topology freezing and its effect in the final results of the form factors. We did not perform a topology freezing analysis in the asqtad ensembles, but we expect the behavior to be similar to that of the HISQ ensembles. Based on Refs. [61,62], we expect topology freezing to introduce a negligible bias in the chiral-continuum limit of the form factors.
The correlation functions described in the following two subsections contain the desired ground-state matrix elements, energies and form factors, but they also include contributions from excited states, which we must remove. For this purpose, we use two different kinds of interpolating operators per source: a local operator d and a smeared operator based on the Richardson 1S wave function [63,64], and therefore we fix the configuration to Coulomb gauge. For each meson, the radius of the smearing operator is the same in physical units for all ensembles. We refer the reader to Ref. [64] for further details. The smeared operator increases the overlap with the ground state, allowing for a more precise determination of the lowest energy level and its overlap factors. The inclusion of a local operator gives us a useful handle on the excited states. Further, to quantify excited-state contributions and obtain robust estimates of the associated uncertainties, we use Bayesian constraints with Gaussian priors and fit functions that include varying numbers of excited states [65].
To implement the Bayesian constraints, we follow the procedure of Appendix B of Ref. [58]. We minimize the augmented χ 2 aug , as defined in Eq. (B3) of Ref. [58], but for the goodness of fit use the data-only χ 2 (evaluated at the minimum of χ 2 aug ) and subtract the number of parameters from the number of data. Below we refer to this χ 2 and counting of degrees of freedom as the deaugmented χ 2 /dof. In our experience, the p value calculated with a χ 2 and a number of dof as defined in Eq. (B5) of Ref. [58] is a good indicator of goodness of fit, but it has not been proven rigorously to follow a uniform distribution. When calculating p values, we further process the χ 2 /dof ratio to take into account finite sample size [66].

Two-point functions
The D * and B-meson two-point functions are needed to extract the overlap factors and the energy states, for these are required inputs for the ratio fits. The twopoint functions are constructed using interpolating op- 1S} represents the smearing (point and  Richardson), t is the time and p is the spatial momentum. These operators are constructed with the same quantum numbers as a pseudoscalar for Y = B and a vector for Y = D * . In terms of the interpolating operators, the two-point correlators are then Inserting a complete set of states between the interpolating operators, we obtain the spectral decomposition: with Z Y a,b (p) the overlap factors, L t the temporal extent of our lattice and s n (t) the extra sign that arises due to the presence of particles with the opposite parity in the staggered regularization for the fermions, The outline of the analysis of the two-point functions, explained in detail in the following subsections, is as follows. First the zero-momentum correlators are fit using phenomenological guidance for the prior central values. For the ground state, we set a prior similar to the physical mass of the mesons, and the excited states differ by ∆E = 0.5 GeV. The prior widths are large enough to accommodate significant departures from these assumptions. These choices for the central value and widths of the priors are such that they have no influence on the fit result for the ground states. In fact, we consider several variations of the energy priors to verify that their only function is to guarantee the stability of the fits without influencing the ground-state fit parameters. The results of the zero-momentum fits are used to construct priors for the dispersion-relation fits. In particular, the ground-state energies are expected to follow the continuum dispersion relation, and the overlap factors for the local operators should be approximately constant, barring, in both cases, discretization effects. Using data for a variety of momenta, we fit the ground-state energies to a dispersion-relation expression that includes discretization terms, see Eq. (36). The resulting fit is used to calculate a prior for the energy of the ground state of the two nonzero-momentum correlators.

Two-point function fits
For the two-point function fits, we employ the form where the state oscillates in time for odd i, but not for even i. This kind of fit is denoted as N +N , meaning we include N nonoscillating and N oscillating states. Both oscillating and nonoscillating excited states are fitted as the logarithm of the energy difference ∆E i = E i − E i−2 in order to avoid the collapse of two energy levels. In the higher states, we never interrelate energies of oscillating and nonoscillating states, i.e., the fitted ∆E i always refer to the difference between two states of the same type. The overlap factors in Eq. (31) are included in the fit function via For the Z factors of the ground states, we also use a logarithm, forbidding the possibility Z ≤ 0. We perform joint fits of all available correlators for a given combination of meson and momentum. That gives us three correlators corresponding to the d-d, 1S-1S, and the crossed average between the d-1S and the 1S-d operators. In the cases where we distinguish between different orientations of the polarization of the D * meson with respect to its momentum, the total number of correlators increases to six. The fitter uses the covariance matrix of the whole set of data, where the fit parameters are constrained with Gaussian priors. The prior central values for the energy levels in the fit functions for the zero-momentum correlators are guided by the experimental values for the meson mass in question and an empirical analysis of the data. The prior width of the physical (oscillating) ground state is chosen to be 140 MeV (520 MeV). In the fit functions for the nonzero-momentum correlators used for the dispersion relation, the ground-state energy prior central values are set equal to M 2 + p 2 , where M is the posterior ground state energy from the fit to the corresponding zero-momentum correlator. The width of the prior is enlarged to encompass the expected discretization errors O(α s a 2 p 2 ). The prior central value for the energy difference between two neighboring oscillating or nonoscillating states is taken to be 0.5 GeV. Their widths vary with the ensemble, but they are always larger than 0.2 GeV. The fit functions for the nonzero-momentum correlators employed in the threepoint function analysis, namely momenta 2π(1, 0, 0)/L and 2π(2, 0, 0)/L, use the dispersion-relation results as priors for the ground-state energy.
The energy levels are constrained to be the same across smearings, but the overlap factors are different, and they are represented with different parameters. For the ground states of the crossed average, we do not fit the Z j amplitudes, but we impose the exact constraint The Z j amplitudes of the excited states of the crossed average are treated as separate fit parameters as they may describe a mix of excited states. Our Z j amplitudes are also allowed to depend on the orientation of the momentum, when applicable, and Eq. (35) applies independently to each orientation. The priors for the Z j factors of the ground states follow a log-normal distribution. Their central values are estimated following an empirical examination of the data, and their widths are large enough to accommodate significant departures from those original choices, roughly within one order of magnitude. In particular, the width of the physical ground state amplitude prior is set to 0.5 for all ensembles, and the width of the posterior is usually 20 times smaller. The prior is enlarged for the nonzero momentum correlators by a factor ≈ (1 + 2α s p 2 ). In the case of the oscillating ground state, the width of the amplitude is set to 1.2 for the zero momentum correlators and 2.0 for the nonzero momentum ones, with a typical posterior width of 0.5. In contrast, we use a Gaussian distribution for the excited-state priors. The width is fixed to be 3.5 for all ensembles, whereas the width of the resulting posteriors is typically an order of magnitude smaller. We test thoroughly that the fit results for the ground states are largely unaffected by the choice of priors and prior widths, as long as the fit remains stable. The fit ranges are chosen following a systematic procedure: t Max is chosen such that the correlator points for t < t Max have fractional errors smaller than ≈ 20-30%. In this way, the covariance matrix is not contaminated by excessive noise, but the value of t Max becomes ensemble dependent. However, the correlator fits are generally insensitive to variations of t Max within this constraint. In contrast, t Min is chosen to have the same value in physical units for all ensembles and momenta. We do this because we expect the degree to which excited states influence the fit depends on their physical separation from the ground state. We apply the following four criteria to select the best t Min value: (1) When including all ensembles and momenta, the p value must follow a sufficiently flat distribution for 15 ensembles, (2) the 2 + 2 and the 3 + 3 fits must agree on the nonoscillating ground-state energy and overlap factors, (3) the fit result must be stable under small variations of the fit range, and (4) the product Z d √ 2E = Z d for the overlap factor of the ground state should be approximately independent of the momentum, barring discretization effects. When these conditions are all fulfilled, we consider that the systematic errors due to the omission of still further excited states have been included in the statistical fit error. This usually leaves us with a small range of possible values for t Min , we chose among those the one that complies best with all these conditions. The selected values are listed in Table 3. An example of the level of agreement that is reached between our 2 + 2 and 3 + 3 two-point correlator fits is shown in Table 4.
Previous experience [22,39,41,42,53,58], which also applies to this study, has shown that it is better to impose the four criteria introduced above on a set of fits, rather than choosing, on a case-by-case basis, the fit with the smallest χ 2 /dof, the smallest error, or some other notion of "best" fit. The case-by-case approach amplifies meaningless statistical fluctuations, which can introduce problems in subsequent steps of the analysis (here, the chiral-continuum extrapolation). Figure 2 shows the stability of 1+1, 2+2, and 3+3 fits on ensembles at four lattice spacings, denoting the common t Min . It illustrates that we could have chosen smaller values of t Min on some ensembles, if we had adopted ensemble- Table 3 Fit ranges in physical units for the two-point function fits used in the analysis of the form factors. As the number of included states increases, t Min is reduced to include information from the rapidly decaying excited states in the fit. For the coarsest ensembles and momentum 2π(4, 0, 0)/L, the fit ranges do not yield enough points to perform a fit. In those cases the 2π(4, 0, 0)/L point is simply dropped. by-ensemble criteria. Hence, our common t Min value is conservatively chosen.

The dispersion relation
The calculation of the dispersion relation serves two purposes: first, we can estimate a good prior for the two-point functions that enter in the analysis of the form factors; second, by checking the size of the deviations from the continuum dispersion relation, we can test whether the discretization errors due to the heavy quarks are under control. The dispersion relation which includes discretization effects can be written as where M 1 is the rest mass, M 2 is the kinetic mass, and M 4 is a further mass-like quantity. A key observation of Ref. [49] is that the matching of the relativistic Wilson action via HQET or NRQCD to continuum QCD removes discretization effects that grow uncontrollably with aM . In Eq. (36) discretization effects are described by the coefficients of the (ap) n terms, parameterized by w 4 , M 1 , M 2 and M 4 , for which explicit expressions are given in Ref. [49]. We tune the kinetic mass M 2 to match the experimentally observed mass, according to the nonrelativistic interpretation of the clover action [49]. 1 The leading O(a 2 ) discretization effects are due to M 1 /M 2 ∼ 1. Expectations for this ratio can be inferred from perturbation theory for the quark masses [68] and by tracing contributions to the binding energy [69,70]. On this basis, we expect M 1 /M 2 to be 1+O(α s , (am 0c ) 2 ), and we would like to test whether the leading deviation from the continuum dispersion relation, E 2 = p 2 + M 2 , grows as O(α s a 2 p 2 ). We can check whether our nonzero momentum fits show deviations of order O(α s a 2 p 2 ) from the continuum dispersion relation, and we can also fit the energies from our correlator fits to Eq. (36), considering the coefficients in front of the powers of momenta as fit parameters. These results are used to guide the prior central values for the groundstate energies of the two-point correlators with nonzero momentum that are part of the three-point analysis which yields the form factors. In order to make this prior independent of the form-factor data, we exclude the p = 2π(1, 0, 0)/L, 2π(2, 0, 0)/L momenta from the dispersion-relation calculation. As explained in Sec. 3.3, data for different polarizations of the D * meson are available only for p = 2π(1, 0, 0)/L and 2π(2, 0, 0)/L. Therefore, these are the only momenta for which we obtaine the form factors at nonzero recoil.
In Table 5, we show the results of our two-point correlator fits that enter in the dispersion-relation fit for a particular ensemble. There is good agreement between the 2 + 2 and the 3 + 3 state fits, indicating that the systematic error from the omission of higher states is negligible. Results for other ensembles show a similar behavior. Figure 3 compares the continuum dispersion relation with our data. The data points show small discretization errors, which tells us that, indeed, these errors are under control.

Three-point functions
With our previously defined interpolating operators, we can also construct three-point correlators by sandwiching a current between two meson states, Using the same notation as in Eq. (31), we can write the spectral decomposition of the three-point correlators for Table 4 Results for the D * -meson two-point correlator fits on the a ≈ 0.12 fm, m l = 0.14ms ensemble. We compare the nonoscillating ground-state energy and overlap factors for the 2 + 2 and 3 + 3 state fits. We include here all fits that distinguish between the different orientations of the momentum. In the analysis we use the 3 + 3 state fit result.
(1,0,0) 1.0837 (21) Table 5 Results for the D * -meson two-point correlator fits on the a = 0.12 fm, m l = 0.14ms ensemble. We compare the nonoscillating ground-state energy and overlap factors for the 2 + 2 and 3 + 3 state fits. We include here all fits that did not distinguish between the different orientations of the momentum. Momentum 2π(4, 0, 0)/L did not have enough degrees of freedom left for the 3 + 3 state fit, and hence it is not shown. In the analysis we use the 3 + 3 state fit result. (17) 1.0405 (17) 3.079 (25) (59)  Ratio between the fitted energies and the expected energies from the continuum dispersion relation. The color encodes the lattice spacing, whereas the different symbols encode the light-to-strange mass ratio. The cones show the expected size of the discretization errors per lattice spacing for our lattice formulation. Hence, each cone follows the equation αsa 2 p 2 , where αs is different for each lattice spacing, and it is calculated following Refs. [71,72]. The cones are color coded to match points with the same lattice spacing. All plotted points for each particular lattice spacing lie within their respective cones, which in turn means that the discretization errors are well within the expected size.
a particular source-sink separation T as where we choose t < T L t , such that wraparound In our three-point functions, we always use a Richardson 1S smearing for the B meson, but the D * meson operator is either 1S-smeared or point d. This gives a variety of possibilities for constructing ratios of correlators. For x f we use where a = d, 1S and the orientation of the momentum can be arbitrary, as long as it is the same for the correlator in the numerator and denominator. These combinations cancel the leading overlap factors and exponentials. The same cancelation can be achieved in X V , In these two ratios we can find the desired matrix element in the limit t 0 and T − t 0, with t < T . The double ratio and the other two single ratios can be expressed in the same way but in this case the computed ratios depend on extra factors that must be removed before extracting the matrix elements. The overlap factors are removed per jackknife bin using the results of the two-point correlator fits. In this way we can propagate correlations from one fit to the other. The M D * /E D * = 1/w factor in Eq. (43) is removed using the value of the recoil parameter, as extracted from Eq. (21) per jackknife bin. The double ratio R A1 deserves further comment. First, we reanalyze the zero-momentum correlators [22] using the criteria given above. That also implies the double ratio R A1 (p = 0) is constructed only for the a = 1S smearing. Second, we did not generate three-point functions at nonzero momentum of the form C Aj D * a →B 1S (p ⊥ , t, T ), so we use the time reversal operation T to obtain the missing correlator,

Three-point function fits
The three-point functions are also affected by the oscillating states introduced by the staggered regularization. So are the ratios constructed with such three-point correlators, but the dependence on the oscillating states is not as clean as in the case of the two-point functions. Our ratios do not show any noticeable oscillatory behavior in source and sink, but states that oscillate at both ends introduce a nonnegligible overall shift on the ratio central value that depends on the sink time T as (−1) T . In order to remove this contribution, we smooth the data following Refs. [22,31,53,73], namely, we calculate the three-point correlators at two different values of the sink time T , and then we compute the following weighted average to suppress this unwanted shift in most ratios: with R = X 0 , X 1 , X V , x f and R A1 (p = 0).
The contribution of the oscillating shift is then greatly suppressed.
The double ratio at nonzero momentum, R A1 (p = 0), requires the explicit removal of the sink-dependent exponentials in order to avoid bias, These exponentials are removed using the energy and mass values coming from the two-point correlator fits per jackknife bin. The ratio averages defined in Eqs. (45) and (46) suppress the contributions from the unwanted oscillations to a fraction of the statistical errors. Therefore, we henceforth employ only the averaged ratios in our analysis and omit the bar for simplicity. The data are then processed through a single elimination jackknife, and the extra overlap factors and exponentials are removed by using the values obtained in the twopoint correlator fits per jackknife bin. Then the ratios are fitted to the functional form: where K is the matrix element we want to extract, and the extra terms take into account the presence of excited states, assuming their contribution is small. The labels X, Y represent mesons at source and sink, respectively, and ∆E j X,Y represents the energy difference between the ground state and the j th excited state. The second excited states at source and sink (included in the A 2 and B 2 terms) are necessary to remove systematic errors due to unaccounted excited states. In order to check this point, we computed the ratio x f with different polarizations of the D * meson. We expect the extracted matrix elements from different polarizations to agree, except for discretization effects that should be reduced as the lattice spacing decreases. Nonetheless, our results show a difference between the analysis with a single excited state at source and sink and the analysis with two excited states at each end. The addition of extra excited states not only increases the error, as expected, but also brings the central values calculated with different polarizations closer. Overall there is a large reduction in the difference between the cases with polarization parallel and perpendicular to the momentum. This behavior depends only mildly on the lattice spacing, as can be checked in Fig. 4.
Difference in sigmas between x f calculated with parallel and perpendicular polarizations, for momenta p 2 = 1, 4 and one (2 + 2) and two (3 + 3) excited states at source and sink in the ratio. The gray band corresponds to differences ≤ 1σ. It is clear that an increase in the number of excited states reduces the difference between polarizations to 1σ or less, whereas with a single excited state the difference can reach as high as 2σ in some cases. Since we do not have any other way to keep in check the systematics due to excited states for other ratios, we take a conservative approach and use the 3 + 3 state fits for the rest of the analysis.
The two available sets of correlators with smearing operators a = d, 1S are fit simultaneously, so that they share ∆E and K, but each smearing has its own A 1,2 and B 1,2 . We employ a loose prior for K with a central value roughly set by the ratios at ≈ T /2, where the excited states are more suppressed, and with a width large enough to accommodate significant variations: in the case of the double ratio R A1 the width of the prior is set to 0.1, whereas the other ratios use 0.05. In all cases, the width of the prior encompasses all available correlator points for the single ratios. Except for the double ratio, the excited states at source and sink carry different signs, hence we ensure the prior covers the central value of the matrix element. Typically, the posterior is almost an order of magnitude narrower that the prior, although in the least precise cases th the posterior width is ≈ 60% of the prior. The priors for the ∆E of the first (second) excited states are taken from the two-point function fits with 3 + 3 states, but we increase the error by a factor of three (eight) to allow ∆E to differ from the two-point correlator values. This increase takes into account the fact that the t Min in the ratio fits is much smaller than in the two-point function fits, and the excited-state pattern might be different as well. The priors for A 1,2 and B 1,2 are taken to be 0(2) and 0(1) in the smeared and point source cases respectively. 2 As stated above, our priors are conservative enough that significant variations of their central values and/or widths do not result in a relevant change in the posterior for the matrix element.
The fit ranges are chosen following criteria similar to the two-point function case. We use the same value of t Min and t Max in physical units for all ensembles and all ratios, except for the double ratio, where we use t Max = T −t Min to account for the fact that the states on source and sink are exactly the same. In this case, we take the same t Min as for the rest of the ratios. We choose the fits that show stability over small variations of the fit range and result in a reasonably flat distribution for the p values.

Calculation of the recoil parameter
As in previous work [52,53], we use the ratio x f to define the recoil parameter, following Eq. (21). The disadvantage of this method is that it introduces systematic errors due to the renormalization of the currents. One could also use the continuum dispersion relation to define the recoil parameter: where p is the three-momentum of the D * meson, and the mass is either M 1 or M 2 . The different choices for the different definition of the mass are expected to result in slightly different discretization errors that are resolved in our chiral-continuum extrapolation, so the choice should not affect the final results. As shown in Fig. 5, the error in the x f method encompasses the differences in the rest-and the kinetic-mass versions of Eq. (48). In this work, we take a conservative approach and define w via Eq. (21), but we note that all choices lead to results for the form factors, |V cb |, and R(D * ) that are compatible within errors.

Current renormalization and blinding
As outlined in Sec. 2.2, the ratios described in Eqs. (39)- (43) are constructed in such a way that the flavordiagonal renormalization factors Z V 4 cc Z V 4 bb from Eq. (14) cancel out. Hence, what remains is only the computation of the different matching factors ρ X that enter in the ratios. These factors can be calculated using perturbation theory, but the calculation becomes cumbersome for w > 1. In this article, we use the approximation am 2c → 0, where am 2c is the charm kinetic mass, which removes the dependence on w, because a light quark cannot modify the dynamics of a heavy quark in the heavy quark limit (see Appendix C and Refs. [50,74]). Then we incorporate errors coming from these approximations. The w dependence introduces an error proportional to w − 1, and the am 2c → 0 approximation increases the error by O(α s am 2c ).
The calculation of the axial matching factor ρ Aj (1) then follows exactly the procedure of Ref. [22]. The other ratios require further calculations that are detailed in Appendix B. The resulting values for all matching factors are gathered in Table 6, where the errors shown are from the VEGAS integration. The errors associated with the approximations we make to obtain those factors are discussed in Appendix B. They are added to those in Table 6 before carrying out the chiralcontinuum extrapolation.
We also calculate the matching factors for the ensembles involved in the heavy-quark (HQ) mistuning corrections. This is a departure from Ref. [53], where the matching factors are applied after the HQ-mistuning correction. In this way, we completely separate the HQmistuning corrections from the matching errors.
During the analysis, we blinded the form-factor data by multiplying the matching factor ρ Aj by an unknown, random number close to 1. The ratios ρ A4 /ρ Aj and ρ Vj /ρ Aj are left unchanged. Via the correlator ratios and Eqs. (26)- (29), all form-factor values on all ensembles are thus multiplied by a common factor. All stages of the analysis were tested either through independent fits performed by two co-authors or by using independent methods or codes. Only after the full analysis was complete, including construction of the systematic error budget (Sec. 4) and z expansion (Sec. 5), the blinding factor was removed from ρ Aj , and the analysis scripts were rerun to extract the unblinded results for the form factors.

Heavy-quark-mass adjustment
The bare masses for the b and c quarks are tuned such that the kinetic masses of the B s and D s meson on each ensemble are equal to their physical values. Nonetheless, the tuning procedure has errors that must be taken into account. The procedure is explained in Appendix C and outlined here. As we generate configurations, values for the heavy-quark masses that give approximately the correct meson masses can be estimated. These initial values are employed to compute the two-and threepoint functions that we analyze in this work. At the end of the data generation, the much larger statistical sample allows for a more precise determination of the b and c quark masses by using the procedure detailed in Ref. [22]. We must then correct for this mismatch.
The correction is calculated nonperturbatively by studying the effect of a varying heavy-quark mass in a single ensemble on all correlation functions. Since the heavy-quark mismatch is small, a linear fit in 1/m Q for each flavor Q = c, b and form factor is usually sufficient. Once the functions that describe the evolution of the form factors with the quark masses are known, we can apply the correction to all other ensembles. Table 20 in Appendix C gathers all our adjustments. The error in the correction comes from the error in the heavy-quarkmass tuning procedure and from the error of the linear fit to find the heavy-quark mass dependence.

Chiral-continuum extrapolation
After applying the renormalization factors and the heavyquark-mistuning corrections as described in previous subsections, the resulting form factors still need to be corrected for the fact that they are calculated at nonzero lattice spacing and nonphysical values for the lightquark masses. An extrapolation to both the continuum limit and the physical value of the light-quark masses is thus necessary to extract values that can be used in a physical calculation. The extrapolation should be based on an appropriate effective field theory (EFT) description of lattice QCD. The relevant EFT for the calculation at hand is rooted staggered chiral perturbation theory (rSχPT), which describes how the form factors behave as the lattice spacing and the light-quark masses approach the desired limits, extended to include heavylight observables [75]. The unquenched MILC configurations generated with 2+1 flavors of improved staggered fermions make use of the fourth-root procedure for eliminating the unwanted four-fold degeneracy of staggered quarks. At nonzero lattice spacing, this procedure has small violations of unitarity [76][77][78][79][80] and locality [81]. Nevertheless, a careful treatment of the con- Comparison of different definitions of the recoil parameter. The horizontal axis labels the ensemble, whereas the vertical axis shows the recoil value w, calculated from the different definitions. The ratio method is shown before ("Ratio") and after ("Ratio matching") including the matching errors, whereas the points labeled as "Dispersion relation M 1,2 " use Eq. (48) and the fitted rest (M 1 ) or kinetic (M 2 ) mass of the D * meson. The three-momentum is set to p = 2π(1, 0, 0)/L [p = 2π(2, 0, 0)/L] in the lower (upper) panel.
tinuum limit, in which all assumptions are made explicit, argues that lattice QCD with rooted staggered quarks reproduces the desired local theory of QCD as a → 0 [82,83]. When coupled with other analytical and numerical evidence (see Refs. [84][85][86] for reviews), this gives us confidence that the rooting procedure is indeed correct in the continuum limit. We then use the following functions obtained in SU(3) rSχPT to lowest nontrivial order in the heavy-quark expansion to fit the different form factors: where Y = A 1 , A 2 , A 3 , and V ; K A2 = 0 but K Y = 1 otherwise; k A1 = 2 but k Y = 1 otherwise. These expressions contain the correct dependence in χPT on the light-and strange-quark masses, the lattice spacing, and the recoil parameter w at next-to-leading order (NLO). This result expands on the one in Ref. [87] by adding the missing recoil dependence in the relevant Table 6 Matching factors calculated at one loop in perturbation theory for all ensembles and form factors. The errors shown in the table do not include the systematic errors coming from the approximations employed in the calculation of the matching factors. They are nonetheless included in the chiral-continuum extrapolation. These errors and more details on the calculation are described in Appendix B. The last two columns include the values of α V and m 2c a used in Appendix B to estimate the matching errors. places. The terms introduce nonanalytic dependence on the light and the strange-quark masses through the chiral logarithms logs Y SU (3) . Those terms also include the leading tastebreaking discretization effects from the light-quark sector. The explicit expression of the logarithms for each form factor is given in Appendix A and includes a dependence on the recoil parameter w. The coefficient of the chiral logarithms comes from χPT and it is known, but the current determinations of the coupling g D * Dπ are not very accurate, hence we fit the coupling with a Gaussian prior 0.53 ± 0.08, compatible with experimental data [88][89][90] and lattice-QCD results [91][92][93][94][95][96]. We fix the pion decay constant appearing in the chiral logs in Eq. (50), and elsewhere in the fit function to the three-flavor FLAG 2019 average with the error increased by the estimated 0.7% charm sea-quark contribution f π = 130.2 ± 1.2 MeV [97].
The other terms in Eq. (50) introduce analytic NLO corrections in the light-quark masses through x l = 2B 0 m/(8π 2 f 2 π ) and in the lattice spacing through x a 2 = [a/(4πf π r 2 1 )] 2 , where B 0 is the low-energy-constant (LEC) of χPT that relates the light-and strangequark masses with the meson masses. The value of B 0 for each lattice spacing is the same as in the earlier analysis at zero recoil [22], which uses exactly the same ensembles, and is given in Appendix A. We take into account truncation errors by including the term which describes the dependence on the light-quark masses and the lattice spacing a at next-to-next-toleading order (NNLO), not including logarithmic terms. According to χPT power-counting, these analytical terms are expected to have coefficients of O(1), so we take them as fit parameters with priors 0 ± 1. We don't include analytical terms in the strange-quark mass because we do not have data at different values of m s and nonzero recoil, and our h A1 (1) result using only zero-recoil data agrees within errors with our previous result from Ref. [22]. Also, χPT predicts a much milder dependence on m s than on the light-quark masses. We allow a simple NLO analytical dependence on w to describe the behavior in our small-recoil range through the term where the fit parameters ρ Y and κ Y are related to the slope and curvature of the form factor h Y respectively. We can reasonably expect the slope of the form factors to be roughly 1, but in order to accommodate substantial deviations from this value, we set the prior of ρ Y to 1 ± 2. The priors of κ Y are chosen to be 0 ± 3, and the posteriors are compatible with zero within a fraction of a sigma. The constant term χ Y (Λ χ ) in Eq. (49) is a LEC of the chiral effective theory, and it is suppressed for h A1 by a factor of 1/m 2 c due to Luke's theorem [98], whereas the other form factors receive contributions of order O(1/m c ) in the heavy-quark power counting. The dependence of this LEC on the chiral scale Λ χ cancels against the dependence of the nonanalytical terms in Eq. (50). We set the prior of this LEC to 0 ± 1, except for h A1 where we use 0.0 ± 0.2 to reflect the suppression due to Luke's theorem. The last term in Eq. (49) accounts for the heavy quark discretization errors, corresponding to the form factor h Y , and Λ QCD = 0.6 GeV for normalization purposes. In previous articles, 3 we have employed the universal functions described in Refs. [49,67]. But in those cases there was only one heavy meson. Here we need to deal with the B and the D * , and our data are not accurate enough to distinguish the different terms described by the universal functions. In order to avoid terms that mimic the effect of others, we consider it a better strategy to implement generic discretization error terms, which would account for the same dependence as the universal functions. A side effect of this approach is that the heavy-and NLO light-quark discretization effects become mixed together through terms with the same dependence on the lattice spacing. To avoid this, we drop the O(a 2 ) term from Eq. (53), which has the same dependence as the O(a 2 ) term already present in Eq. (50), and we enlarge the prior of the latter assuming that both corrections are independent, i.e., with a quadrature sum. The priors for the β p Y coefficients are set to 0 ± 1, but the O(a 2 ) coefficients of Eqs. (50) and (53) have different normalizations. For this reason, the final prior for c a2,Y becomes 0.0 ± 6.1. This approach does not allow us to distinguish cleanly the origin of the discretization errors, but it accounts for the correct dependence and size of discretization errors. However, absorbing the O(a 2 ) term from Eq. (53) in Eq. (50) may have further effects, since the correction in Eq. (53) is applied to the chiral-continuum fit function in Eq. (49). It is possible that this procedure does not account for discretization effects in the shapes of the form factors, which would give rise to higher order terms of the form a 2 (w − 1) and a 2 (w − 1) 2 . We test for such effects by performing two alternate chiral-continuum fits. In the first, we add the terms x a 2 (w −1) and x a 2 (w −1) 2 to Eq. (52), where the priors for the coefficients of these terms are chosen as 0(1). We find that the results of this fit differ by at most 0.1σ in their central values from our base fit, while the statistical fit uncertainty is unchanged. In the second variation, we keep the O(a 2 ) term in Eq. (53). In 3 See, for instance, Ref. [64]. this case, the central values are consistent with those of our base fit within 0.25σ, again with an unchanged uncertainty. Hence, we conclude that such discretization effects are already accounted for in our base fit.
Since each ensemble is statistically independent from the others, there are no correlations among them. On the other hand, we keep track of correlations both between different form factors within the same ensemble and within the same form factor calculated at different momenta, by combining the jackknife data of all form factors into a large, block-diagonal dataset. Our large statistics allow us to resolve the full covariance matrix without resorting to thinning procedures or singular-value-decomposition cuts on its eigenvalues. Nonetheless, we use the shrinkage procedure described in Refs. [99][100][101][102] to ensure the small eigenvalues of the covariance matrix have the correct behavior, and we find that our results do not change with respect to the analysis without shrinkage.
The systematic errors coming from the heavy-quark mistuning corrections and those introduced by the matching factors are built into our chiral-continuum extrapolation by constructing the combined covariance matrix, where the first term includes the statistical covariance, and the second and the third ones account for the matching factor and heavy-quark-mass mistuning correction errors respectively. The i, j indices run over all form factors, ensembles, and momenta. With δ (ρ,κ) i we represent either the shift in the i th datum due to a correction the heavy-quark mass (κ) or the propagated error of the form factor from the errors in the matching factors (ρ) as calculated in Eqs. (B.49a)-(B.49d). As a result, the systematic errors introduce new correlations between all data points. In fact, Eq. (54) assumes the worst case scenario that the matching systematic errors and the errors coming from the heavy-quark mistuning are 100% anticorrelated.
The extrapolation results for the four form factors are shown in Fig. 6. As one can see, h A1 , which is protected by Luke's theorem, receives small corrections from 1 at w = 1. The other form factors do not enjoy this privilege, and for them the plots show large corrections from the HQET limit. Figure 6 also shows the result of the previous Fermilab-MILC calculation at zero recoil w = 1 for comparison [22]. The agreement is good, although the errors have increased, mainly due to more conservative choices in this work, which stem from the data at nonzero recoil requiring an extra excited state at source and sink in the ratio calculations, which resulted in larger errors. For consistency, we employed the same approach at zero recoil as well. The deaugmented χ 2 /dof of the chiral-continuum extrapolation is 85.2/95.

Systematic errors
This section provides specific information on our estimates of every source of systematic error in the determination of the form factors h X . Even though only h A1 contributes to the decay amplitude at zero recoil, all form factors are nonzero at w = 1, and their errors need not be suppressed at small recoil. Even if the errors of h V , h A2 , and h A3 become large, however, their contribution to the decay amplitude and, hence, the resulting uncertainty in the decay amplitude is still suppressed at small recoil.
Some general features of the uncertainties in the form factors can be understood via HQET. The form factor h A1 is protected by Luke's theorem [98] and, indeed, we find HQET corrections of a few percent. The form factors h V and h A3 , which are not protected by Luke's theorem, receive HQET corrections at the ∼ 30% level. The form factor h A2 starts in HQET with terms of order α s and 1/m c , which is roughly consistent with our data, h A2 ∼ − 1 2 . Figure 7 shows the error budget for the different form factors in the continuum as a function of the recoil parameter. The relative uncertainty in each form factor follows the same pattern as the HQET corrections: small for h A1 , moderate for h V and h A3 , and large for h A2 . In the last case, the relative uncertainty is large, because the overall value of h A2 is smaller than the others.
Our chiral-continuum extrapolation ansatz to NNLO incorporates errors from statistics, choices in the chiralcontinuum extrapolation, discretization effects, O(am c α s ) matching errors, and heavy-quark parameter mistuning. Thus, they are all entangled in the fit, and it is not straightforward to extract each particular contribution. In addition, our treatment of the heavy-quark discretization errors includes a term identical to one of gluon and light-quark discretization errors. We can, however, roughly estimate each contribution by making modifications to the fit. In this spirit, we define the statistical contribution to the error as the error obtained in a NLO fit without mistuning correction or matchingfactor errors included. We have a specific way to deal with the matching factors, which is explained below. The contribution coming from the chiral-continuum extrapolations is estimated by comparing the fit errors with and without NNLO terms.
There are more contributions to the final error that have been taken into account: light-quark mass mis- Table 7 Error budget for all form factors at w = 1.11. The first row shows the combined error coming from our chiralcontinuum fit, which encompasses the statistical errors, the matching errors, the systematics due to our chiral-continuum extrapolation, errors coming from HQ-mistuning corrections, and discretization errors. The next several rows show estimates of the individual contributions (in parentheses as a reminder that they are contained in the first row). Since the terms that describe the discretization errors come from both the (N)NLO terms in the extrapolation and the HQ discretization terms, there is an overlap between the statistical errors (determined as those of a chiral-continuum extrapolation at NLO without any matching or heavy-quark mistuning errors taken into account) and the discretization errors, and the sum in quadrature of the numbers in parenthesis does not equal the first row. The remaining rows show other contributions, which are added to the first row in quadrature to obtain the total error in the last row. Dashes represent terms so small that were not included in the final computation of the error. tuning, scale setting, isospin effects, and finite-volume effects. The final error is taken to be the quadrature sum of these uncertainties with that of the chiral-continuum extrapolation error, which (again) includes statistical, chiral-continuum extrapolation, discretization, heavy-quark mistuning, and matching errors, as shown in Table 7. In the rest of this section, we discuss each source of uncertainty one by one, explaining how they enter this error budget.

Statistics and stability of the correlator fits
In principle, the determination of masses, energies, and form factors depends on choices made in fitting the twoand three-point correlation functions, but we argue that the associated uncertainties are encompassed in the statistical component of the first line of Table 7. We have analyzed the two-point functions with both 2+2 and 3+3 states in the fit. Only when the two results agree within statistical errors do we select a particular fitting range. In this way, the influence of excited states is reduced below the statistical uncertainty of the B masses and D * energies. For the three-point-function ratios, we find that excited states play a more important role,  as can be seen for the example of x f in Fig. 4. When fitting the three-point functions, we therefore include extra states at the source and sink in order to control this potential source of systematic error.
Bias can arise from the choice of fitting ranges. To avoid the problems that can come from choosing different fitting ranges for different ensembles, we impose the same t Min in physical units for all two-point correlator fits. Our t Max is chosen differently and varies from ensemble to ensemble, but the impact of a different t Max is much smaller, because these points have much larger errors. We refer to the reader to Sec. 3.3.1, where all the details are explained. For the form-factor ratio fits, we employ the same range in physical units for all ensembles and most ratios. For the double ratio R A1 and x f , where the same pattern of states is expected at source and sink, we use a symmetric fit range with t Max = T − t Min . Our fits also take into account all correlations between the data points fitted, and, as pointed out in Sec. 3.2, we find that autocorrelations in our data are negligible.  The two largest contribution come from statistics, in blue, and quark discretization effects, in orange. These two contributions overlap in the brown band in the plot, because of the common term of order a 2 in Eqs. (50) and (53). The remaining contributions do not overlap. Note the differences in vertical scales. as explained in 3.2. We then verify that these p values follow an approximately uniform distribution.

Stability of the chiral-continuum extrapolation
To assess the stability of the chiral-continuum extrapolation, we repeat the fit for several different functional forms. Compared with the base fit, we omit, in turn, the NNLO terms, data at w > 1.10, the coarsest ensemble, the finest ensemble, and heavy-quark discretization terms of order a 3 . The results are very stable under modifications, as shown in Fig. 8. The most dramatic changes occur when we remove the w > 1.10 data, leading to shifts of around one standard deviation in the worst case. Obviously, the large-recoil extrapolation is affected when removing data at larger recoil, but we find that the z expansion, discussed below in Sec. 5.1, stabilizes the final results for |V cb | and R(D * ) in this respect. Table 8 shows that the quality of fit remains good, with χ 2 /dof 1, in all cases.

Discretization errors
The improved action in the ensemble simulations has light-quark and gluon discretization errors of order α s a 2 . Simple power-counting arguments suggest that the discretization errors range from ∼ 0.5% in the finest ensemble to ∼ 10% in the coarsest one. The chiral-continuum extrapolation describes, however, such terms via the lattice-spacing dependence of the chiral logarithms and the analytical terms proportional to a 2 . Moreover, the form factors do not seem to be sensitive to the lattice spacing. Hence, we expect the chiral-continuum extrapolation to take all these errors into account, and no  further systematic uncertainties are added to the final result.
In order to take into account the discretization errors coming from the lattice treatment of the heavy quarks, we include extra terms in the chiral-continuum extrapolation as shown in Eq. (53). These terms are motivated by the HQET description of cutoff effects [50,103], which uses HQET to derive the mismatch between the lattice gauge theory at hand and continuum QCD. The result is a set of functions that depend on the heavy-quark mass, and that can account for discretization effects of different sizes (in our case, order α s a, a 2 and a 3 ).
In our analysis, we would like to introduce these functions for both the B and the D * mesons. Our data do not, however, distinguish between the contributions of the two mesons, so we instead use a single generic term for both mesons. Eq. (53) shows the terms used in the end: α s a, a 2 , and a 3 . Since the a 2 term is already included in the light-quark discretization errors, it would be superfluous to include it again here. The downside of this approach is that it is impossible for us to disentangle light-and heavy-quark discretization errors in the error budget, and as such, we report them together.
One can estimate the size of these individual effects from variations of the chiral-continuum extrapolation with and without the terms in Eq. (53), and also removing the O(a 2 ) term coming from NLO corrections in Eq. (50). Heavy-and light-quark discretization errors turn out to be the largest contribution to the total error in our analysis, and the inclusion of the terms listed in Eq. (53) is key in order to account for the heavy-quark systematic errors.

Matching errors
The matching factors are calculated at one-loop order in perturbation theory. Appendix B explains how we estimate the uncertainties, listed in Eqs. (B.49a)-(B.49d). They are included in the chiral-continuum extrapolation through Eq. (54). We can estimate the error introduced by the uncertainty in the matching up to order am c α s by removing the contribution of the matching factors to Eq. (54). The effect of higher order contributions is estimated by including an overall factor (1 + r h X 2 α 2 s + r h X 3 α 3 s ) multiplying Eq. (49) in the fit and checking the shift in the central value of the form factor. The priors for the r h X 2,3 coefficients are set to 0(1); the posteriors have central values and widths close to those of the priors. We see no impact in including O(α 3 s ) terms, but the O(α 2 s ) contribute to the final error at the subpercent level. We collect all observed differences in the corresponding line in Table 7.

Heavy quark mistuning
The form factors are adjusted for the differences between the simulated masses of the heavy quarks and the physical ones before the chiral-continuum extrapolation. The correction procedure is detailed in Appendix C. The largest correction is about 1σ, but in general the correction is negligible. Equation (54) includes the contribution of the mistuning in the chiral-continuum extrapolation, therefore, we do not need to add any further error. Switching off these corrections gives small variations in the results in our chiral-continuum extrapolation, as shown in Table 7 and Fig. 7.

Light quark mistuning
The endpoint for the light quark masses in the chiralcontinuum extrapolation is set to r 1 m l = 0.003612(126) [104]. We can determine the uncertainty in the form factors coming from a mistuning in the light quark mass by varying r 1 m l within 1σ and monitoring its effect on the form factors. The resulting uncertainty is shown in Table 7 and Fig. 7.

Scale setting
In order to determine the relative lattice spacing, we use the distance scale r 1 /a defined from the force between static quarks [105,106], which has been extensively computed [54]. Absolute scale setting is taken from the chiral f π analysis of the MILC collaboration [64], leading to r 1 = 0.3117 (22) fm. The form factors are dimensionless, so uncertainties from scale setting appear indirectly through the tuning of the heavy-quark masses, the setting of the light-meson masses in the chiral logarithms, and in the approach to the continuum limit. We estimate the systematic error associated with r 1 /a and r 1 by propagating their errors to the final result. We find that the form factors change only slightly when we vary r 1 or r 1 /a by ±1σ, and we include an extra error associated to this variation as shown in Table 7 and Fig. 7.

Isospin effects
The whole calculation of the form factors has been done assuming isospin symmetry. The main effect of isospin breaking is to modify the endpoint of the chiral extrapolation through a change in the pion mass. This effect could bring the endpoint of the extrapolation closer to the Dπ-threshold cusp described by the chiral logs. We estimate the errors introduced by this approximation by varying the endpoint of the extrapolation in the pion mass from m π 0 to m π + , by modifying the value of r 1 m l from 0.003612 to 0.004065. While the pion-mass difference is mainly due to isospin breaking QED effects, here we are using it as proxy for the valence quark mass difference, to which we do not have direct access. Following the resulting difference, we assign an error ranging from 0.0% to 0.5%, depending on the form factor and the value of the recoil parameter, as shown in Fig. 7 and Table 7. This increase in the error has no impact in the final result for the form factors.
As an alternative way of estimating these effects, we have also tried to move the endpoint of the extrapolation to isospin symmetric points with m l = m u and m l = m d . The difference of the values of the form factors between these two endpoints overestimates the isospin breaking errors, because it includes sea-quark effects that cancel out at first order. The estimate of the isospin breaking errors is larger with this method, but it is still negligible. Hence we can safely assume that isospin effects are insignificant at our current level of precision.

Finite-volume effects
To estimate finite-volume effects in our heavy-light χPT description of the form factors, we replace the loop integrals by discrete sums. Following Refs. [107] and [87], we estimate the correction to the integrals in the formulas appearing in B → D * at zero recoil to be smaller than 0.01%. This comes from the fact that the contribution of the chiral logarithms to the form factors is quite small. We have not calculated the corrections at nonzero recoil, and one also expects an increase in the error close to the cusp of the chiral logs. Given that M π L > 4 on most ensembles, and M π L ≥ 3.7 always, there is no reason to expect such a large increase in the error as to make the finite-volume corrections sizable. Hence, we do not assign any additional error due to them.

Determination of |V cb | and R(D * )
After calculating the form factors, we can reconstruct the decay amplitude using Eq. (9) and use experimental data to extract |V cb |. Similarly, the form factors lead directly to R(D * ) via Eqs. (1), (7), and (8). There is a problem: the form factors are obtained only at small values of the recoil parameter, and an extrapolation to large w with the chiral-continuum fit formula would greatly increase the error. To bring the large w behavior under control we use a standard, model-independent parametrization based on unitarity and analyticity to extrapolate the form factors to the large recoil region.
Historically the CLN parametrization [8] has been widely used for this process. However, recent developments have called into question the reliability of CLN fits, given the high accuracy of the latest experiments and calculations [14,16]. Apart from using outdated data to derive the coefficients of the expansions, the main criticism of CLN in its most common usage is the lack of error estimates for theoretical ingredients. Even though the original CLN article [8] includes equations defining the covariance matrix of the slope and the curvature of the reference form factor, the final expressions omit this information. Finally, the strong unitarity constraints, based on heavy-quark symmetry, play an important role in the CLN parametrization. Instead of introducing these constraints as an additional assumption, we would rather perform a fit without imposing them at the outset and then use them after the fits as a consistency check.
To perform the z expansion we therefore use the completely general BGL parametrization [5][6][7]. Nonetheless, we compare our results with an updated version of CLN in Sec. 5.4.2. For a review on the status of the parametrizations for heavy-to-heavy decays, see Ref. [108].

z expansion with the BGL parametrization
The z expansion is based on a conformal map that takes w to a variable z, which remains small over the physical region for the decay process, namely, The value N = 1 is most commonly used, because it fixes the point z = 0 at zero recoil, but a symmetric range has been advocated, with the claim that it reduces errors in the expansion [7,8]. With N = 1, the maximum recoil point for massless leptons w Max ≈ 1.503 becomes z Max ≈ 0.056, so indeed z 1 in any case, and any reasonable expansion in z with coefficients of order 1 converges with a few terms. The conformal map given in Eq. (55) also pushes the branch cut in the w plane onto the unit circle, |z| = 1, and subthreshold poles onto the real axis near z = −1. Since the nearest threshold is very far (|z| ∼ 1) from the valid kinematic range (0 ≤ z 0.056) and higher-energy thresholds even farther, there is no advantage in using an alternative parametrization, such as the one proposed in Ref. [109], that would take care of the behavior at such values of z. The BGL parametrization does not apply directly to the h X form factors, but to combinations with definite spin-parity [108]: where the functions P i (z) are called Blaschke factors, and the φ i are known as outer functions. As discussed below, wise choices of φ i (z) make the coefficients of the expansion a i,j of order 1 and ensure rapid convergence of the series. The Blaschke factors are given by with They include the explicit poles with mass M p below the BD * threshold and with the appropriate quantum numbers. Table 9 shows the poles we use for the BGL form factors. Although some analyses employ four 1 − resonances, the fourth one is very far from z > 0 and its value uncertain. We therefore follow Ref. [15] and use only three. The z-expansion coefficients of the BGL form factors are then defined via where the Blaschke factors' subscripts denote the J P of the lν final state. Setting N = 1, we choose the outer functions to be [5][6][7] where the undefined symbols under the square root are given in Table 10, along with the numerical values we use for M B and M * D . In testing the effect of this pole and various other choices for the pole positions, we find that although the numerical values of the z-expansion coefficients depend on the details, the final curves for the form factors are largely independent of such choices. Table 10 Inputs for the outer functions, taken from Ref. [15] and references therein. The χ parameters are calculated in perturbative QCD through O(α 2 s ), and depend on charm and bottom quark mass inputs, see Ref. [111].

Input
Value With these outer functions, the coefficients of the expansion satisfy the following unitarity constraints [5][6][7], where the symbols reflect the fact that the values for the χ factors in Table 10 are not exact, so the bounds are not precisely known. These constraints provide information from unitarity and analyticity, which can be used together with the output of the chiral-continuum extrapolation.
In this analysis, we do not impose the unitarity constraints, Eq. (71), on the BGL coefficients, but we check that the final results comply with the constraints within errors. We note that uncertainties in the values of χ T,L J P provided in Table 10 complicate strict unitarity constraints on the z expansion coefficients. Still, we find that an implementation of the unitarity constraints using hard cutoffs (following, for instance, Ref. [110]) leaves the fit results essentially unchanged.
There are two kinematic relations between the form factors, one at zero recoil and another at maximum recoil, These constraints follow trivially from the HQET basis of form factors (the h X ), but the BGL parametrization does not automatically impose them. 4 Equation (72) is straightforward to implement for the BGL parametrization with N = 1 in Eq. (55), since it amounts to a relationship between the b 0 and the c 0 coefficients of the expansion, On the other hand, Eq. (73) can be imposed by adding an extra data point that enforces the constraint. Alternatively, we can remove d 0 and write it as a function of the remaining d j and c j . The two approaches give compatible results. In our final value, we choose not to impose the second constraint. The results of the chiral-continuum extrapolation trivially build in both constraints, so we would expect any well-behaved expansion to keep this property, even when extrapolating to the whole recoil range. Indeed, the zero-recoil constraint is satisfied to very high accuracy, even if we do not impose it. This happens because the z expansion is quite constrained by the lattice-QCD values. For the maximum-recoil constraint, we just check for compatibility within errors.

Synthetic data
The output of the chiral-continuum extrapolation is not a set of points, but a set of functions that express the form factors at any value of the recoil parameter. In order to make the results amenable to a BGL fit with experimental data, we use synthetic data, together with their covariance, based on selected central values of the chiral-continuum fit, evaluated at zero lattice spacing and physical quark masses. The selected values with correlations are included in the ancillary files, as explained in Appendix D. We choose data points at three w values, {1.03, 1.10, 1.17}, as representative of the span of our lattice-QCD data, but we have checked that varying these values does not change significantly the curves generated for the form factors from the z expansion, as long as there are no w values too close to w = 1. This robust behavior is not surprising, since the full covariance matrix is available, and hence the same amount of information is provided. The covariance matrix of the lattice data is well defined for all recoil values, but at w = 1 the kinematic constraint given in Eq. (72) is exactly satisfied. Therefore, the form factors are no longer independent as w → 1, and the covariance matrix becomes singular. The singularity can be avoided by choosing any value slightly larger than w = 1. Also, we cannot push w much higher than 1.17 without going outside the region where lattice-QCD data are available, because the uncertainty grows rapidly. We have selected three points per form factor because in the continuum limit there are only twelve free independent functions in our chiral-continuum extrapolation, three per form factor. Adding more points does not increase  We first carry out BGL fits to our lattice-QCD form factors. The constant and linear coefficients are well determined by the data, and no prior constraints are used for them. The quadratic and cubic coefficents are constrained with priors 0(1) to stabilize the fit. Keeping terms up to quadratic (linear) order in z and imposing the kinematic relation given in Eq. (72), leaves just one (three) degree(s) of freedom. Since unitarity constrains the size of the coefficients, we can include cubic terms with the unitary-inspired priors to check stability of the results against truncation effects. Table 11 gathers the coefficients for these three versions of the z expansion, with the χ 2 /dof and unitarity sums. All fits satisfy the unitarity constraints within errors. The unitarity sums are computed by taking the median of the distributions obtained from the Gaussian posteriors, along with the confidence levels, ±34.1%, for the uncertainties. We note that the distributions of the squares are not Gaussian when the errors on the posteriors are large. This is the case for coefficients that are not well determined by the underlying lattice data. The kinematic constraint at z = 0, Eq. (72), is satisfied to very high accuracy, even when it is not imposed. On the other hand, the constraint at z Max , Eq. (73), is satisfied only within approximately one standard deviation, unless, of course, it is imposed.
The cubic coefficients (a 3 , b 3 , c 3 , d 3 ) shown in Table 11 are not well determined by our lattice data, resulting in unitarity sums, with central values > 1, while still consistent with unitarity within error. However, the cubic fit provides useful information on truncation effects. We see that the lower-order coefficients in the cubic fit are in very good agreement with the coefficients in the quadratic fit. Further, we find, that the cubic coefficients have little effect on the decay rate and the form factors, as expected, since |z| 1. In contrast, and as Table 11 shows, the linear expansion leads to coefficients with nearly the same central values as in the quadratic and cubic fits, but with slightly smaller errors, suggesting an underestimation of the truncation error. We conclude that the errors coming from the truncation of the series in Eqs.(63)-(66) are already included in the uncertainties on the coefficients of the quadratic fit, which we choose for the main result of the z expansion. The full correlation matrix of the quadratic fit, which can be used to reconstruct the output of the z expansion, is included in the ancillary files in binary format, as explained in Appendix D. Form factors from this information can be used in phenomenology with no assumption about the presence of new physics. Below we discuss z fits incorporating shape information from experiment, which are more precise but possibly contaminated by new physics in the light semileptonic channel.
Experimentalists [18,19] usually adjust the order of the z expansion to allow unconstrained fits to the form factors without violating unitarity. Following this criterion we find that we can remove the a 2 and the d 2 coefficients, and obtain the same fit results as in our quadratic fit without including any 0(1) priors for the higher order coefficients. This ensures that our utilization of priors for some coefficients indeed does not influence the fit results, and that their only function is to stabilize the fit.

Functional method
A functional method can also be used to fit the result of the chiral-continuum extrapolation to the BGL parametrization [39]. The method exploits the fact that the chiral-continuum fit functions are linear in the fit parameters. The covariance in the fit parameters is then easily converted to the covariance of the values of the resulting fitted form factors (at zero lattice spacing and physical quark masses) at any pair of recoil parameters (w, w ). Through the BGL parametrization, this covariance is then converted to a covariance in the form factors values at any z pair, (z, z ). The method has the esthetic property that information from the best fit continuum form factors is spread over the entire physical region in z, rather than at a few arbitrarily chosen discrete points.
We have compared form factors from the functional approach with those from the synthetic data. They show no discernible difference in the form factors, implying that the systematic errors associated with the choice of synthetic data from the chiral-continuum extrapolation are very small. Since the functional fits do not provide any new insight, and they make it difficult to combine data from several sources, we focus on the syntheticdata results in the rest of the paper.

Determination of |V cb |
The lattice-QCD form factors can be used in conjunction with experimental data to perform a joint fit to the BGL parametrization, with an additional fit parameter for the relative normalization, which is nothing but |V cb |. In these fits, the low-recoil behavior is determined by lattice QCD, and the large-recoil behavior by experiment. As experimental input, we use the 2018 raw dataset from Belle [18] and the synthetic data generated from the 2019 BaBar analysis [19]. These data are combined with the lattice-QCD synthetic data. We do not use Belle's 2017 tagged dataset [13], because it is still unpublished.
Experiments extract the fully differential decay rate, not only with respect to the recoil parameter, but also to all angular variables in the decay chain B → D * ν, D * → Dπ [14,18,112], where B(D * → Dπ) is the branching fraction of the daughter D * decay; further, θ v , θ , and χ are the polar angle of the D in the D * rest frame, the polar angle of the charged lepton in the rest frame of the virtual W meson, and the angle between the ν and Dπ planes, respectively. As with Eq. (7), for neutral B 0 decays the right-hand side of Eq. (75) should have an additional factor (1 + απ) for the Coulomb attraction in the final state (see for example, Refs. [113,114]). Other electromagnetic corrections are expected to be smaller, and we will neglect them, keeping only |η EW | 2 and the Coulomb factor. 5 Following our previous estimates of EM effects [22,53], as well as the HFLAV procedure [1,115], we use η EW = 1.0066(50) in our calculation. Belle marginalizes on one variable at a time, integrating (binning) the rest. BaBar's method consists of a full, four-dimensional analysis without integrating over any variable. The collaboration claims such an analysis is needed to achieve correct results [19]. Nonetheless, both the Belle and BaBar collaborations give com- 5 Both the Belle and BaBar experiments use the PHOTOS package to account for low-energy EM radiation; to the best of our knowledge, the EM interactions between charged particles in the final state, described by the Coulomb factor, are not included in PHOTOS.
For Belle, we integrate Eq. (75) in the required bins, and the BGL expressions are introduced in the integrated results. Also, we multiply the right-hand side of Eq. (75) by the Coulomb factor (1 + απ), because these data are for neutral B mesons only. We perform a combined fit to both the electron and muon modes, instead of averaging them. 6 For BaBar, we fit the lattice-QCD synthetic data with those in Ref. [19]. BaBar publishes results for a BGL fit to their data that includes both neutral and charged B-meson decays. According to Ref. [117], 35.1% of the decays in this data set correspond to B 0 decays, and the remaining 64.9% correspond to B ± decays. These fractions imply that a Coulomb factor of (1 + 0.351απ) should be applied to the B 0 + B ± BaBar dataset.
BaBar uses the previous Fermilab-MILC result of h A1 at zero recoil to extract |V cb | [19,22]. In order to minimize the influence of the lattice-QCD results for h A1 (1) from Ref. [22] in the joint fit, we create synthetic data from Ref. [19] for |η EW | 2 |V cb | 2 |F(w)| 2 at five recoil values away from w = 1, as shown in Fig. 9. Since the BaBar collaboration employs a linear fit for all form factors, we exhaust the number of degrees of freedom with just five points. Each one of these synthetic data points has a larger impact than each single data point from Belle's untagged dataset, because the Babar synthetic data inherits its precision from the precision of the underlying, full dataset. In the end, the green band for Belle in Fig. 9 is noticeably narrower than BaBar's, so it is expected that the Belle untagged dataset has a larger impact in the final results for |V cb | and R(D * ).
The kinematic constraint in Eq. (72) is included in these fits, and although there are no direct experimental measurements that determine F 2 , experiments also have an impact on the d j coefficients through the correlations between this and other form factors. Our preferred fits, coming from quadratic z expansions, are shown in Fig. 9 and Table 12. As in the lattice-QCDonly fits, the fits in Table 12 include 0(1) priors for the quadratic and higher (if applicable) coefficients of each form factor, while leaving the rest of the coefficients unconstrained. Fig. 9 shows that the mean of the lattice estimate falls below the experimental curves, but the errors are large enough to make the difference remain at ≈ 2σ. The correlations between the different lattice, synthetic data points determine very precisely the slope of the decay amplitude, forcing it to be noticeably larger than what we obtain from our fits to experimental data. The full correlation matrix is provided in the ancillary files, as described in Appendix D. Form factors from this information can be used in phenomenology, under the assumption that only the τ couples to new physics.
Our final result for |V cb | is obtained from the quadratic BGL fit to the lattice-QCD form factors and both experimental datasets (see the column labeled "Lattice+both" in Table 12), which yields and a χ 2 /dof = 126/84. This relatively large χ 2 /dof indicates tensions among the datasets: a combined fit of Belle and BaBar data, using lattice-QCD input only for normalization results in a large χ 2 /dof of 104/76. It is therefore to be expected that the combined fit would result in a similarly large χ 2 /dof. Further, we note that our fit to the lattice-QCD form factors only has χ 2 /dof < 1, as shown in Table 12, which also lists the results of joint fits of lattice-QCD form factors with each experimental dataset separately, as well as with the combined Belle and BaBar data. We find that all joint lattice-QCD with experimental data fits have χ 2 /dof > 1, including the one leading to Eq. (76), but the central values of |V cb | do not differ more than approximately one standard deviation among these fits, and the sizes of the errors are similar. We also see a general agreement in the coefficients of the expansion, particularly in the important low-order ones. Because most previous inclusive and exclusive determinations of |V cb | omit the Coulomb factor, we also perform the BGL fits without it; the results are collected in Table 13. Compared to the results for |V cb | in Table 12, the central values are shifted by the respective Coulomb factors. They are consistent with previous exclusive determinations, for example |V cb | excl = (39.9 ± 0.9) × 10 −3 from the PDG [2]. The long-standing tension with inclusive determinations thus remains: |V cb | incl = (42.2 ± 0.8) × 10 −3 [2].
Since the Belle data are binned in different variables, there is a normalization constraint between the different bins, assuming that they contain the same underlying data. Then only 37 of the 40 bins are truly independent for each mode [116], because the sum of all bins for a particular variable should give the same total number of events. Such constraints should be reflected as zero eigenmodes, or -with rounding errors-very small eigenvalues in the 40 × 40 statistical correlation matrices. The correlation matrices provided in Ref. [18] are constructed using Monte Carlo simulations, and do not resolve these constraints due to the underlying approximations. We therefore investigate the effect of removing the last bin on each one of the angular variables data and reconstructing its value from the total normalization. We find that this procedure correctly introduces the anticipated constraints between the bins, while the values of the reconstructed last bins are compatible with those given Ref. [18]. Hence, the expected zero eigenvalues in the statistical correlation matrices are recovered. With this procedure our combined Belle + lattice-QCD BGL fit does not yield any significant changes in the final values and uncertainties for |V cb | and R(D * ), but we observe a substantial decrease in χ 2 /dof from 111/79 to 96/73. In the case of our joint fit, which includes Belle, BaBar and lattice-QCD data, the χ 2 /dof decreases from 126/84 to 109/78. Nevertheless, the results for |V cb | and R(D * ) quoted in this work use the Belle data and correlation matrices as given in Ref. [18].
The BGL fit to the BaBar data [19] includes fewer coefficients than our BGL fit to the lattice-QCD form factors. We test for the presence of truncation errors by performing BGL fits to the BaBar data including higher coefficients with priors of 0.0(5). This increases the errors in the BaBar data points, most likely because the extra coefficients are completely uncorrelated with the rest of the BaBar data. Because the joint fit to all data

Joint Fit
Lattice×V cb Belle untagged, e − Belle untagged, µ − BaBar synthetic Fig. 9 Results for separate fits to each dataset (left) and joint fit of all data (right). On the left we compare the BaBar result (gray), the Belle result from the untagged dataset (green), and the lattice-QCD result coming from our synthetic data (red).
To allow for an straightforward comparison of lattice and experimental data, the data points and bands have been normalized with the central value of |V cb | as obtained in our joint fit, and taking into account the Coulomb factor corresponding to each case. All results agree within ≈ 2σ over the whole kinematic range. There is tension between the slope predicted by the lattice calculation and that of the experimental data. Since the lattice-QCD slope is well determined, correlations in the joint fit cause the central lattice-QCD values to fall slightly below the experimental values.
is currently dominated by the Belle and lattice-QCD data, the addition of extra coefficients in the BaBar expansion does not change our final results for |V cb | and R(D * ) in a meaningful way. Hence, for the our final results quoted in this work, the synthetic data points from Ref. [19] are generated without adding extra coefficients.
In the Belle and BaBar analyses the number of coefficients in the BGL z expansion is limited to exclude those that cannot be properly determined by their data, and thus avoiding apparent unitarity violations. This procedure, however, does not account for possible truncation errors. Repeating the fits in Table 12 without the c 3 and d 2 coefficients yields similar results with reduced errors and much smaller sums for the unitarity constraints.

Determination of R(D * )
From the fit results in Table 12 we can calculate R(D * ) through direct integration of the differential decay rate over the whole kinematic range. In Fig. 10, we show the differential decay rate as a function of the recoil parameter extracted using lattice-only data (red and brown curves), compared with that of our joint fit. The curves below (maroon and blue) show the differential decay rate for the τ case. Our final result for R(D * ) from our purely lattice-QCD calculation is If we assume that new physics effects are visible only at large lepton masses (i.e., the τ ), we can use our joint fit of the lattice and light-lepton experimental data to obtain a more precise SM value of R(D * ). We note that in our joint fit, the curve corresponding to light leptons is determined mainly from experiment, and the one corresponding to the τ comes mainly from the lattice data.
In that case, we obtain where the Coulomb factor is included. Its removal does not change significantly neither the central value nor the error. We emphasize, however, that Eq. (77) is the SM prediction, relying only on lattice QCD, while Eq. (78) is also based on the shape information coming from experimental data. In any case, the correlated difference between the two results is 1.3σ. Our values also agree with previous theoretical determinations [20,21,[118][119][120]. We note that more recent experimental measurements have found R(D * ) to be consistently smaller than before, hence reducing the tension between theory and experiment [1]. The current status of the R(D)-R(D * ) determinations is summarized in Fig. 11. Table 12 Quadratic z expansion results. The second column shows results from a fit only to synthetic lattice-QCD data (the same as the "quadratic" column in Table 11), the third, from a joint fit to lattice QCD plus BaBar's synthetic data, the fourth, from lattice QCD plus Belle's untagged dataset, the fifth, lattice QCD plus both experiments, and the last, a combined fit of all experimental data using the value of h A1 (1) extracted from our chiral-continuum extrapolation as normalization. The coefficient c 0 is fixed by the constraint given in Eq. (72), and it is shown for convenience.

Imposing the constraint at maximum recoil
As we explained above, our preferred analysis does not impose the kinematic constraint in Eq. (73), is trivially satisfied in the HQET basis of form factors (the h X ) used in our chiral-continuum extrapolation. However, the BGL expansion does not naturally incorporate it. Maximum recoil is far from the region where lattice data are available, and there are no experimental data available for this decay with a heavy lepton = τ . Thus, to the extent that the BGL expansion does not match the HQET-basis form factors precisely, we expect small deviations from Eq. (73) in the BGL fit. Such deviations are tolerable because small violations of the constraint do not have any physical consequences, as long as they are within errors. Figure 10 shows that our fits, nonetheless, satisfy the maximum-recoil constraint to within approximately 1σ.
Imposing the constraint in the fit model, we find new values for |V cb | = 38.36(78) × 10 −3 and R(D * ) = 0.274 (10), which are compatible with the values obtained in our preferred analysis. It is not surprising that the constraint does not alter the value of |V cb |. After all, the CKM matrix element is extracted mainly from the behavior of the form factors at small recoil and does not entail the form factor F 2 . The error on R(D * ), on the other hand, is slightly reduced by the constraint.
A comparison of the BGL coefficients of the constrained analysis with those of our preferred one is shown in Table 14. We do not find significant changes, and the coefficients in both analyses are compatible with each other within 1σ, although the differences increase with the order of the coefficients. This behavior is expected, as the low-order coefficients are well determined by the data, and the higher-order coefficients become more relevant at maximum recoil. The new information does not improve the quality of the fit in a significant way. This is particularly clear in the pure lattice fit, where the χ 2  (14) calculated from lattice-QCD data [53] (without correlations, which are not available but could be important). The green point shows R(D * ) from the joint fit yielding |V cb | with R(D) = 0.299(3) from HFLAV [1], which is similarly based on a joint fit to lattice-QCD and experimental data.
almost doubles, but the number of degrees of freedom increases just from three in the unconstrained fit to four in the constrained one. It appears that the constraint introduces small tensions with the BGL expansion. Because of this, and since both the unconstrained and the constrained fit give compatible results with only small differences, we choose the unconstrained fit as our preferred result.

The z expansion with an improved CLN parametrization
For the sake of completeness, we offer an alternative analysis, replacing the BGL parametrization with CLN.
In the CLN parametrization, the form factor h A1 is expressed as a polynomial in z, and the other form factors appear as ratios with respect to h A1 : We include a few improvements to address the weak points of CLN: first we extract the full covariance ma- Table 14 Comparison of results of the z expansion fits with and without the kinematic constraint given in Eq. (73). The largest differences appear in the higher-order coefficients. The coefficient c 0 is fixed by the constraint given in Eq. (72), and it is shown for convenience. The Coulomb factors are included in the fits with experimental data input.  (12) trix relating the parameters ρ A1 and c A1 from the original article [8] using the data given for the one-sigma ellipsoids. With the full covariance matrix, we can account for the strong correlations between ρ 2 A1 , c A1 and d A1 , which allows for small variations of the fixed relations often used in CLN fits. Second, we use updated results for the expansions in w − 1 of the ratios R j . The form factors to be fit are where to fit R 0 to experimental data requires measurements of the τ final state. The full correlation matrix of ρ 2 A1 and c A1 is given in Table 15, and we follow Ref. [8] to calculate d A1 . In the following, we refer to the CLN parametrization as "improved" when using Eqs. (82)- (85) and "base" when using the coefficients of the original paper [8].
Fitting our lattice data with the improved CLN parametrization yields a result for R(D * ) that is com- In light of these issues, the BGL parametrization provides a much superior z expansion, so we have chosen it in our main analysis. 7 One can wonder why the improved CLN fit performs so poorly. One possible point of tension between our data and the improved CLN ansatz is the relationship between the slope, the curvature and the cubic coeffi- Contour plot of the slope (ρ 2 A1 ) versus curvature (c A1 ) of the h A1 form factor in the CLN and BGL cases. We show the ellipsoids corresponding to the CLN priors extracted from [8] and the results of the improved CLN and BGL fits at one and two sigmas. There is good agreement in all cases, and a numerical test shows that the differences in ρ 2 A1 , c A1 and d A1 between the fit results of the improved CLN parametrization and the BGL parametrization is close to 1σ. cient in h A1 , which is much more constrained than in the BGL parametrization. We compare our fit results to only lattice data for both parametrizations by calculating a Taylor expansion around z = 0 up to cubic order of our BGL result for h A1 ∝ f , and we present in Fig. 12 contour plots of the CLN priors, the improved CLN fit result, and the BGL fit result. Our BGL results for ρ 2 A1 , c A1 and d A1 are compatible with our improved CLN results within one sigma, suggesting that the tensions with the improved CLN parametrization come from the other form factors. Figures 13 and 14 show the results for h A1 and R 0,1,2 in the BGL and the improved CLN cases, along with the lattice data used in the fit. In general, the BGL fit shows a much better agreement with our lattice data, and the ratios R 0,2 are poorly fitted with the improved CLN ansatz. Also, the improved CLN fit violates the constraint given by Eq. (73). Imposing the constraint does not improve the fit quality, or reduce the tensions in the R 0,2 ratios. Following Ref. [108], we advocate either a revision or a deprecation of the CLN parametrization in favor of a truly model-independent parametrization, such as BGL. Even if we had found a good quality of fit with CLN, we would still prefer the BGL results on theo- With the latter we are able to verify ex post facto the reliability of these constraints. Other limitations of the base CLN, such as an update of its inputs, and a more careful treatment of the errors and correlations of the CLN coefficients, have been addressed in a few recent papers [119][120][121]. The HQET parametrization discussed in those works also include corrections of order 1/m 2 c . We refer the reader to those papers for further discussion.

Comparison with LCSR
We can also test the validity of the light cone sum rules (LCSR), often employed to constrain the form factors at maximum recoil. To this end, we take the latest results from Ref. [122]. They present results for the form factors in a different notation. For the reader's convenience, we provide the conversion formulae: A 2 (w Max ) = 0.51 (9), Our lattice-QCD only results for the aforementioned form factors are: A 2 (w Max ) =0.71 (11).
The agreement is excellent for A 1 and V , and the A 2 form factor also agrees within 1.4σ.

Discussion and outlook
Using the first unquenched lattice-QCD calculation of the form factors describing the decay B → D * ν at nonzero recoil, together with 2018 Belle [18] and 2019 BaBar [19] measurements, we obtain the following results for the CKM matrix element |V cb |: which includes the Coulomb correction for neutral Bmeson decays. Omitting this correction leads to |V cb | = (38.74 ± 0.78) × 10 −3 , which is the result that should be compared with inclusive decays. The quoted uncertainty is from experiment and theory together. As discussed in Sec. 5.2, this result comes from a fit exhibiting tension among the datasets, χ 2 /dof = 126/84. The tension is motivation for better experimental measurements and lattice-QCD calculations, as a ∼ 2σ effect it is inconclusive. In order to disentangle the contributions from experiment and theory, we run a BGL fit as described in Sec. 5.1, but assuming very small errors on the synthetic lattice-QCD data, and then we run analogous fits assuming very small experimental errors. By looking at how the final error changes, we estimate each contribution, finding 0.34×10 −3 from experiment, 0.67×10 −3 from lattice QCD, 0.10×10 −3 from the truncation of the BGL expansion, and 0.18 × 10 −13 due to EW+EM effects. These partial values are an approximation to guide how much improvement we can expect from future calculations and experiments. The final error in Eq. (92) is an accurate estimate of the full uncertainty.
Belle II is expected to deliver experimental data for this decay, although until there is a better understanding of its detector performance and the systematics of the experiment, it is not clear how much improvement on the error for |V cb | could come from these high statistics data [123]. An improved result can be obtained from this or any other forthcoming data, in combination with the synthetic data points and covariance matrix that fully describe the output of our chiral-continuum extrapolation, given in the ancillary files as described in Appendix D.
The main result of this article is the behavior of the form factors parametrizing semileptonic B → D * ν decays at small, but nonzero recoil. It allows us to perform a much more robust analysis of |V cb | than when using just the value of the decay amplitude at zero recoil. We find excellent agreement between our current and previous results, both from B → D * ν at zero recoil [22] and B → D ν at nonzero recoil [53]. Our result also agrees with other recent B → D ( * ) ν exclusive determinations [32,124]. It is also compatible with the recent extraction based on B s → D ( * ) s ν decays measured by LHCb [125], and with form factors recently calculated by HPQCD [23,126], but with much smaller errors.
Our result for |V cb | does not change the current status of the inclusive-exclusive puzzle. The inclusive determination was recently updated in [127], slightly reducing the uncertainty, and preliminary results from Belle based on a data-driven approach [128], and compatible with the current inclusive world average [1], have been presented [129]. On the other hand, it has been argued that it is very challenging for BSM physics to accommodate such a tension [3,4]. A further reduction in errors is necessary to extract conclusions. Inclusive calculations of |V cb | might also benefit in the longer term from recent ideas in lattice QCD [130,131], which might lead to calculations with very different systematics.
During the implementation of the joint experimentallattice fits we found issues in both experimental datasets used: the BaBar collaboration does not provide unfolded data, and their z expansion uses only five coefficients to describe three form factors, as opposed to the nine coefficients we use for the same form factors in our lattice data only fit. As a result, we are concerned that the truncation errors in the z expansion could be underestimated in the synthetic data generated from BaBar fits. However, the |V cb | and R(D * ) fits are dominated by the Belle and lattice-QCD data, and the effect of including BaBar data is just a small reduction in the total error in Eq. (92). The Belle collaboration provides unfolded data, but as pointed out in Ref. [116], the statistical correlation matrices given in Ref. [18] seem inconsistent. We also checked that this potential problem has no significant impact on our results for either |V cb | or R(D * ) (with form factors from the |V cb | fit). Thus, an improvement in the presentation of the data from both collaborations would be very welcome.
Another benefit of the knowledge of all form factors at nonzero recoil is the possibility of calculating R(D * ) from first principles. Our result reaches a similar precision to that of the B → D ν analysis for R(D). Even though our calculation of R(D * ) involves integrals of extrapolated quantities with large errors, the combined error is relatively small due to the large correlations between B(B → D * τ ν) and B(B → D * ν). In this case, the form factor that enter only in the determination of the decay to a τ was computed with lattice input only. We have also calculated a more precise value using form factors from the |V cb | fit, R(D * ) Lat+Exp = 0.2484 (13). Our preferred SM value is the one given in Eq. (93), which comes exclusively from lattice QCD and avoids any experimental decay-rate input.
The result in Eq. (93) confirms previous theoretical estimates of R(D * ), as well as the current tension between the SM and experiment in the R(D)-R(D * ) plane. Recent experimental determinations of R(D * ) tend to reduce the tension, however. In fact, before Belle published results from their untagged dataset [29], the tension was as large as 4σ, but the newest analysis has reduced it to 3σ, and remaining tensions come mainly because of the influence of the BaBar R(D) result [132]. An updated measurement of R(D) could cast some light on the current tensions. Also, future high-precision experimental measurements from Belle II and LHCb are bound to become critical to determine whether these quantities will agree with the SM in the end.
Together with more precise experimental measurements, lattice-QCD form factors with a smaller uncertainty will also be crucial for shedding light to this theory-experiment tension, as well as to the exclusiveinclusive tension in the determination of |V cb |. We expect to reduce the uncertainties in the form factors at nonzero recoil in future work, the first of which is already in progress at the time of this publication. The main sources of uncertainty in this work come from statistics and the quark discretization errors. An improvement in this area will require a modification in both the light and the heavy-quark actions to allow for smaller systematics. Chiral-continuum-extrapolation errors can be reduced by using a better discretization for light quarks and physical pion masses. Another area where we can reduce errors is the renormalization, by using a nonperturbative calculation of the renormalization factors. Further validation of our lattice-QCD result will come with independent analyses currently in progress by other lattice-QCD collaborations [133]. Similar improvements, with an expected reduction of errors, can be applied to our calculation of B → D ν form factors at nonzero recoil [53]. A correlated analysis of both decays will allow a correlated determination of R(D) and R(D * ) that could provide tighter theoretical constraints.
In this work, we also assess the impact of using an improved CLN parametrization to describe the shape of the form factors. Instead of using the fixed coefficients published in Ref. [8], we employ the full covariance matrix that relates the slope and curvature coefficients of the reference form factor using data from the original CLN paper [8], and pass this information as priors to a CLN fit. We also compute the errors in the cubic coefficient, and update the values of the ratios with respect to the reference form factor with values coming from one of the latest HQET calculations [15], assuming a 20% error on each coefficient. Our updated CLN parametrization gives a very similar central value and error bar, compared with that of the BGL parametrization, but the quality of fit decreases greatly when the lattice-QCD data are included. CLN is very restrictive with the shape of certain form factors, and because the lattice-QCD data have relatively small errors, they introduce serious constraints in the parametrization. Our findings reinforce the current consensus of the community [108] to abandon CLN in favor of the more flexible and rigorous BGL parametrization. The impact of using improved HQE parametrizations, such as the one in Refs. [119][120][121], should be nevertheless investigated. form factors for the sake of completeness, In the expression above Y = A 1,2,3 , and for the vector form factor logs V SU(3) = logs A1 SU (3) . The index Θ goes over all pseudoscalar meson fields of the effective theory, whereas Ξ labels vector and axial counterparts. Explicit expressions for the mass of a pseudoscalar meson P with taste Ξ, m P Ξ , in terms of the parameters of the rooted staggered theory can be found in Ref. [135]. The hairpin parameters, which come from χPT disconnected diagrams, are marked as δ V,A . Both vector and axial hairpin parameters are defined for the lightest coarse ensemble a ≈ 0.12fm to be δ A = −0.28 (6) and δ V = 0.00 (7), and their value is obtained for other ensembles by rescaling this number, assuming the hairpin parameter scales as the root mean square of the taste splittings. The taste splittings along with the tree-level LEC B 0 depend solely on the lattice spacing and are given in Table 16. The values for the hairpin parameters, as well as those quoted in Table 16, were determined from fits to light-quark quantities by the MILC collaboration in Ref. [54]. Our results are insensitive to the errors in the parameters quoted in the table, as well as the errors in the hairpin parameters, since the chiral logs are a very small contribution to our fit function in the region where we have data.
The functions appearing in Eq. (A.1) are given bȳ where ∆ (c) is the D-D * mass splitting that we take from the PDG [2], assuming a charged meson. The error from the PDG uncertainty, even if enlarged to cover a neutral-meson mass difference, also has a negligible impact in the final fit. The F Y functions are defined as The functions E j (w) and G j (w) are Here r(w) is The F j functions in general cannot be expressed in closed form, and are defined as integrals, where a = x cos θ √ 1 + w sin 2θ . (A.14) Appendix B: Estimation of the matching factors and their errors In this Appendix, we derive the renormalization factors for the vector and axial-vector currents using heavyquark effective theory (HQET) as an intermediary between lattice gauge theory and continuum QCD [50,103]. We have calculated the renormalization factors at one loop in perturbation theory, sometimes using an expedient approximation described below. These one-loop results are shown in Table 6. Here, we use the notation
= means "has the same matrix elements as". The currents on the left-hand side are lattice operators, while the bilinears on the right-hand side are HQET operators (e.g., built from fields satisfying v /b v = ib v ). Similarly, HQET can be used to describe continuum-QCD currents.
The only difference between the lattice and the continuum currents lies in the short-distance coefficients; for Wilson-like fermions, as used here, LGT J Eqs. (B.15)-(B.18) are valid at zeroth order in 1/m Q , Q = c, b, which suffices here. 8 To continue with the analysis, it is advantageous to write explicitly the velocity and polarization vectors of 8 Matching at the next order in 1/m Q is possible in principle but cumbersome beyond the tree level. our mesons. Assuming the B meson to be at rest, and the D * to have momentum p along the z axis: where ± are two-component unit vectors in the xy plane. For momenta in other directions, such as v ∝ (1, 1, 0) or (1, −1, 1), one can rotate the spatial components accordingly. Note that we pickv ⊥ , and then we deduce ± . In this notation, v · v = −1, v · v = −1, and v · v = −w. 9 The polarization vectors satisfy¯ m · n = g mn for (m, n) ∈ {+, −, 3, s} with g mn = diag(1, 1, 1, −1). The bar on a polarization vector means to complexconjugate the spatial components, which arises if (as usual) ± corresponds to D * helicity ±1. (We use linear polarizations with real but keep the ± notation.) The polarization vectors satisfy v · ±,3 = 0, and also v· ± = 0.
These vectors can be used to isolate parts of the currents with different matching factors: and similarly for V (without the superscript " LGT") and for A and A withc v →c v γ 5 . In HQET, the matrix elements can be worked out with the "trace formalism" [43]: where ξ(w) is a form factor known as the "Isgur-Wise function", and These conventions also hold in Minkowski space when using the metric g µν = diag(−1, 1, 1, 1) [103]. NB: in Minkowski space, µ ∈ {0, 1, 2, 3}; in Euclidean space, g µν = δ µν and µ ∈ {1, 2, 3, 4}; x 4 = ix 0 . and similarly forM D and M D * . Then With these results, physical combinations of the decompositions in Eqs. (B.15)-(B.18) can be more easily tracked. Note that the sign convention for ε cancels in ratios introduced below.
These expressions can be used to relate matrix elements of continuum and LGT currents to each other, via HQET and Eqs. (B.15)-(B.18). Thus, Z V V and Z A A have the same matrix elements as V and A if one chooses matching factors such that From these requirements one finds In anticipation of one-loop perturbative calculations, it is convenient to definē ρ J (w) =Z J (w) Z  which require, respectively, the matching factors Note that the form factor A 1 (q 2 ) ∝ h A1 (w), defined in Eq. (87), comes directly from R A1 Q A1 (w), and V (q 2 ) ∝ h V (w), defined in Eq. (86), comes directly from R A1 Q A1 (w)X V (w), while the helicity amplitudes H 0 and H s are linear combinations of R A1 Q A1 (w)X 0 (w) and R A1 Q A1 (w)X 1 (w). The helicity amplitudes H ± come from A 1 and V , but it is presumably more convenient (or just as convenient) to keep the axial and vector parts separate.
Using the trace formalism, it is easy to show Taking the ratio, the form factor drops out: LGT V ⊥ (w) + (w + 1)C LGT , but a matching factor remains. 10 The analogous equations for V shows that f 1 (w) =C V (w)ξ(w) andC V (w) = C V ⊥ (w) + (w + 1)C V v (w).

Appendix B.3: Expedient Approximation
Calculating the full w dependence of theZ J at the one-loop level is, in general, very cumbersome. From Ref. [50], however, we have LGT Then we can approximate, where ρ [1] max = 0.352 is the largest one-loop coefficient that we find among the computable one-loop coefficients. For lack of a better choice, we set q * = 2/a, as in other papers.
In the limit m c a → 0, the matching factor in the velocity tends to 1. We could use an uncertainty like that in Eq. (B.49c), but that seems to be an unecessary complication. A little algebra shows that the mismatch in w is very small: w ≈ w LGT ± α V (q * )ρ [1] max (w 2 − 1)m 2c a, (B.50) while the mismatch in z is (B.51) Table 17 Simulation values κ c and κ b employed in the calculation of the form factors compared with their more precisely tuned values κc and κ b [22]. The first error in κ b and κc includes statistical and fitting contributions; the second one comes from scale setting.  (16)(7) 0.3117 (22) fm [64], we tune the heavy-quark masses so that the kinetic masses M 2 of the B s and D s mesons agree with their experimental values [64,70]. Historically, the tuning was done in two stages. A preliminary, lower-statistics study set the heavy-quark masses used in the simulation to generate the two-and three-point correlators in the present study. A subsequent, higherstatistics study was then possible, and it gave slightly different kappa values, as shown in Table 17. We denote the simulation values with κ c and κ b and the refined values with κ c and κ b . As in Ref. [22], we proceed to correct our results for the slight mistuning and, in the process, estimate the systematic error associated with the correction.
The correction procedure is based on a single a ≈ 0.12 fm ensemble. A full set of two-and three-point correlation functions were calculated at slightly shifted values of κ b and κ c , as shown in Table 18. The shifted values were chosen close to both the corrected and simulation kappa values. With these data, we then estimate the derivatives of the form factors and the recoil parameter with respect to the heavy quark masses. By expressing these results in dimensionless terms, we can extrapolate them to other ensembles with different light-quark masses and lattice spacings. As a departure from Ref. [53], we perform the correction after the renormalization factors have been applied.
The first step is to construct the combinations of ratios we need to correct. Clever combinations can remove a dependence on the recoil parameter or vanish at zero recoil. We can use this information to improve the precision of the corrections. In this case, we define Table 18 Ensemble and kappa parameters used to calculate the mistuning correction [22]. The values in bold are the uncorrected simulation values for this ensemble. Not all combinations are available, since we change only one quark mass at a time while keeping the other quark mass fixed to its uncorrected value. a (fm) am l /ams Available κ b Available κ c ≈ 0.12 0.01/0.05 0.0901, 0.860, 0.820 0.1254, 0.1280, 0.1230 the following quantities: At zero recoil we expect C 1 → 1, since X 1 → 1 in that limit. Although x f vanishes when w → 1, the combinations are designed to give a finite value at zero recoil. In fact, it is easy to reconstruct the form factors using these building blocks, The last quantity corrected is the recoil parameter, defined as in Eq. (21), with a trivial dependence on κ b . We estimate the derivative by taking finite differences between the observables in Eqs. (C.52)-(C.56) calculated in the standard run and one of the correction runs listed in Table 18. The derivative is taken with respect to ξ α = 1/am 2α with α = b, c, where the kinetic mass am 2α is defined as 1 am 2α = 2 am 0α (2 + am 0α ) + 1 1 + am 0α , (C. 61) and the bare quark mass is calculated using the tadpoleimproved, tree-level formula Here u 0 is the tadpole parameter, and κ cr is the value of κ at which the clover-quark pion becomes massless. We calculate the derivatives of each observable for each available value of the recoil parameter. When computing the correction for the b quark, since the κ b = 0.860 simulation value is so close to the tuned value of κ b , we do not use the data coming from κ b = 0.820. In general, the calculated derivative is small for all recoil values, and the errors are large enough that we can consider a linear dependence on w −1 for all derivatives. Nonetheless, we can exploit some good properties of our observables in order to reduce the number of free parameters. For instance, dC 1 /dξ α (w = 1) = 0, since C 1 = 1 in that limit, so we can set the constant term to zero. Also, the derivatives of V, B 0,1 change very little with w − 1 compared with their errors. Thus, we can safely assume these derivatives behave as constants in the small recoil range we are considering. In fact, these quantities are derived from ratios of form factors h X /h A1 , which should depend only weakly on w. These simplifications are not only welcome, they are necessary since for V, B 0,1 , C 1 we have only two values of the recoil parameter available, so our fits can have one degree of freedom. The observable A is not a ratio per se, but it is calculated at zero recoil as well, and a clear dependence with w arises. Finally, it is obvious that dw/dξ c (w = 1) = 0, and we can apply the same treatment as for C 1 . Summarizing, we fit our data to the expressions, All fits are unconstrained and result in a p value exceeding 0.5. Our final corrections are gathered in Table 19. These values are used to correct the form factors for all ensembles. As an example, we show in Table 20 how this tuning modifies the calculated form factors for the ensemble used in the correction. Table 19 Results of the κ correction fits.  Effect of the mistuning adjustments in the form factors and the recoil parameter of the a ≈ 0.12 fm ensemble used to compute the correction. Only the final error is shown in the table. The error for w seems to decrease with the correction, but the relevant quantity is w − 1, for which the error either stays the same or increases. The correction (and its error) is highly correlated with w − 1, as it is directly proportional to it. The z-fit results are stored in the file FitResults.PyDat and can be read following the same procedure. The dictionary keys in this case are in the format FIT xj, where FIT can be either LQCD for the lattice-QCD-only fit, or LQCDEXP for the fit including lattice-QCD and experimental data, and xj = a0, a1. . . are the different coefficients of the z expansion a 0 , a 1 . . . Our final result for |η EW | 2 |V cb | 2 is stored under the key LQCDEXP eVcb, and the results for R(D * ) are stored as LQCD RDst and LQCDEXP RDst.