A closer look at the extraction of $|V_{ub}|$ from $B\to\pi l\nu$

To extract the Cabibbo-Kobayashi-Maskawa (CKM) matrix element $|V_{ub}|$, we have re-analyzed all the available inputs (data and theory) on the $B\to\pi l\nu$ decays including the newly available inputs on the form-factors from light cone sum rule (LCSR) approach. We have reproduced and compared the results with the procedure taken up by the Heavy Flavor Averaging Group (HFLAV), while commenting on the effect of outliers on the fits. After removing the outliers and creating a comparable group of data-sets, we mention a few scenarios in the extraction of $|V_{ub}|$. In all those scenarios, the extracted values of $|V_{ub}|$ are higher than that obtained by HFLAV. Our best results for $|V_{ub}|^{exc.}$ are $(3.88 \pm 0.13)\times 10^{-3}$ and $(3.87 \pm 0.13)\times 10^{-3}$ in frequentist and Bayesian approaches, respectively, which are consistent with that extracted from inclusive decays $|V_{ub}|^{inc}$ within $1~\sigma$ confidence interval.

$|V_{ub}|$; see ref. [10]. This is a reasonably good fit, with a p-value of about 47%. The available data, the average $q^2$ spectrum obtained by HFLAV, and the results of the second fit are summarized in section 6.3.1 of [10]. After repeating a similar fit to obtain the average $q^2$ spectrum, we arrive at an even worse fit quality, with a p-value below 1%, although we use the same experimental information as HFLAV. In any case, a frequentist fit with a probability $\lesssim 5\%$ is usually considered to be of negligible significance, and any further fit (in the second stage) that uses the outcome of such a low-significance fit is bound to produce biased predictions for $|V_{ub}|$. It is therefore essential to reconsider other possible ways of analyzing the available data and to pinpoint the source of tension in the fits. We discuss the details in the next section. Subsequently, we utilize the newly available LCSR inputs on the form factors at non-zero values of $q^2$ in our fits.
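The fit probabilities quoted in this discussion can be reproduced from a minimum $\chi^2$ and the number of degrees of freedom. The following is a minimal sketch, assuming that the p-value is the upper-tail probability of a $\chi^2$ distribution; the function names and the `LOW_SIGNIFICANCE` constant are illustrative and not part of the analysis itself.

```python
import math

def chi2_p_value(chi2_min, dof):
    """Upper-tail probability Q(dof/2, chi2_min/2) of a chi-square distribution.

    Uses the recurrence Q(a+1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a+1),
    started from Q(1, y) = exp(-y) for even dof or
    Q(1/2, y) = erfc(sqrt(y)) for odd dof.
    """
    if dof < 1 or chi2_min < 0:
        raise ValueError("need dof >= 1 and chi2_min >= 0")
    y = 0.5 * chi2_min
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Step a up to dof/2, adding one term of the recurrence each time.
    while a < 0.5 * dof - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

LOW_SIGNIFICANCE = 0.05  # the ~5% threshold discussed in the text

def is_low_significance(chi2_min, dof):
    """True if the fit probability falls below the 5% threshold."""
    return chi2_p_value(chi2_min, dof) < LOW_SIGNIFICANCE
```

With this convention, a fit is flagged as having negligible significance when `is_low_significance` returns `True`.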
A. Motivation and a few observations

1. Theoretical Background
The differential decay width with respect to $q^2$ for a pseudoscalar-to-pseudoscalar semileptonic decay is a function of the form factors $f_{+,0}(q^2)$. In particular, for $\bar{B}^0 \to \pi^+$ semileptonic transitions, we have$^1$

$$\frac{d\Gamma}{dq^2} = \frac{G_F^2 |V_{ub}|^2}{24\pi^3 m_B^2 q^4}\,(q^2 - m_\ell^2)^2\, p_\pi \left[\left(1 + \frac{m_\ell^2}{2q^2}\right) m_B^2\, p_\pi^2\, |f_+(q^2)|^2 + \frac{3 m_\ell^2}{8 q^2}\,(m_B^2 - m_\pi^2)^2\, |f_0(q^2)|^2\right],$$

where $p_\pi(m_B, m_\pi, q^2) = \sqrt{\lambda(m_B, m_\pi, q^2)}/2m_B$ with $\lambda(m_B, m_\pi, q^2) = ((m_B - m_\pi)^2 - q^2)((m_B + m_\pi)^2 - q^2)$. Therefore, to extract $|V_{ub}|$, we need information on the form factors at different values of $q^2$. Non-perturbative techniques such as lattice QCD and LCSR can provide the values of the form factors at high and low values of $q^2$, respectively. At present, lattice estimates of $f_{+/0}(q^2)$ are available at zero and non-zero recoil [12][13][14]. There is also a recent LCSR update on the values of these form factors at $q^2 = 0$ and at values other than $q^2 = 0$ [15]. So far, inputs from lattice QCD, experimentally measured values of the decay rates, and the LCSR input at $q^2 = 0$ have been utilized for extracting $|V_{ub}|$. Here, the form factors are the major source of uncertainty in the extraction of $|V_{ub}|$. To get the shape of the decay-rate distribution, one needs to know the shape of the corresponding form factors in the whole $q^2$ region. Therefore, it is crucial to have a parametrization of $f_{+/0}(q^2)$ that satisfies real analyticity in the complex $q^2$ plane. For the form-factor parametrization, we have followed two different approaches, known as the Bourrely-Caprini-Lellouch (BCL) [16] and Bharucha-Straub-Zwicky (BSZ) [17] parametrizations, and compared their results.
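A minimal numerical sketch of such a differential width is given below. It assumes the standard textbook form of the rate for a pseudoscalar-to-pseudoscalar transition with form factors $f_+$ and $f_0$ and a charged lepton of mass `m_lep`; the numerical constants (`G_F`, `M_B`, `M_PI`) are approximate inputs chosen for illustration and are not quoted in the text.

```python
import math

G_F = 1.1663787e-5            # Fermi constant in GeV^-2 (assumed input)
M_B, M_PI = 5.2797, 0.1396    # approximate meson masses in GeV (assumed inputs)

def kallen(q2, m_b=M_B, m_pi=M_PI):
    """lambda(m_B, m_pi, q^2) as defined in the text."""
    return ((m_b - m_pi) ** 2 - q2) * ((m_b + m_pi) ** 2 - q2)

def pion_momentum(q2, m_b=M_B, m_pi=M_PI):
    """Pion momentum in the B rest frame, sqrt(lambda) / (2 m_B)."""
    return math.sqrt(max(kallen(q2, m_b, m_pi), 0.0)) / (2.0 * m_b)

def dgamma_dq2(q2, f_plus, f_zero, v_ub, m_lep=0.0):
    """Differential width in GeV^-1 for given form-factor values at q2."""
    if q2 <= m_lep ** 2:
        return 0.0
    p = pion_momentum(q2)
    prefactor = (G_F ** 2 * v_ub ** 2 * (q2 - m_lep ** 2) ** 2 * p
                 / (24.0 * math.pi ** 3 * M_B ** 2 * q2 ** 2))
    # Vector form-factor term, dominant for light leptons.
    vector = (1.0 + m_lep ** 2 / (2.0 * q2)) * M_B ** 2 * p ** 2 * f_plus ** 2
    # Scalar form-factor term, suppressed by the lepton mass squared.
    scalar = (3.0 * m_lep ** 2 / (8.0 * q2)
              * (M_B ** 2 - M_PI ** 2) ** 2 * f_zero ** 2)
    return prefactor * (vector + scalar)
```

For a massless lepton the scalar term vanishes and the expression reduces to $G_F^2 |V_{ub}|^2 p_\pi^3 |f_+|^2 / (24\pi^3)$, which is why $f_+$ provides the leading contribution for light leptons.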
According to BCL, $f_+$ and $f_0$ are parametrized as

$$f_+(q^2) = \frac{1}{1 - q^2/m_{B^*}^2} \sum_{n=0}^{N_z-1} b_n^+ \left[ z^n - (-1)^{n-N_z}\,\frac{n}{N_z}\, z^{N_z} \right], \tag{4}$$

$$f_0(q^2) = \sum_{n=0}^{N_z-1} b_n^0\, z^n. \tag{5}$$

Here, $b_n^{0/+}$ are the coefficients of the expansion, which are free parameters, and they obey the unitarity constraint

$$\sum_{m,n} B_{mn}\, b_m b_n \leq 1,$$

where the element $B_{mn}$ satisfies $B_{mn} = B_{nm} = B_{0|m-n|}$; the details can be found in [13], [16]. The conformal map from $q^2$ to $z$ is given by

$$z(q^2, t_0) = \frac{\sqrt{t_+ - q^2} - \sqrt{t_+ - t_0}}{\sqrt{t_+ - q^2} + \sqrt{t_+ - t_0}},$$

where $t_\pm \equiv (m_B \pm m_\pi)^2$ and $t_0 \equiv t_+ \left(1 - \sqrt{1 - t_-/t_+}\right)$. $t_0$ is a free parameter that governs the size of $z$ in the semileptonic phase space. For BSZ, the parametrization of any form factor reads

$$f_i(q^2) = \frac{1}{1 - q^2/m_{R,i}^2} \sum_{k=0}^{N} a_k^i \left[ z(q^2) - z(0) \right]^k, \tag{8}$$

where $m_{R,i}$ denotes the mass of the sub-threshold resonance compatible with the quantum numbers of the respective form factor and the $a_k^i$ are the coefficients of the expansion. The details are provided in [17].

The simplified series expansion discussed above is a model-independent description of the form factors based on analyticity arguments. The $z$-expansion is a conformal mapping that maps the kinematically allowed region into the disc $|z| < 1$. Only a few coefficients are needed to represent the form factor accurately. Furthermore, it provides a prescription for introducing more parameters as the data improve. The BCL expansion obeys the known asymptotic behaviour near the $B\pi$ threshold: $\mathrm{Im}\, f_+(q^2) \sim (q^2 - t_+)^{3/2}$. Therefore, at $q^2 = t_+$ ($z = -1$), the derivative of the form factor must satisfy

$$\left[\frac{df_+}{dz}\right]_{z=-1} = 0.$$

BCL uses this constraint to remove an independent degree of freedom from the series expansion in $z$ and arrives at the formula for $f_+$ given in eq. 4. For the scalar form factor $f_0$ or its derivative, no such constraint is available at any value of $z$, so we cannot remove a further degree of freedom from the series expansion of $f_0(z)$. In contrast, the BSZ form-factor parametrization does not obey the known asymptotic behaviour near the $B\pi$ threshold. However, it has the advantage that the value of the form factor at $q^2 = 0$ is among the fit parameters, as can be seen from eq. 8: $f_i(q^2 = 0) = a_0^i$. Also, in BSZ the kinematic constraint $f_+(q^2 = 0) = f_0(q^2 = 0)$ leads directly to the relation $a_0^+ = a_0^0$ between the coefficients. As is evident from eqs. 4 and 5, in the BCL parametrization the same kinematic constraint leads to a more complicated relationship between the expansion coefficients:

$$\sum_{n=0}^{N_z-1} b_n^0\, z_0^n = \sum_{n=0}^{N_z-1} b_n^+ \left[ z_0^n - (-1)^{n-N_z}\,\frac{n}{N_z}\, z_0^{N_z} \right], \qquad z_0 \equiv z(q^2 = 0).$$

Following this equation, we have expressed $b_3^0$ in terms of the other coefficients in the fit. This removes one parameter from the fit.

$^1$ The corresponding charged $B$ decays to a neutral pion, and its decay width is therefore scaled by a factor of 1/2.
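The two parametrizations and the kinematic constraint can be sketched numerically as follows. This is an illustrative sketch, assuming the usual BCL form (pole factor times a truncated series in $z$ with the threshold constraint built in for $f_+$, a plain series for $f_0$) and the usual BSZ form (pole factor times a series in $z(q^2) - z(0)$). The masses are approximate assumed inputs, and the helper `last_b0_from_constraint` is a hypothetical name for the step that eliminates the last $f_0$ coefficient.

```python
import math

# Assumed inputs in GeV (approximate meson masses, not quoted in the text).
M_B, M_PI, M_BSTAR = 5.2797, 0.1396, 5.3247
T_PLUS = (M_B + M_PI) ** 2
T_MINUS = (M_B - M_PI) ** 2
T_0 = T_PLUS * (1.0 - math.sqrt(1.0 - T_MINUS / T_PLUS))

def z(q2, t0=T_0):
    """Conformal map from q^2 to z."""
    a = math.sqrt(T_PLUS - q2)
    b = math.sqrt(T_PLUS - t0)
    return (a - b) / (a + b)

def bcl_fplus(q2, b_plus):
    """BCL f_+ with N_z = len(b_plus) and the threshold constraint built in."""
    nz = len(b_plus)
    zq = z(q2)
    series = 0.0
    for n, b in enumerate(b_plus):
        sign = -1.0 if (n - nz) % 2 else 1.0   # (-1)^(n - N_z)
        series += b * (zq ** n - sign * n / nz * zq ** nz)
    return series / (1.0 - q2 / M_BSTAR ** 2)

def bcl_fzero(q2, b_zero):
    """BCL f_0: plain power series in z, no pole factor."""
    zq = z(q2)
    return sum(b * zq ** n for n, b in enumerate(b_zero))

def bsz(q2, a, m_res):
    """BSZ form factor: pole factor times a series in z(q^2) - z(0)."""
    dz = z(q2) - z(0.0)
    return sum(ak * dz ** k for k, ak in enumerate(a)) / (1.0 - q2 / m_res ** 2)

def last_b0_from_constraint(b_plus, b_zero_head):
    """Fix the last f_0 coefficient by imposing f_+(0) = f_0(0)."""
    z0 = z(0.0)
    partial = sum(b * z0 ** n for n, b in enumerate(b_zero_head))
    return (bcl_fplus(0.0, b_plus) - partial) / z0 ** len(b_zero_head)
```

In the BSZ form the value at $q^2 = 0$ is simply the first coefficient, whereas in the BCL form the constraint has to be solved for one coefficient, as `last_b0_from_constraint` does.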
There is an important difference between the BCL and BSZ parametrizations of $f_0(q^2)$: the pole factor present in the BSZ parametrization (eq. 8) is absent from the corresponding BCL expression (eq. 5). Note that the LCSR pseudo-data points are obtained following the parametrization given in eq. 8 with the scalar resonance mass $M_{B^*} = 5.54$ GeV [15], which is slightly above the $B\pi$ threshold. In ref. [16], the authors did not discuss the parametrization of $f_0(q^2)$. In refs. [12,14], the functional form of $f_0$ does not include the pole, following the argument that the scalar $B^*$ meson, whose mass is expected to be above the $B\pi$ threshold, has not been observed experimentally. To be consistent with the literature, we follow a functional form for $f_0$ similar to that given in eq. 5. In the results section, we discuss the impact of this difference on the outcome of our analysis.

2. Comparison with existing literature
In the introduction, we mentioned the exclusive determination of $|V_{ub}|$ from the available data on $B \to \pi\ell\nu$ decay rates [6][7][8][9]. The present way of combining all these measurements, followed by HFLAV, is the following:
• First, a binned maximum-likelihood fit is performed to determine the average partial branching fraction in each $q^2$ interval. Although the bin widths used by different measurements differ, the bin edges are the same (a small difference is mentioned in the next paragraph). The total branching fraction is then calculated from the sum of the partial branching fractions in the average $q^2$ spectrum, taking the correlations between $q^2$ bins into account. The systematic and statistical uncertainties, and the corresponding correlations, are treated separately. In addition, the shared sources of systematic uncertainty of all measurements are included in the likelihood as nuisance parameters; for details see [10].
• In the second stage, the average $q^2$ spectrum is used to fit $|V_{ub}|$ and the relevant form-factor parameters (BCL [19] 3+1: three form-factor coefficients $b_0$, $b_1$, $b_2$, and $|V_{ub}|$). To constrain the high-$q^2$ behavior of the spectrum, they use the FLAG lattice average [20] of two LQCD calculations [12,13]; similarly, ref. [21] provides the LCSR input that constrains the low-$q^2$ behavior of the spectrum.
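The step in which the total branching fraction is obtained from the averaged spectrum amounts to summing the partial branching fractions and propagating the bin-to-bin covariance. A minimal sketch, assuming a full covariance matrix for the bins is available; the function name and the toy numbers are hypothetical and are not taken from any of the measurements.

```python
import math

def total_branching_fraction(partial, cov):
    """Sum partial branching fractions and propagate their covariance.

    The variance of the sum is the sum of all entries of the covariance
    matrix, so positive bin-to-bin correlations increase the uncertainty.
    """
    n = len(partial)
    if len(cov) != n or any(len(row) != n for row in cov):
        raise ValueError("covariance matrix must be n x n")
    total = sum(partial)
    variance = sum(cov[i][j] for i in range(n) for j in range(n))
    return total, math.sqrt(variance)

# Toy two-bin example (hypothetical numbers, arbitrary units).
TOY_PARTIAL = [1.0, 2.0]
TOY_COV = [[0.04, 0.01],
           [0.01, 0.09]]
TOY_TOTAL, TOY_ERROR = total_branching_fraction(TOY_PARTIAL, TOY_COV)
```

Ignoring the off-diagonal entries in the toy example would give a variance of 0.13 instead of 0.15, which illustrates why the correlations between $q^2$ bins have to be taken into account.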
We have repeated the binned maximum-likelihood fit to obtain the average $q^2$ spectrum. Our results, with all the available data, are shown in the third column of table I. The corresponding correlation matrix of the average $q^2$ spectrum can be found in table XIII in appendix A. The results from HFLAV are displayed in the second column of the same table. Note that the average spectrum we reproduce is consistent with that from HFLAV within $1\sigma$. However, the p-value of our fit is about 1%, while that of HFLAV is about 6%. This difference in fit quality could be due to the fact that not all the information on the systematic uncertainties shared between measurements (such as continuum subtraction, tracking efficiency, etc.), which HFLAV utilized in their analysis, is available to us. We have incorporated the systematic and statistical uncertainties as given in the published papers by Belle and BaBar.
In search of a possible improvement, one should carefully inspect all the data sets, which we do in the following sections. A closer look at the data shows that the BaBar(12) untagged analysis [8] of the $B^{0,+}$ modes has much better statistics/yield (almost double) than the one published in the previous year, BaBar(11) [6]. It also has the following advantages over the BaBar(11) analysis:
• The event selection has been optimized over the entire fit region instead of the signal-enhanced region, as was done previously.
• The tighter selections produce a data set with a better signal-to-background ratio and higher purity in the $B^0 \to \pi^+ \ell \nu$ decays.
• This analysis uses the full BaBar data-set compared to only a subset in the analysis of 2011.
BaBar(11) presented their main results from a simultaneous analysis of four exclusive charmless semileptonic decay modes: $B^0 \to \pi^+\ell\nu$, $B^+ \to \pi^0\ell\nu$, $B^+ \to \rho^0\ell\nu$, and $B^0 \to \rho^+\ell\nu$. This method may reduce the sensitivity to cross-feed between the $B \to \pi\ell\nu$ and $B \to \rho\ell\nu$ decay modes and to some of the background contributions. In 2012, the analysis of the $B^0 \to \pi^+\ell\nu$ mode was done in 12 bins, each of width $\Delta q^2 = 2$ GeV$^2$ (except the last one), and that of the $B^+ \to \pi^0\ell\nu$ mode in 11 bins. In contrast, the 2011 study was done in only 6 bins, each of width $\Delta q^2 = 4$ GeV$^2$ (except the last one). Note that the analysis method of BaBar(11) is considerably different from that of BaBar(12). It is also very different from all the analyses of $B \to \pi\ell\nu$ decays by Belle. Thus, as a first attempt to improve the analysis, we drop the 6-bin BaBar(11) data when extracting the average partial branching fraction in each $q^2$ interval from a binned maximum-likelihood fit to the data. The result of the fit is shown in the fourth column of table I. We notice an improvement in the fit p-value from 1% to 24.8%. The average $q^2$ spectrum generated in this scenario is consistent with that of HFLAV and with our average over all data. Figure 1 compares the three average $q^2$ spectra given in table I.
In the second stage, for the BCL fit of the average $q^2$ spectrum, we are able to more or less reproduce the HFLAV result for $|V_{ub}|$ and the form-factor parameters (second and third columns of table II), using tables 81 and 82 of ref. [10]. If we use our own averaged $q^2$ spectrum (all data) instead, the fit probability increases considerably, but the value of $|V_{ub}|$ decreases even further (i.e., it moves away from the inclusive result; fourth column of table II). However, from a fit to our average $q^2$ spectrum obtained without BaBar(11), we get a value of $|V_{ub}|$ similar to that of HFLAV, and the fit quality is also good, with a p-value of 64% (fifth and last column of table II). Note that we have also performed a fit after adding the BaBar(11) data set to this $q^2$-averaged spectrum; its fit quality is poor (p-value of 3%).
Other than HFLAV, the lattice group MILC [13] also combined these experimental results. Instead of finding an average q 2 spectrum, they fitted the experimental data directly, with or without lattice constraints. It was thus imperative for us to cross-check our analysis with ref. [13]. We also reproduced table XV of that paper, where just the experimental data from each measurement are fitted with a BCL parametrization truncated at n = 3. Table III compares the results of our fit with those obtained by MILC.

It is clear from table III that our results agree with those obtained by MILC within $1\sigma$ in all cases except BaBar(12), where, with consistent parameter spaces, our fit probability appears to be much better than that of MILC. To further check the validity of our results, we have compared this particular case with the fit to the BaBar(12) data alone provided in table VI of the corresponding reference [8], which uses a BGL expansion for the form factors; the details can be obtained from [6] and the references therein. The comparison is provided in table IV. It shows that we are in good agreement with the fit results from BaBar(12).

3. Use of the new LCSR inputs
As mentioned in the introduction, updated inputs are available for the form factors $f_+(q^2)$ and $f_0(q^2)$ of the $B \to \pi$ modes at values of $q^2$ other than zero [15]. They provide the values and covariance matrix for the form factors $f_+(q^2)$ and $f_T(q^2)$ at $q^2 = -15, -10, -5, 0, 5$ GeV$^2$ and for $f_0(q^2)$ at $q^2 = -15, -10, -5, 5$ GeV$^2$, which are summarized in table V. The corresponding correlations are given in the appendix (table XIV). The value of $f_0$ at $q^2 = 0$ is obtained via the QCD relation $f_+(0) = f_0(0)$. We utilize these inputs and repeat our second fit to extract $|V_{ub}|$. This fit is performed using the average $q^2$ spectrum given in the fourth column of table I (without the inputs from BaBar(11)). In addition to the experimental data and LCSR inputs, we use the lattice inputs from refs. [12][13][14]. To further constrain the high-$q^2$ behavior, instead of using the FLAG results, we use both of the 2015 lattice results (from the UKQCD [12] and MILC [13] collaborations) individually.
While UKQCD provides synthetic data points for $f_{+,0}(q^2)$ with full covariance matrices (both systematic and statistical) at $q^2 = 19, 22.6, 25.1$ GeV$^2$, MILC only provides the fit results for their coefficients, using either lattice data alone or lattice and experimental data together. We use the results of the 'only lattice' fit and generate correlated synthetic data points at exactly the same $q^2$ values as UKQCD, with an extra point for $f_+$ at $q^2 = 20.5$ GeV$^2$, thus utilizing the full information from the lattice fit. We find that the two collaborations are in good agreement. Therefore, we use 22 data points from lattice and LCSR: 9 from LCSR and 13 from lattice (3 each for $f_+$ and $f_0$ from UKQCD; 4 for $f_+$ and 3 for $f_0$ from MILC). The next obvious step is to check how the decay distribution fits the average $q^2$ spectrum of table I when these new lattice and LCSR inputs are used. Using the average $q^2$ spectra from the third and fourth columns of table I and the 22 data points from lattice and LCSR, we perform two fits. The results are shown in the last two columns of table VI; in both fits the extracted values of $|V_{ub}|$ are larger than those in table II (fourth and fifth columns, respectively). Unlike in the fit following HFLAV, here the p-value clearly increases if we use the average $q^2$ spectrum without the BaBar(11) data. The only problem is the abysmal quality of the fit that produces the average $q^2$ spectrum with all data (third column of table I). As argued earlier, using the results of a fit with negligible significance leads to biased results. If, instead, we start from the average $q^2$ spectrum of the fourth column of table I (fit without BaBar(11), with a considerable fit probability of 24.8%) and reintroduce the BaBar(11) data in the second fit of the decay-rate distribution, along with the 22 new data points from lattice and LCSR, we end up with the second column of table VI. We can see that the fits find the same parameter space, but with negligible significance (p-value $\sim 0.8\%$). This reinforces the fact that the BaBar(11) data are quite at odds with all the other data sets.
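Generating correlated synthetic data points from fitted expansion coefficients is a linear error-propagation step, since a BCL-type $f_+$ is linear in its coefficients. The sketch below illustrates the idea under stated assumptions: the masses are approximate inputs, and the coefficients `B_PLUS` and their covariance `B_COV` are hypothetical placeholders, not the published lattice results.

```python
import math

M_B, M_PI, M_BSTAR = 5.2797, 0.1396, 5.3247   # GeV, assumed inputs
T_PLUS = (M_B + M_PI) ** 2
T_MINUS = (M_B - M_PI) ** 2
T_0 = T_PLUS * (1.0 - math.sqrt(1.0 - T_MINUS / T_PLUS))

def z(q2):
    """Conformal map from q^2 to z."""
    a, b = math.sqrt(T_PLUS - q2), math.sqrt(T_PLUS - T_0)
    return (a - b) / (a + b)

def fplus_basis(q2, nz):
    """Derivatives d f_+ / d b_n of a BCL-type f_+, which is linear in b_n."""
    zq = z(q2)
    pole = 1.0 / (1.0 - q2 / M_BSTAR ** 2)
    row = []
    for n in range(nz):
        sign = -1.0 if (n - nz) % 2 else 1.0   # (-1)^(n - N_z)
        row.append(pole * (zq ** n - sign * n / nz * zq ** nz))
    return row

def synthetic_points(q2_values, coeffs, coeff_cov):
    """Central values and covariance J C J^T of f_+ at the given q^2 values."""
    nz = len(coeffs)
    jac = [fplus_basis(q2, nz) for q2 in q2_values]
    central = [sum(j * b for j, b in zip(row, coeffs)) for row in jac]
    npts = len(jac)
    cov = [[sum(jac[i][m] * coeff_cov[m][n] * jac[k][n]
                for m in range(nz) for n in range(nz))
            for k in range(npts)] for i in range(npts)]
    return central, cov

# Hypothetical coefficients and covariance, for illustration only.
B_PLUS = [0.41, -0.65, -0.46, 0.40]
B_COV = [[4e-4 if m == n else 0.0 for n in range(4)] for m in range(4)]
Q2_POINTS = [19.0, 20.5, 22.6, 25.1]   # GeV^2, the values quoted in the text
CENTRAL, COV = synthetic_points(Q2_POINTS, B_PLUS, B_COV)
```

The propagated covariance has a rank no larger than the number of coefficients, so generating more synthetic points than coefficients would produce a singular covariance matrix. This is consistent with the counting in the text: four points for a four-coefficient $f_+$.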
Towards the goal of keeping all significant data while successfully weeding out outliers, we perform a rigorous outlier analysis on the available data set and discuss how it can improve these observations further.

4. $B \to \pi\ell\nu$ rates with $|V_{ub}|^{inc.}$
If we neglect the possibility of any new-physics effects in $b \to u\ell\nu$ decays, it is natural to expect the values of $|V_{ub}|$ extracted from inclusive and exclusive decays to be consistent with each other. At the moment, the extracted values differ by about $2.2\sigma$. Possible new-physics explanations of this observation are available in the literature [22]. However, in this analysis we do not consider the possibility of new physics in this decay. As discussed in the introduction, the extraction of $|V_{ub}|^{inc.}$ is not very clean. Also, it is clear from the discussion above that the values extracted from exclusive $B \to \pi\ell\nu$ decays are very sensitive to a group of experimental inputs, and the fit with all of the data has a minuscule probability. However, we have noted an increase in the extracted value of $|V_{ub}|$ after the inclusion of the new LCSR inputs.
To understand the effect of the inconsistency in the data on the decay-rate distributions, we have derived the $B \to \pi\ell\nu$ decay-rate distributions using form factors extracted only from the LCSR and lattice inputs discussed above. Armed with the 22 data points from lattice and LCSR, we carry out two fits. The first uses the BSZ (Bharucha, Straub, Zwicky) [17] expansion for the form factors. The advantage of this series is that the kinematic constraint $f_+(0) = f_0(0)$ is manifest in the way the series is constructed, so it need not be imposed as an extra constraint in the fit. Note that we use the synthetic data provided by the authors wherever we can, in order to minimize the bias from the parametrization while using the whole fit information from the authors.
To find a stable and conservative estimate of the form-factor uncertainties, we truncate the series at different orders, from 0 to 4, for both $f_+$ and $f_0$, and carry out a model-selection procedure incorporating both AIC and AIC$_c$. We conclude that the optimal description of the synthetic data is obtained when both $f_0$ and $f_+$ are truncated at $N = 3$. This amounts to seven parameters (4 for $f_+$ and 3 for $f_0$). We repeat the same fit with the BCL parametrization, with the extra unitarity constraint applied. The extracted values of the coefficients in the BSZ and BCL expansions are given in table VII. We find that with the BCL parametrization the higher-order form-factor parameters are more precise, especially for $f_0$.
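The model-selection step can be sketched as follows, assuming Gaussian likelihoods so that $-2\ln L_{max}$ equals the minimum $\chi^2$ up to a constant. The candidate truncations and their $\chi^2$ values below are hypothetical numbers chosen for illustration; only the count of 22 data points is taken from the text.

```python
def aic(chi2_min, n_params):
    """Akaike information criterion for a Gaussian-likelihood fit."""
    return chi2_min + 2.0 * n_params

def aicc(chi2_min, n_params, n_data):
    """AIC with the small-sample correction 2k(k+1)/(n-k-1)."""
    denom = n_data - n_params - 1
    if denom <= 0:
        raise ValueError("AICc needs n_data > n_params + 1")
    return aic(chi2_min, n_params) + 2.0 * n_params * (n_params + 1) / denom

def best_truncation(candidates, n_data, corrected=True):
    """Return the label with the smallest AICc (or AIC).

    `candidates` maps a label to a tuple (chi2_min, n_params).
    """
    def score(item):
        chi2_min, k = item[1]
        return aicc(chi2_min, k, n_data) if corrected else aic(chi2_min, k)
    return min(candidates.items(), key=score)[0]

# Hypothetical fit outcomes for three truncation orders (22 data points).
TOY_CANDIDATES = {"N=2": (25.0, 5), "N=3": (12.0, 7), "N=4": (11.5, 9)}
TOY_BEST = best_truncation(TOY_CANDIDATES, n_data=22)
```

In the toy example, going from $N = 3$ to $N = 4$ lowers $\chi^2$ only marginally while adding two parameters, so the penalty term dominates and the lower truncation order is preferred.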
To fix the shape and normalization of the distribution, we have used the value of $|V_{ub}|^{inc.}$. This also helps us pinpoint the reason for the discrepancy between the inclusive and exclusive determinations. If we use the above fit results and $|V_{ub}|$ from different inclusive estimates to calculate theoretical predictions for the various observables, then any large deviation of those predictions from the actual experimental measurements could diagnose the source of the apparent tension between $|V_{ub}|^{inc.}$ and $|V_{ub}|^{exc.}$.
In figure 2, we compare the theoretical predictions for the binned differential branching fractions with the experimental information, taking all the existing data into account. We observe that the $q^2$ distribution of the differential branching fraction generated in both form-factor parametrizations can explain almost all the available data. Very few data points lie entirely outside the theoretical C.I. bands.
In the following section, our goal is to find out which data points are picked up as outliers after a rigorous statistical analysis of the exclusive data. Any data point that is flagged both here and in that analysis should, by default, be responsible for the tension between the inclusive and exclusive estimates of $|V_{ub}|$. Hence, we extract $|V_{ub}|$ after removing those data points.

II. OUR MAIN RESULTS
We think that, instead of using the average $q^2$ spectrum, for which the fit is clearly a bad one, we should directly use the individual data points in a simultaneous fit of $|V_{ub}|$ and the parameters of the chosen form-factor parametrization. This not only allows us to find the outliers, but also removes the ambiguity of drawing inferences from a two-stage fit in which one stage is bad and the other good, and it provides a single fit probability from which to draw our inference. We follow the same principle of using individual data points for the lattice inputs from different collaborations. Also, in this part of the analysis we use all the available inputs from LCSR and lattice QCD, as discussed in paragraph I A 3.
To proceed further, we have created different data sets out of the available experimental inputs from Belle and BaBar. These data sets include the following:
- Fit 3A: Experimental data (Fit 3) + synthetic lattice data points,
- Fit 3B: Experimental data (Fit 3) + synthetic lattice data points + LCSR.
Fit 1 contains all data sets other than the inputs from the single-mode analysis of BaBar(12) [8], although we have considered the combined-mode analysis from the same publication. Although the partial branching fractions from the analyses of both these modes are consistent with each other, we have also defined the scenario Fit 3 to check their differences. In Fit 3 we include the single-mode-analysis data of BaBar(12) and drop the combined one.
Lastly, in Fit 2, we drop the data from the combined-mode analyses of BaBar(11) [6] and BaBar(12) [8]. In other words, this data set does not contain any input from BaBar(11). As mentioned earlier, this helps us understand the impact of the BaBar(11) four-mode analysis data.
For all the fit scenarios we have extracted the respective 'pulls' between the data and the fitted distributions, with one pull defined for each data point $i$. We have also calculated the Cook's distance of each data point in all fits. None of the data points in any of the fits has a Cook's distance larger than the Cook cutoff for that particular fit. Since not a single data point is tagged as influential, the observables with the largest pulls can quite safely be considered outliers.
In table VIII, we present the observables for which the pulls are greater than 2. Note that $\mathcal{B}(B^0 \to \pi^-)_{[8,10]}$ from Belle(13) has a pull greater than 2 in all the scenarios, as do $\mathcal{B}(B^0 \to \pi^-)_{[4,8]}$ and $\mathcal{B}(B^0 \to \pi^-)_{[20,26.4]}$ from BaBar(11) in all the scenarios involving BaBar(11). All the fits can more or less comfortably accommodate the rest of the data.
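The selection of data points with pull greater than 2 can be illustrated with a short sketch. It assumes the simplest, uncorrelated definition of a pull, namely (measured value minus fitted value) divided by the uncertainty of the measurement; the definition actually used in the analysis may differ, and the labels and numbers below are hypothetical.

```python
def pulls(measured, predicted, sigma):
    """Pull of each data point: (measured - predicted) / sigma (assumed form)."""
    if not (len(measured) == len(predicted) == len(sigma)):
        raise ValueError("inputs must have the same length")
    return [(m - p) / s for m, p, s in zip(measured, predicted, sigma)]

def flag_outliers(labels, measured, predicted, sigma, threshold=2.0):
    """Labels of the data points whose absolute pull exceeds the threshold."""
    values = pulls(measured, predicted, sigma)
    return [lab for lab, p in zip(labels, values) if abs(p) > threshold]

# Hypothetical three-bin example (arbitrary units).
TOY_LABELS = ["bin A", "bin B", "bin C"]
TOY_MEASURED = [1.0, 2.5, 0.8]
TOY_FITTED = [1.1, 1.9, 0.85]
TOY_SIGMA = [0.2, 0.25, 0.1]
TOY_FLAGGED = flag_outliers(TOY_LABELS, TOY_MEASURED, TOY_FITTED, TOY_SIGMA)
```

In the toy example only the second bin is flagged. With correlated data a pull defined this way ignores the off-diagonal covariance, which is one reason a complementary influence measure such as Cook's distance is useful.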
• For the BSZ parametrization, the quality of fit improves when one includes LCSR, whereas for the BCL case, the fit worsens with the inclusion of LCSR. For Fit 2, the fit probability is reasonably good, owing to the absence of the BaBar 2011 data set. However, in all the scenarios, the fit quality improves considerably after dropping a few data points with pull > 2.
• Whenever both lattice and LCSR data are included, the BCL form-factor parametrization results in a slightly larger $|V_{ub}|$ than that obtained from BSZ, albeit with a reduced fit probability. The difference in the best-fit values is about 1%, and the results are fully consistent with each other. We comment on this observation at a later stage.
• In all the fits, the extracted $|V_{ub}|$ increases by $\gtrsim 1\%$ with the inclusion of the new LCSR inputs.
• We have presented our results truncating the BSZ and BCL expansions at $N = 3$ (eq. 8) and $N_z = 4$ (eq. 4), respectively, resulting in 4 parameters for $f_+$ and 3 for $f_0$ (due to the kinematic constraint). To check the consistency of the results, we have analysed the available inputs truncating the series at the next order in both expansions, i.e., $N = 4$ for BSZ and $N_z = 5$ for BCL. We find that the extracted values of $|V_{ub}|$ are consistent (within their errors) with those presented in table IX. In both types of expansion, we notice a tiny shift of $\sim 0.5\%$ in the best-fit values of the extracted $|V_{ub}|$. However, in these fits, the newly added higher-order coefficients remain mostly unconstrained, and they have a negligible impact on the precision of the extracted $|V_{ub}|$. At the present level of precision, it is hard to constrain the higher-order coefficients.
• In the scenarios Fit 1 and Fit 3, the extracted values are almost the same, which is not surprising (as mentioned above) since the corresponding data sets are almost-equivalent as well.
• Irrespective of the fit scenario, the extracted $|V_{ub}|$ increases after dropping the data points with pull > 2. This indicates that the data with large pulls, i.e., those in tension with the other data points (as explained earlier, they are indeed the outliers), also have an impact on the extracted values of $|V_{ub}|$. In Fit 1, the extracted $|V_{ub}|$ increases by $\approx 3\%$, whereas the increase is $\gtrsim 2\%$ in Fit 3. In Fit 2, the enhancement is smaller (about 1.5%), since the BaBar(11) data set had already been dropped in this case.
• The extracted $|V_{ub}|$ in Fit 2B is consistent with the one obtained in table VI (fourth column). Hence, the fit results are consistent with those obtained from our average $q^2$ spectrum (without BaBar(11)) using a BCL parametrization plus the lattice and new LCSR inputs.
In figure 3, we compare our extracted values in the different fit scenarios of table IX with the inclusive determinations. Note that our determination of $|V_{ub}|^{exc.}$ is still not consistent at $1\sigma$ with the values extracted by HFLAV following GGOU and BLNP. However, it is consistent with the new Belle measurement of $|V_{ub}|^{inc.}$ [26].
Following up on the discussion in section I A 3, we perform an analysis in which we check the deviations of the data from the predictions of the corresponding differential decay rates obtained only from the lattice and LCSR inputs. We have used three different values of $|V_{ub}|^{inc.}$, among them Belle (New) [26]. The deviation corresponding to the $i$-th observable is defined in terms of $\sigma_i^{SM}$, which contains the uncertainties in the decay rates due to the form-factor parameters and $|V_{ub}|$. The results of this 'deviation' analysis are shown in table X. As expected, when we use the values of $|V_{ub}|^{inc.}$ from HFLAV, at least four to five data points have a deviation > 2. On the other hand, since the new Belle measurement has a lower value than that of HFLAV, only the partial rates $\mathcal{B}(B^0 \to \pi^-)_{[20,26.4]}$ (BaBar(11)) and $\mathcal{B}(B^0 \to \pi^+)_{[0.01,2]}$ (Belle(13)) have a deviation > 2 (both in BSZ and BCL). However, $\mathcal{B}(B^0 \to \pi^+)_{[0.01,2]}$ (Belle(13)) has a rather minor effect on $|V_{ub}|$. The partial decay rates $\mathcal{B}(B^0 \to \pi^-)_{[20,26.4]}$ (BaBar(11)), $\mathcal{B}(B^0 \to \pi^-)_{[18,20]}$ (Belle(11)), and $\mathcal{B}(B^0 \to \pi^-)_{[8,10]}$ (Belle(13)) are the common data points that have pull > 2 in both analyses, given in tables VIII and X, respectively. Based on these observations, we define a few additional scenarios:
• Fit 2B-I: Input used in Fit 2B without the data on $\mathcal{B}(B^0 \to \pi^-)_{[18,20]}$ (Belle 2011).
The results in the above-mentioned fit scenarios are given in table XI. A comparison between identical cases, such as Fit 2B and Fit 3B, in tables IX and XI shows that one can extract practically the same values of $|V_{ub}|^{exc.}$ by dropping only the one or two data points mentioned above. This means that even in the presence of other outliers, i.e., data points which do not fit comfortably with the other data, the most influential data points in determining the estimate of $|V_{ub}|^{exc.}$ are the partial branching fractions $\mathcal{B}(B^0 \to \pi^-)_{[18,20]}$ (Belle(11)) and $\mathcal{B}(B^0 \to \pi^-)_{[20,26.4]}$ (BaBar(11)).
Finally, we would like to comment on the slight shift ($\approx 1\%$) between the best-fit values of $|V_{ub}|$ in the BSZ and BCL expansions of the form factors, although the results are fully consistent with each other. To understand the difference, we need to compare the form factors obtained from the two expansions. As an example, in figure 4 we compare the form factors calculated from the fit results of 'Fit-2B'. We make the following observations:
• The slight difference in the extracted $|V_{ub}|$ is not due to the form factor $f_+(q^2)$, which provides the leading contribution to the decay rate. We can see from figure 4a that the extracted $q^2$ distributions of $f_+$ in BSZ and BCL are completely consistent with each other. Also, both distributions are consistent with all the lattice and LCSR pseudo-data points.
• The difference in the extracted $|V_{ub}|$ between the two formalisms is due to a slight mismatch in the extracted values of $f_0(q^2)$ in part of the $q^2$ region, as can be seen from figure 4b. For $15 \lesssim q^2 \lesssim 23$ GeV$^2$, the extracted values of $f_0(q^2)$ differ slightly between the two expansions. However, the extracted values are in good agreement with the lattice and LCSR data points in both cases.
• We have checked that the observed difference in $f_0(q^2)$ arises because the BSZ expansion includes the pole due to the scalar $B^*$ meson while the BCL expansion does not, as discussed in subsection I A 1. Our conclusion is based on the following observations:
- In the fits, after dropping the LCSR data points at $q^2 = 5, -5, -10$ and $-15$ GeV$^2$, the extracted values of $|V_{ub}|$ in the two expansions match each other exactly. This indicates the role of the pole factor $1/(1 - q^2/M_{B^*}^2)$ in the fit.
- As a trial, we modified the BCL expansion of $f_0(z)$ in eq. 5 by multiplying it by a pole factor similar to the one in the BSZ expansion, and fitted the coefficients using the inputs of 'Fit-2B'. Using this fit result, we extracted $f_0(q^2)$ and compared it with the BSZ one. The results are shown in figure 4b as BCL (modified), and we find complete agreement between the two results. Also, the extracted values of $|V_{ub}|$ in the two fits become identical.
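The trial modification described in these observations can be sketched as follows: a BCL-type series for $f_0$ is multiplied by a pole factor with the scalar mass of 5.54 GeV quoted earlier. The meson masses are approximate assumed inputs and the function names are illustrative.

```python
import math

M_B, M_PI = 5.2797, 0.1396    # GeV, assumed inputs
M_SCALAR = 5.54               # GeV, scalar pole mass quoted in the text
T_PLUS = (M_B + M_PI) ** 2
T_MINUS = (M_B - M_PI) ** 2
T_0 = T_PLUS * (1.0 - math.sqrt(1.0 - T_MINUS / T_PLUS))

def z(q2):
    """Conformal map from q^2 to z."""
    a, b = math.sqrt(T_PLUS - q2), math.sqrt(T_PLUS - T_0)
    return (a - b) / (a + b)

def f0_bcl(q2, b_zero):
    """BCL-type f_0: plain power series in z without a pole factor."""
    zq = z(q2)
    return sum(b * zq ** n for n, b in enumerate(b_zero))

def pole_factor(q2, m_pole=M_SCALAR):
    """Pole factor 1 / (1 - q^2 / M^2) as used in the BSZ form."""
    return 1.0 / (1.0 - q2 / m_pole ** 2)

def f0_bcl_modified(q2, b_zero, m_pole=M_SCALAR):
    """Trial form: the BCL series multiplied by the scalar pole factor."""
    return pole_factor(q2, m_pole) * f0_bcl(q2, b_zero)
```

The pole factor equals one at $q^2 = 0$ and grows with $q^2$, so the same set of series coefficients cannot describe both forms over the whole range; the coefficients have to be refitted after the modification, as is done in the trial fit.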

III. SUMMARY
We have extracted $|V_{ub}|$ by analyzing all the available inputs on the exclusive $B \to \pi\ell\nu$ decays. These include the data on the partial decay rates, inputs from lattice, and those from LCSR. In particular, we add the updated inputs on the form factors $f_+(q^2)$ and $f_0(q^2)$ at both zero and non-zero values of $q^2$. We have pointed out some of the issues with the earlier fits done by HFLAV, which relied, in the first stage, on obtaining an average $q^2$ spectrum of the partial width from all the available data on the $B \to \pi\ell\nu$ decay rates. To extract $|V_{ub}|$, HFLAV used this average spectrum in the second stage. We have reproduced both these fits and arrived at a fit with very low probability for the average $q^2$ spectrum in the first stage, similar to HFLAV (ours is even worse). We have identified the BaBar(11) data (at least a part of it) as a probable source of such a bad-quality fit. The average $q^2$ spectrum of the decay rates without that data set has an appreciable fit probability. With this $q^2$-averaged spectrum, in the second stage, we have extracted $|V_{ub}|$ with and without the data from BaBar(11). The quality of fit is much better without the data from BaBar(11). We have then repeated the same analysis with the new inputs from LCSR and noticed an increase in the best estimate of $|V_{ub}|$ of roughly 6%. However, the quality of the second-stage fit is reduced.
In search of a possibility of improvement, we simultaneously fit all the data (instead of a two-stage fit) after defining different fit scenarios. In the process, we have identified outliers, i.e. data-points inconsistent with the rest of them.
The goal is to check whether some of these outliers, if any, are also influential in the extraction of $|V_{ub}|$. We find a very small number of data points that compromise the fit quality and, at the same time, influence the extraction of $|V_{ub}|$. Our best results are the following:
• Without the input from BaBar(11) (full data set) and $\mathcal{B}(B^0 \to \pi^-)_{[18,20]}$ (Belle(11)), we obtain $|V_{ub}| = (3.95^{+0.14}_{-0.15}) \times 10^{-3}$.