Reevaluation of the hadronic contribution to the muon magnetic anomaly using new e+e- -> pi+pi- cross section data from BABAR

Using recently published, high-precision pi+pi- cross section data from the BABAR experiment, obtained from the analysis of e+e- events with high-energy photon radiation in the initial state, we reevaluate the lowest order hadronic contribution a_mu[had,LO] to the anomalous magnetic moment of the muon. We employ newly developed software featuring improved data interpolation and averaging, more accurate error propagation and systematic validation. With the new data, the discrepancy between the e+e- and tau-based results for the dominant two-pion mode decreases from 2.4 sigma to 1.5 sigma in the dispersion integral, though significant local discrepancies in the spectra persist. For the e+e- based evaluation we obtain a_mu[had,LO] = (695.5 +- 4.1) x 10^-10, where the error accounts for all sources. The full Standard Model prediction of a_mu differs from the experimental value by 3.2 sigma.


I. INTRODUCTION
The Standard Model (SM) prediction of the anomalous magnetic moment of the muon, a_µ, is limited in precision by contributions from hadronic vacuum polarisation (VP) loops. These contributions can be conveniently separated into a dominant lowest order part (a_µ^{had,LO}) and a higher order part (a_µ^{had,HO}). The lowest order term can be calculated with a combination of experimental cross section data involving e+e− annihilation to hadrons, and perturbative QCD. These are used to evaluate an energy-squared dispersion integral, ranging from the π0γ threshold to infinity. The integration kernel strongly emphasises the low-energy part of the spectrum. About 73% of the lowest order hadronic contribution is provided by the π+π−(γ) final state. More importantly, 62% of its total quadratic error stems from the π+π− mode, stressing the need for ever more precise experimental data in this channel to confirm or refute the observed deviation of 3.8σ between SM prediction and experiment [1].
A former lack of precision e+e−-annihilation data inspired the search for an alternative. It was found [2] in the form of τ → ν_τ + π−π0, 2π−π+π0, π−3π0 spectral functions [3][4][5][6], transferred from the charged to the neutral state using isospin symmetry. During the last decade, new measurements of the π+π− spectral function in e+e− annihilation with percent accuracy became available [7][8][9][10], superseding or complementing older and less precise data. With the increasing precision, which today is on a level with the τ data in that channel, systematic discrepancies in shape and normalisation of the spectral functions were observed between the two systems [11,12]. It was found that, when computing the hadronic VP contribution to the muon magnetic anomaly using the τ instead of the e+e− data for the 2π and 4π channels, the observed deviation from the experimental value [13] would reduce to less than 1σ [14].
The discrepancy with the e+e− data decreased after the inclusion of new τ data from the Belle experiment [15], published e+e− data from CMD2 [8] and KLOE [10] (superseding earlier data [16]), and a reevaluation of isospin-breaking corrections affecting the τ-based evaluation [1] (footnote 2). In terms of a_µ^{had,LO}, the difference between the τ and e+e−-based evaluations in the dominant π+π− channel was found to be 11.7 ± 3.5_ee ± 3.5_{τ+IB} [1] (if not otherwise stated, this and all following a_µ numbers are given in units of 10^−10), where KLOE exhibits the strongest discrepancy with the τ data (without the KLOE data the discrepancy reduces from 2.4σ to 1.9σ). Another quantity for comparison, which is more sensitive to the higher-energy π+π− spectrum, is the τ− → π−π0ν branching fraction, which shows a difference between measurement and e+e− prediction of (0.64 ± 0.10_τ ± 0.28_ee)% [1] (footnote 3).
Recently, the BABAR Collaboration has published [17] a π+π−(γ) spectral function measurement based on half a million selected e+e− → π+π−γ(γ) events, where the hard photon is dominantly radiated in the initial state (ISR). It benefits from a large cancellation of systematic effects in the ratio of π+π−γ(γ) to µ+µ−γ(γ) employed for the measurement. In this letter, we present a reevaluation of the lowest order hadronic contribution to a_µ including the new BABAR data. We deploy the new software package HVPTools [18], featuring a more accurate data interpolation, averaging and integration method, [...]
Footnote 2: The total size of the isospin-breaking correction to a_µ^{had,LO} has been estimated to be (−16.1 ± 1.9) · 10^−10, which is dominated by the short-distance contribution of (−12.2 ± 0.2) · 10^−10 [1].
Footnote 3: A total isospin-breaking correction of (+0.69 ± 0.22)% has been added to the e+e− prediction of the τ− → π−π0ν branching fraction [1].

II. e+e− → π+π− CROSS SECTION DATA
The dispersion integral for the lowest order hadronic contribution reads
a_µ^{had,LO} = \frac{\alpha^2(0)}{3\pi^2} \int_{m_{\pi^0}^2}^{\infty} ds \, \frac{K(s)}{s} \, R^{(0)}(s) ,   (1)
where R^{(0)}(s) denotes the ratio of the bare cross section for e+e− annihilation into hadrons to the pointlike muon-pair cross section, and K(s) ∼ s^−1 [19]. The contribution from the light u, d, s quark states is evaluated using exclusive experimental cross section data up to an energy of 1.8 GeV, where resonances dominate, and perturbative QCD to predict the quark continuum beyond that energy. In this work we only reevaluate the contributions from the e+e− → π+π− and π+π−2π0 channels. For all the others we refer to Refs. [11,12,14].
A large number of e+e− → π+π− cross section measurements are available. Older measurements stem from OLYA [20,21], TOF [22], CMD [20], DM1 [23] and DM2 [24]. They are affected by an incomplete or undocumented application of radiative corrections. Equation (1) and the treatment of higher order hadronic contributions require initial state radiation as well as leptonic and hadronic VP contributions to be subtracted from the measured cross section data, while final state radiation (FSR) should be included. Because of the lack of documentation, the latter contribution of approximately 0.9% in the π+π− channel has been added to the data, accompanied by a 100% systematic error [11]. Initial state radiation and leptonic VP effects are corrected by all experiments; hadronic VP effects, however, are not. They are strongly energy dependent, and on average amount to approximately 0.6%. We apply this correction accompanied by a 50% systematic error [11]. These FSR and hadronic VP systematic errors are treated as fully correlated between all measurements of one experiment, and also among different experiments.
More recent precision data, where all required radiative corrections have been applied by the experiments, stem from the CMD2 [8] and SND [9] experiments at the VEPP-2M collider (Novosibirsk, Russia). They achieve comparable statistical errors, and energy-dependent systematic uncertainties down to 0.8% and 1.3%, respectively.
These measurements have been complemented by results from KLOE [10] at DAΦNE (Frascati, Italy) running at the φ resonance centre-of-mass energy. KLOE applied for the first time a hard-photon ISR technique to precisely determine the π+π− cross section between 0.592 and 0.975 GeV. The cross section data are obtained from a binned distribution, corrected for detector resolution and acceptance effects. The analysed data sample corresponds to 240 pb^−1 integrated luminosity, providing a 0.2% relative statistical error on the π+π− contribution to a_µ^{had,LO}. KLOE does not normalise the π+π−γ cross section to e+e− → µ+µ−γ, so that the ISR radiator function must be taken from Monte Carlo simulation (cf. [26] and references therein). The systematic error assigned to this correction varies between 0.5% and 0.9% (closer to the φ peak). The total assigned systematic error lies between 0.8% and 1.2%.
In a recent publication [17] the BABAR Collaboration reported measurements of the processes e+e− → π+π−(γ), µ+µ−(γ) using the ISR method at 10.6 GeV centre-of-mass energy. The detection of the hard ISR photon allows BABAR to cover a large energy range from threshold up to 3 GeV for the two processes. The π+π−(γ) cross section is obtained from the ratio of π+π−γ(γ) to µ+µ−γ(γ), so that the ISR radiator function cancels, as do additional ISR radiative effects. Since FSR photons are also detected, there is no additional uncertainty from radiative corrections at NLO level. Experimental systematic uncertainties are kept to 0.5% in the ρ peak region (0.6-0.9 GeV), increasing to 1% outside.

III. COMBINING CROSS SECTION DATA
The requirements for averaging and integrating cross section data are: (i) properly propagate all the uncertainties in the data to the final integral error, (ii) minimise biases, i.e., reproduce the true integral as closely as possible on average and measure the remaining systematic error, and (iii) optimise the integral error after averaging while respecting the two previous requirements. The first item in practice requires the use of pseudo-Monte Carlo (MC) simulation, which needs to be a faithful representation of the measurement ensemble and to contain the full data treatment chain (interpolation, averaging, integration). The second item requires a flexible data interpolation method (the trapezoidal rule is not sufficient, as shown below) and a realistic truth model used to test the accuracy of the integral computation with pseudo-MC experiments. Finally, the third item requires optimal data averaging, taking into account all known correlations, to minimise the spread in the integral measured from the pseudo-MC sample.
The combination and integration of the e+e− → π+π− cross section data is performed using the newly developed software package HVPTools [18] (footnote 5). It transforms the bare cross section data and associated statistical and systematic covariance matrices into fine-grained energy bins, taking into account, to the best of our knowledge, the correlations within each experiment as well as between the experiments (such as uncertainties in radiative corrections). The covariance matrices are obtained by assuming common systematic error sources to be fully correlated. To these matrices are added statistical covariances, present for example in binned measurements as provided by KLOE, BABAR or the τ data, which are subject to bin-to-bin migration that has been unfolded by the experiments, thus introducing correlations.
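The covariance construction described above can be sketched as follows. This is a minimal illustration, not HVPTools code: the function name `build_covariance`, the source labels and all numbers are hypothetical, and the statistical part is taken as diagonal (unfolding-induced statistical covariances would have to be added on top).

```python
def build_covariance(stat_errors, systematics):
    """Covariance matrix of n data points.

    stat_errors : per-point statistical errors (assumed uncorrelated here).
    systematics : dict mapping a source name to its per-point errors; every
                  named source is treated as 100% correlated between points.
    """
    n = len(stat_errors)
    cov = [[0.0] * n for _ in range(n)]
    # Uncorrelated statistical errors only populate the diagonal.
    for i, sigma in enumerate(stat_errors):
        cov[i][i] += sigma ** 2
    # A fully correlated source contributes the outer product s_i * s_j.
    for errs in systematics.values():
        for i in range(n):
            for j in range(n):
                cov[i][j] += errs[i] * errs[j]
    return cov


# Hypothetical example: three cross section points in arbitrary units.
stat = [1.0, 2.0, 1.5]
syst = {"fsr": [0.9, 0.9, 0.9], "hadronic_vp": [0.3, 0.6, 0.3]}
for row in build_covariance(stat, syst):
    print(["%.2f" % c for c in row])
```

Sources shared between experiments can be handled the same way by concatenating the points of all experiments into one list before building the matrix.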
The interpolation between adjacent measurements of a given experiment uses second order polynomials. This is an improvement with respect to the previously applied trapezoidal rule, corresponding to a linear interpolation, which leads to systematic biases in the integral (see below, and also the discussion in Sec. 8.2 and Fig. 12 of Ref. [11]). In the case of binned data, the interpolation function within a bin is renormalised to keep the integral in that bin invariant after the interpolation. This may lead to small discontinuities in the interpolation function across bin boundaries. The final interpolation function per experiment within its applicable energy domain is discretised into small (1 MeV) bins for the purpose of averaging and numerical integration.
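The bias of the trapezoidal rule on a curved spectrum can be illustrated with a small sketch. It assumes a simple local scheme (a parabola through three neighbouring points, integrated exactly over each interval) and a toy convex function 1/x^2; it does not reproduce the HVPTools interpolation or the bin renormalisation, and all function names are hypothetical.

```python
def lagrange_quadratic(xs, ys, x):
    """Value at x of the parabola through the three points (xs[k], ys[k])."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))


def integrate_trapezoid(x, y):
    """Integral of the piecewise-linear interpolation of the points."""
    return sum(0.5 * (x[i + 1] - x[i]) * (y[i] + y[i + 1])
               for i in range(len(x) - 1))


def integrate_quadratic(x, y):
    """Integral of a piecewise second order interpolation (needs >= 3 points)."""
    total = 0.0
    for i in range(len(x) - 1):
        k = max(i - 1, 0)                  # first index of the local triplet
        xs, ys = x[k:k + 3], y[k:k + 3]
        a, b = x[i], x[i + 1]
        mid = lagrange_quadratic(xs, ys, 0.5 * (a + b))
        # Simpson's formula is exact for a parabola on [a, b].
        total += (b - a) / 6.0 * (y[i] + 4.0 * mid + y[i + 1])
    return total


# Toy convex "spectrum": f(x) = 1/x^2, exact integral over [1, 4] is 0.75.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0 / v ** 2 for v in x]
print("exact      0.75")
print("trapezoid ", integrate_trapezoid(x, y))
print("quadratic ", integrate_quadratic(x, y))
```

On a convex function the straight-line segments lie above the curve, so the trapezoidal result is biased upwards; the second order interpolation reduces this bias for the same input points.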
The averaging of the interpolated measurements from different experiments contributing to a given energy bin is the most delicate step in the analysis chain. Correlations between measurements and experiments must be taken into account. Moreover, the experiments have different measurement densities or bin widths within a given energy interval, and one must avoid substituting missing information, in case of a lower measurement density, by extrapolated information from the polynomial interpolation. To derive proper averaging weights for each experiment, wider averaging regions (footnote 6) are defined to ensure that all locally available experiments contribute to the averaging region, and that in case of binned measurements (KLOE, BABAR, τ data) at least one full bin is contained in it. The averaging regions are used to compute weights for each experiment, which are applied in the bin-wise average of the original finely binned interpolation functions (footnote 7). If the χ² value of a bin-wise average (footnote 8) exceeds the number of degrees of freedom (n_dof), the error in this averaged bin is rescaled by √(χ²/n_dof) to account for inconsistencies (cf. Fig. 1). Such inconsistencies frequently occur because most experiments are dominated by systematic uncertainties, which are difficult to estimate.
Footnote 5: [...] authors. The systematic errors are introduced component by component as an algebraic function of mass or as a numerical value for each data point (or bin). Systematic errors belonging to the same identifier (name) are taken to be fully correlated throughout all measurements affected. So far, HVPTools has only been employed for the numerical evaluation of the most important π+π− (and π+π−2π0) parts of the dispersion integral (1).
Footnote 6: For example, when averaging two binned measurements with unequal bin widths, a useful averaging region would be defined by the experiment with the larger bin width, and the bins of the other experiments would be statistically merged before computing the averaging weights.
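The error rescaling for an inconsistent bin can be sketched as below. This is a simplified illustration under stated assumptions: the measurements in the bin are treated as uncorrelated (the analysis described here uses the full covariance between experiments), and the error is inflated by the factor √(χ²/n_dof), the usual PDG-style scale factor. The function name and the numbers are hypothetical.

```python
import math


def average_with_rescaling(values, errors):
    """Inverse-variance weighted average of uncorrelated measurements.

    Returns (mean, error, scale), where the error is multiplied by
    scale = sqrt(chi2 / ndof) whenever chi2 exceeds ndof, else scale = 1.
    """
    weights = [1.0 / e ** 2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    error = 1.0 / math.sqrt(wsum)
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    ndof = len(values) - 1
    scale = 1.0
    if ndof > 0 and chi2 > ndof:
        scale = math.sqrt(chi2 / ndof)     # inflate error of inconsistent bin
    return mean, error * scale, scale


# Two hypothetical measurements of the same bin that disagree by 2 sigma each way.
print(average_with_rescaling([10.0, 12.0], [1.0, 1.0]))
# Two compatible measurements: no rescaling is applied.
print(average_with_rescaling([10.0, 10.5], [1.0, 1.0]))
```

Rescaling only when χ² exceeds n_dof means the error of a bin is never reduced below its nominal value.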
The consistent propagation of all errors into the evaluation of a_µ^{had,LO} is ensured by generating large samples of pseudo experiments, representing the full list of available measurements and taking into account all known correlations. For each generated set of pseudo measurements, the same interpolation and averaging treatment leading to the computation of Eq. (1) as for real data is performed, resulting in a probability density distribution for a_µ^{had,LO}(π+π−), the mean and RMS of which define the 1σ allowed interval (and which, by construction, has a proper pull behaviour). The procedure yielding the weights of the experiments can be optimised with respect to the resulting error on a_µ^{had,LO}.
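A minimal sketch of such a pseudo-experiment error propagation is given below. It assumes Gaussian errors, one fully correlated systematic source, and a plain sum over bins in place of the full interpolation, averaging and kernel-weighted integration chain; the function name, bin contents and errors are hypothetical.

```python
import math
import random


def pseudo_experiment_integral(values, stat, syst, widths, n_toys=20000, seed=1):
    """Mean and RMS of a binned integral over an ensemble of pseudo experiments.

    values : bin contents; stat : uncorrelated errors per bin;
    syst   : per-bin errors of one fully correlated systematic source;
    widths : bin widths used to form the integral sum(width * content).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_toys):
        common = rng.gauss(0.0, 1.0)       # one shift shared by all bins
        toy = [v + s * rng.gauss(0.0, 1.0) + c * common
               for v, s, c in zip(values, stat, syst)]
        results.append(sum(w * t for w, t in zip(widths, toy)))
    mean = sum(results) / n_toys
    rms = math.sqrt(sum((r - mean) ** 2 for r in results) / n_toys)
    return mean, rms


values = [100.0, 300.0, 200.0]
stat = [5.0, 6.0, 4.0]
syst = [1.0, 3.0, 2.0]
widths = [0.1, 0.1, 0.1]
mean, rms = pseudo_experiment_integral(values, stat, syst, widths)
# Analytic expectation for this linear case: variance = sum (w*stat)^2 + (sum w*syst)^2
expected = math.sqrt(sum((w * s) ** 2 for w, s in zip(widths, stat))
                     + sum(w * c for w, c in zip(widths, syst)) ** 2)
print(mean, rms, expected)
```

For a linear integral the ensemble RMS agrees with the analytic covariance propagation; the benefit of the pseudo-experiment approach is that it remains valid when the chain contains non-linear steps such as interpolation, weight computation and error rescaling.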
We have tested the fidelity of the full analysis chain (polynomial interpolation, averaging, integration) by using as truth representation a Gounaris-Sakurai [28] vector-meson resonance model faithfully describing the π+π− data. The central values for each of the available measurements are shifted to agree with the Breit-Wigner model, leaving their statistical and systematic errors unchanged. The set of measurements so created is then analysed in the same way as the original data sets. The difference between true and estimated a_µ^{had,LO} values is a measure of the systematic uncertainty due to the data treatment. We find a negligible bias below 0.1 (in units of 10^−10), increasing to 0.5 (1.2 without the high-density BABAR data) when using the trapezoidal rule for interpolation instead of second order polynomials.
Footnote 7: The averaging weights for each experiment are computed as follows:
1. pseudo-MC generation fluctuates the data points (or bins) along the original measurements, taking into account all known correlations; the polynomial interpolation is redone for each generated pseudo-MC;
2. the averaging regions are filled for each experiment and each pseudo-MC generation and interpolated with second order polynomials;
3. small (1 MeV) bins are filled for each experiment, in the energy intervals covered by that experiment, using the interpolation of the averaging regions;
4. in each small bin a correlation matrix between the experiments is computed, from which the averaging weights are obtained.
Footnote 8: The bin-wise average is computed as follows:
1. pseudo-MC generation fluctuates the data points (or bins) along the original measurements, taking into account all known correlations; the polynomial interpolation is redone for each generated pseudo-MC;
2. for each generated pseudo-MC, small (1 MeV) bins are filled for each experiment, in the energy intervals covered by that experiment, using the polynomial interpolation;
3. the average and its error are computed in each small bin using the weights previously obtained;
4. the covariance matrix among the experiments is computed in each small bin;
5. χ² rescaling corrections are computed and applied for each bin.
The individual e+e− → π+π− cross section measurements (dots) and their average (shaded/green band) are plotted in Fig. 2. The error bars contain statistical and systematic errors. For better comparison we also plot in Fig. 3 the relative differences between BABAR, KLOE, CMD2, SND, and the average. Fair agreement is observed, though with a tendency to larger (smaller) cross sections above ∼0.8 GeV for BABAR (KLOE). These inconsistencies (among others) lead to the error rescaling shown versus √s in Fig. 1.
The left hand plot of Fig. 4 shows the weights versus √s that the different experiments carry in the average. BABAR and KLOE dominate over the entire energy range. Owing to the sharp radiator function, the available statistics for KLOE increase towards the φ mass, so that KLOE outperforms BABAR above ∼0.8 GeV. For example, at 0.9 GeV the KLOE data have statistical errors of 0.5%, half of those for BABAR (after renormalising BABAR to the 2.75 times larger KLOE bins at that energy). Conversely, at 0.6 GeV the comparison reads 1.2% (KLOE) versus 0.5% (BABAR, again given in KLOE bins, which are about 4.2 times larger than the BABAR bins at that energy). The experiments labelled "other exp" in the figure correspond to older data with incomplete radiative corrections. Their weights are small throughout the entire energy domain.
Figure 4 (right) shows versus √s the combined e+e− → π+π− cross section multiplied by the kernel function K(s) occurring in the dispersion integral (1). The kernel strongly emphasises the low-energy spectrum. The dashed (red) curve belonging to the right axis in the plot gives the corresponding error contribution (diagonal errors only; statistical and systematic errors have been added in quadrature). The peaks are introduced by the error rescaling and indicate inconsistencies between the measurements. The uncertainty in the integral is dominated by the measurements below 0.8 GeV.
A compilation of results for a_µ^{had,LO}[ππ] is given in Table I. A comparison with the previous e+e−-based average [1], a_µ^{had,LO}[ππ] = 503.5 ± 3.5_tot, shows that the inclusion of the new BABAR data significantly increases the central value of the integral, without however providing a large error reduction. This is due to the incompatibility mainly between BABAR and KLOE, which causes an increase of the combined error. In the energy interval between 0.63 and 0.958 GeV, the discrepancy between the a_µ^{had,LO}[ππ] evaluations from KLOE and BABAR amounts to 2.0σ. BABAR is the only experiment covering the entire energy region between 2m_π and 1.8 GeV.
Using only the BABAR data to evaluate a_µ^{had,LO}[ππ], one finds [17] 514.1 ± 2.2_stat ± 3.1_syst.
Also given in Table I is the combined τ-based result from Ref. [1]. The difference between the τ and e+e−-based evaluations of a_µ^{had,LO}[ππ] now reads 6.8 ± 3.5_{τ+IB} ± 2.9_ee, thus reducing to 1.5σ compared to 2.4σ without BABAR [1] (the BABAR-only result is in excellent agreement with the τ data) (footnote 9). A comparison between the combined e+e− and τ two-pion cross sections relative to the e+e− result is shown in Fig. 5. Significant local discrepancies arise in particular above the ρ peak.
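The quoted significances follow from the differences and their errors if the two error components are assumed to be independent and are combined in quadrature. A short check of that arithmetic (the helper name `significance` is hypothetical):

```python
import math


def significance(diff, *errors):
    """Difference in units of its total error, errors added in quadrature."""
    return abs(diff) / math.sqrt(sum(e * e for e in errors))


# tau minus e+e- difference in units of 10^-10, including the BABAR data
print(round(significance(6.8, 3.5, 2.9), 1))
# previous difference from Ref. [1], without the BABAR data
print(round(significance(11.7, 3.5, 3.5), 1))
```

The first value reproduces the 1.5σ quoted with BABAR included, the second the 2.4σ of the earlier evaluation.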
We also reevaluate the e+e− → π+π−2π0 contribution to a_µ^{had,LO}. The CMD2 data used previously [31] have been superseded by modified or more recent, but as yet unpublished, data [32], recovering agreement with the published SND cross sections [33]. Since the new data are unavailable, we discard the obsolete CMD2 data from the π+π−2π0 average, finding a_µ^{had,LO}[ππ2π0] = [...] ([...] when including the obsolete CMD2 data). The corresponding cross section measurements and HVPTools average are shown in Fig. 6.
Footnote 9: Comparing the τ data with the combined e+e− data, instead of with a fit to the single experiment CMD-2 limited to 1 GeV as was done for Fig. 4 in Ref. [29] and Fig. 28 in Ref. [30], we observe a reduced discrepancy, in particular between 1.0 GeV and 1.4 GeV. We therefore disagree with the conclusion reached in these references, where the difference goes up to a factor 4, and is even in the opposite direction with respect to the one we observe.
[Fig. 4 caption, beginning missing] [...] (1) for the combined e+e− data obtained by multiplying the π+π− cross section by the kernel function K(s) (solid line). The dashed (red) curve belonging to the right axis shows the corresponding error contribution, where statistical and systematic errors have been added in quadrature. Note that the information conveyed by this curve is incomplete because only diagonal errors are shown, disregarding correlations between the cross section measurements which have significant influence on the integral error.
[Table I caption, beginning missing] [...] a_µ^{had,LO}[ππ] contributions from the e+e− data for different energy intervals and experiments. Where two errors are given, the first is statistical and the second systematic. Also given is the τ-based result from Ref. [1] combining all available τ data. The combined error has been rescaled to account for the inconsistency between the two evaluations.
[Fig. 5 caption] Relative comparison between the combined τ (dark shaded) and e+e− spectral functions (light shaded), normalised to the e+e− result. The apparently oscillating structure around 0.5 GeV is due to two Belle measurements fluctuating to large cross section values. Clearly visible is the interference due to ρ-φ mixing around 1 GeV, which is not included in the isospin-breaking corrections applied to the τ data. It is also visible in the upper and lower right hand plots of Fig. 2. The deviation between 0.8 and 0.95 GeV is due to the discrepancy between τ and KLOE data, which dominate in this region (cf. Fig. 4 left).
Adding to the e+e−-based a_µ^{had,LO}[ππ] and a_µ^{had,LO}[ππ2π0] results the remaining exclusive multihadron channels as well as perturbative QCD [14], we find for the complete lowest order hadronic term
a_µ^{had,LO} = 695.5 ± 4.1_tot .
It is noticeable that the error from the π+π− channel now equals the one from all other contributions to a_µ^{had,LO}. Adding further the contributions from higher order hadronic loops, −9.79 ± 0.08_exp ± 0.03_rad [34], hadronic light-by-light scattering (LBLS), 10.5 ± 2.6 [35], as well as QED, 11 658 471.809 ± 0.015 [36] (see also [37]) [...], the SM prediction differs from the experimental value [...] [37,40] by 25.5 ± 8.0 (3.2σ). A compilation of recent SM predictions for a_µ compared with the experimental result is given in Fig. 7. The BABAR results are not yet contained in evaluations preceding the present one. The result by HMNT [34] contains older KLOE data [16], which have been superseded by more recent results [10], leading to a slightly larger value for a_µ^{had,LO}.

V. CONCLUSIONS
We have reevaluated the lowest order hadronic contribution to the muon magnetic anomaly in the dominant π+π− channel, using new precision data published by the BABAR Collaboration. After combination with the other e+e− data, a 1.5σ difference with the τ data remains for the dominant π+π− contribution. For the full e+e−-based Standard Model prediction, including also a reevaluated π+π−2π0 contribution, we find a deviation of 3.2σ from experiment (reduced from 3.7σ without BABAR). The deviation reduces to 2.9σ when excluding KLOE data, and further decreases to 2.4σ when using only the BABAR data in the π+π− channel. As a reminder, the τ-based prediction deviates from experiment by 1.9σ.
The present situation for the evaluation of a_µ^{had,LO}[ππ] is improved compared to that of recent years, as more input data from quite different experimental facilities and conditions have become available: e+e− energy scan, e+e− ISR from low and high energies, τ decays. Our approach has been to combine all the data and include in the uncertainty the effects from differences in the spectra. At the moment the ideal accuracy cannot be reached, as a consequence of the existing discrepancies due to uncorrected or unaccounted systematic effects in the data. A critical look must be given to the different analyses in order to identify their weak points and to improve on them or to assign larger systematic errors.
It is thereby not sufficient to concentrate on improving the π+π− channel alone. Problems also persist in the π+π−2π0 mode, where the τ and e+e−-based evaluations differ by (3.8 ± 2.2) · 10^−10, but also the e+e− data among themselves exhibit discrepancies. Fortunately, new precision data from BABAR should soon help to clarify the situation in that channel.