Violations of Quark-Hadron Duality in Low-Energy Determinations of $\alpha_s$

Using the spectral functions measured in $\tau$ decays, we investigate the actual numerical impact of duality violations on the extraction of the strong coupling. These effects are tiny in the standard $\alpha_s(m_\tau^2)$ determinations from integrated distributions of the hadronic spectrum with pinched weights, or from the total $\tau$ hadronic width. The pinched-weight factors suppress very efficiently the violations of duality, making their numerical effects negligible in comparison with the larger perturbative uncertainties. However, combined fits of $\alpha_s$ and duality-violation parameters, performed with non-protected weights, are subject to large systematic errors associated with the assumed modelling of duality-violation effects. These uncertainties have not been taken into account in the published analyses, based on specific models of quark-hadron duality.


Introduction
Confinement implies a dual description of QCD observables. First-principles theoretical calculations are made in terms of the fundamental quark and gluon degrees of freedom appearing in the Lagrangian, while experimental measurements rely on the detected hadronic spectrum. Both descriptions should agree, provided confinement is exact, but there is always some degree of ambiguity at the observable level, which introduces unavoidable theoretical uncertainties. In order to perform precise tests of the perturbative QCD predictions, one usually studies inclusive or semi-inclusive observables. The inclusive production of hadrons in processes that do not contain strongly-interacting particles in the initial state is particularly well suited for this purpose [1]. Since the total probability that quarks and gluons hadronize is just one and the separate identities of the produced hadrons are not specified, the two dual descriptions are indeed equivalent in this case. Nevertheless, the different infrared sensitivity of both approaches still generates some ambiguities. Even at very high energies, perturbation theory predicts the appearance of multiple thresholds, corresponding to the production of additional gluons and quark-antiquark pairs, while nature only exhibits multihadron production. The infrared problems associated with the binding of quarks and gluons in physical colour-singlet particles can be minimized, smearing the observable cross sections over a suitable energy range, which washes out the threshold sensitivity [2]. Similarly, in jet physics, one tries to minimize the sensitivity to the final hadronization. A clean jet observable should be infrared safe, i.e., free of collinear and soft singularities [3][4][5]. Fully inclusive observables such as the hadronic decay widths of the Z, W ± and H bosons are defined at a specific energy point given by the boson mass and, therefore, are subject to the threshold ambiguity. They are precisely known in perturbation theory, including all possible gluon emissions up to O(α 4 s ), but it remains an uncontrolled uncertainty associated with the nearby thresholds for multi-hadron production. Fortunately, the numerical size of this effect is strongly suppressed by the heavy boson mass because Λ QCD /M Z ∼ 2 × 10 −3 .
A similar argument can be applied to σ(e + e − → hadrons) at very high energies. However, at low and intermediate values of s the resonance structure of the hadronic spectrum shows up. In order to smear the violations of duality, one then considers integrals of the hadronic invariant-mass distribution over the full energy range, from threshold up to a given value s max , high enough that perturbative methods are reliable. These finite-energy sum rules are double-inclusive observables and, using the operator product expansion (OPE) [6][7][8][9][10], can be computed with a much higher precision than the production cross section at fixed values of the hadronic invariant mass [11][12][13][14][15]. The experimental determination of the distribution of the final hadrons in e + e − annihilation has been considerably improved in recent years [16][17][18][19][20][21][22], with the goal to refine the dispersive Standard Model prediction of the muon anomalous magnetic moment [23] and the running of the electromagnetic coupling up to M Z . Thus, there exists an interesting data set which could be used to perform precision QCD tests. Unfortunately, the achievable accuracy is still limited by significant discrepancies among different experiments which are not yet fully resolved.
A very special role has been played by the inclusive τ hadronic width [24][25][26][27][28][29], which provides a very clean observable from both the experimental and theoretical points of view [30]. The tau mass is high enough to safely apply the OPE, non-perturbative corrections can be shown to be suppressed [27] and the perturbative contribution, which is known to O(α 4 s ) [31], is very sizeable (dominant) because α s (m 2 τ ) is large. Furthermore, violations of quark-hadron duality are heavily suppressed because this inclusive observable is given by an integral over the full hadronic invariant-mass distribution that, moreover, it is weighted by a kinematic factor with a double zero at the upper end of the integration range [27,29]. The small size of non-perturbative effects can be assessed through the study of additional weighted integrals of this spectral distribution [29]. The detailed experimental analyses performed by the ALEPH [32][33][34], CLEO [35] and OPAL [36] collaborations corroborated a long time ago the predicted suppression of non-perturbative contributions and established a quite precise determination of α s (m 2 τ ), which has been later updated with the O(α 4 s ) QCD corrections and improved experimental information [37][38][39][40].
A quite different strategy has been advocated in Refs. [41][42][43][44][45]. Instead of suppressing the unwanted violations of quark-hadron duality, these references analyse observables that are very sensitive to such effects with the aim of measuring the size of the duality violations. Analyses of this type could help to better understand the complicated infrared dynamics responsible for the observed differences between the low-energy hadronic world and its partonic description. However, what it is actually done is a rough phenomenological estimate of the duality-violation (DV) contribution to the chosen observable, which is then subtracted from its measured value in order to determine α s , assuming that perturbation theory gives a good description of the reminder. While this procedure is obviously not more precise than the actual theoretical control we have over the subtracted DV contribution, a surprisingly accurate determination of α s has been claimed. It was already demonstrated in Refs. [40,46,47] that the numerical value of the strong coupling obtained in this way is model dependent because it is fully correlated with the adopted functional form of the DV correction. Small changes on the assumed DV ansatz result in large variations of α s , which gets then converted into one additional model parameter.
Some arguments concerning the applicability of the OPE at the τ mass scale and the theoretically-admissible functional form of the DV ansatz have been put forward [48][49][50], trying to evade the conclusions of Refs. [40,46,47]. In this work we provide a much more detailed analysis that exhibits the intrinsic inconsistency of these arguments. We aim to clarify the subject by making as transparent as possible the implicit assumptions of the DV approach to the strong coupling. The numerical correlation between the fitted value of α s and the assumed DV ansatz can be easily understood. The DV algorithm turns out to determine α s at a quite low energy scale,ŝ 0 ∼ (1.2 GeV) 2 , from a theoretically subtracted integral of the τ decay distribution up toŝ 0 . The τ data in the energy bins abovê s 0 must be used to fit the ansatz parameters and calculate the DV subtraction, but the resulting value of this subtraction changes in a quite significant way with slight modifications of the DV ansatz, generating an uncontrolled systematic uncertainty on α s . Once the strong coupling and the DV parameters get fixed with a given ansatz, all perturbative and DV deformations introduced by the chosen model can only be reabsorbed into the power corrections. An incorrect value of α s needs to be compensated with unphysical values of the vacuum condensates (as many as observables) in order to reproduce the experimental moments of the τ hadronic distribution. As a result, the spread of α s values enforces a much larger spread of fitted OPE corrections and a significant loss of theoretical control, which in some cases can even induce pathological behaviours not required by any data. Those DV ansatzs that do not display such pathologies turn out to generate condensates of smaller size and values of α s (m 2 τ ) in agreement with the standard determination with pinched weights [40].
Violations of quark-hadron duality are interesting phenomena per se [51][52][53][54][55][56][57][58][59][60][61][62][63][64], so it is worthwhile to investigate their effects through quantitative tests. In the absence of a better understanding of confinement, achieving a rigorous description of DV corrections is a very difficult (may be hopeless) enterprise, but nevertheless, it is important to assess their phenomenological impact in low-energy determinations of the strong coupling. This is in fact the main motivation of the analysis that will be presented next, which attempts to provide a quantitative estimate of the uncertainties associated with DV effects.
The manuscript is organised as follows. In section 2 we briefly review the wellknown analyticity properties of current correlators that make possible to rigorously analyse weighted integrals of the measured hadronic distributions with the short-distance OPE. The main results of the exhaustive analysis of α s (m 2 τ ) determinations, performed in Ref. [40] with a broad variety of methodologies, are summarized in section 3, which collects different pieces of phenomenological evidence that will be used in the subsequent discussion. Section 4 anatomizes the DV method employed in [41][42][43][44][45], clarifying its assumptions and the adopted computational algorithm, and reproduces the numerical results of Ref. [45]. The sensitivity of this approach to the assumed functional form of the DV ansatz is studied in detail, exhibiting the very large (unaccounted) systematic uncertainties associated with our poor control of DV phenomena. In addition, this section discusses the applicability region of the inverse power expansion and points out the formal inconsistencies implicit in recent arguments against the truncation of the OPE, showing that those criticisms are inherently flawed. All these results are then used in section 5 to quantitative assess the actual impact of DV effects in the more standard determinations of the strong coupling presented in section 3. The estimated DV corrections are in this case well below the perturbative and non-perturbative uncertainties already considered in [40], demonstrating the robustness of the final extraction of α s (m 2 τ ). Some summarizing comments are finally given in section 6 that concludes giving our estimated value of α s (m 2 τ ) from the available τ data. We relegate to the appendix some complementary results, which are not crucial for the central discussion but expose the tautological nature of several tests within the DV approach.

Theoretical formalism
For the inclusive observables we are interested in, the QCD dynamics is encoded in the two-point correlation functions where J = V, A are the colour-singlet vector V µ ij =q j γ µ q i or axial-vector A µ ij =q j γ µ γ 5 q i quark currents (i, j = u, d, s . . .), and the superscripts denote the corresponding angular momentum J = 1 and J = 0 in the hadronic rest frame ( q = 0). For values of s ≡ q 2 ≤ m 2 τ , the spectral functions (absorptive parts) of these correlators are directly measured by the invariant-mass distribution of the final hadrons in τ decay [27]: where s th is the hadronic mass-squared threshold, us,A (s) (2.3) and the global factor S EW = 1.0201 ± 0.0003 accounts for the (renormalization-group improved) electroweak radiative corrections [65][66][67].
We will restrict our discussion to the Cabibbo-allowed hadronic distribution. Neglecting the tiny up and down quark masses, 1 s Π (0) ud,J (s) = 0. Therefore, the relevant dynamical quantities are the scalar correlators These correlators are analytic functions in the whole complex plane, except along the positive real s axis where their imaginary parts have discontinuities. Using a closed complex contour circumventing the physical cut, one gets the following mathematical identity for any weighted integral of the hadronic spectral functions ρ J (s) ≡ 1 π Im Π J (s) [27,29,68]: with ω(s) an arbitrary weight function without singularities in the region |s| ≤ s 0 . The integral on the left-hand-side is directly determined by the experimental data, while for sufficiently large s 0 values the OPE can be used to calculate the contour integral along the circle |s| = s 0 , as an expansion in inverse powers of s 0 . The observable R τ corresponds to the particular weight ω τ (x) = (1 − x) 2 (1 + 2x) = 1−3x 2 +2x 3 , with x ≡ s/s 0 and s 0 = m 2 τ that is expected to be large enough to safely apply the OPE. Neglecting the logarithmic running of the Wilson coefficients, Cauchy's theorem implies that the contour integral is only sensitive to OPE corrections with dimensions D = 6 and 8, which are strongly suppressed by the corresponding powers of m τ . In the total V +A distribution, there is in addition a strong cancellation between the vector and axial-vector power corrections, which have opposite signs [27,39,40]. The QCD contribution to R τ is dominated by the perturbative correction, which amounts to a large 20% effect because α s (m 2 τ ) ∼ 0.3 is sizeable. This explains the high sensitivity of this observable to the strong coupling.
In order to better analyze the different OPE contributions, it is convenient to particularize 2 Eq. (2.5) with a monomial weight ω n (x) = (s/s 0 ) n . Integrating by parts, From the first line, it follows that, for any n ≥ 0, The dominant perturbative D = 0 contribution to the different integrals is encoded in the associated Adler function, which is known up to four loops: where K 0 = K 1 = 1, while for n f = 3 quark flavours K 2 = 1.63982, K 3 = 6.37101 and K 4 = 49.0757 (MS scheme) [31]. One easily finds: Thus, the perturbative spectral function itself can be rewritten as an integral over complex angles. For the more inclusive moments A (n) pert (s 0 ), the integrand is zero at ϕ = −π, π, which are the (dangerous) angular values associated with the physical axis.
The perturbative integrals in Eqs. (2.11) and (2.12) can be computed in two different ways. One can either perform the contour integrations with a running coupling α s (−s), by solving numerically the five-loop β-function equation (contour-improved perturbation theory, CIPT) [28,69], or naively expand them in powers of α s (s 0 ) (fixed-order perturbation theory, FOPT). The CIPT prescription makes a re-summation of large higher-order corrections, generated by the long running of α s along the complex circle, which results in a slightly smaller perturbative contribution to A (n) J (s 0 ) than FOPT, for a given value of α s (m 2 τ ). Therefore, when solving the equality (2.5), CIPT leads to a slightly larger fitted value of α s . Strong efforts are currently being made aimed to improve our understanding of the perturbative series, e.g. see [70][71][72][73][74][75][76][77][78][79][80][81][82].
Weighting the spectral distribution with different functional dependences on s, one becomes sensitive to different power corrections in the OPE [27,29,68]. At LO in α s , the power correction O D,J is independent on the energy. QCD loops, which a priori cannot be ignored, spoil this behaviour and, at NLO, one has The factors P D,J determine the QCD running of the coefficients O D,J (µ). Their values cannot however be inferred from the O D,J (µ) evaluated at a single scale (in general they involve different nonperturbative vacuum matrix elements), but are suppressed with respect to O D,J (µ) by a power of α s . At this order, up to tiny light-quark mass corrections, one has (e.g. see [83]) O 2,J (µ) = P 2,J = P 4,J = 0 . (2.14) Performing the needed integrals, one finds The OPE is valid in the complex plane, away from the physical cut, which justifies its application in the contour integration except for the region near s 0 , the point where the circle touches the real axis. The so-called duality violations originate precisely from this small integration range where the OPE description is not precise. Fortunately, the R τ weight contains a double zero at the upper end of the integration range that strongly suppresses the numerical contribution from this dangerous region and, therefore, the corresponding violations of quark-hadron duality.
A quantitative definition of duality violations is provided by the differences between the physical values of the integrals A ω J (s 0 ) and their OPE approximations. Using again the analyticity properties of the correlators Π J (s), the size of these effects can be expressed in the form [41,52,59,84] ∆A ω the difference between the physical spectral function and its OPE expression. For largeenough values of s, the OPE provides the correct average value of ρ J (s), while missing the hadronic resonance structures that generate oscillations around this mean value. These differences decrease very fast when s increases, so that one may expect ∆ρ DV J (s) ∼ e −γs asymptotically. Therefore, the DV correction on the right-hand-side of Eq. (2.17) is completely dominated by the region of s values just slightly above s 0 . In fact, the relatively large oscillations of the spectral function at s 0 m 2 τ have a very minor numerical role in the integrals A ω J (s 0 ). Additionally, as it is well-known in the QCD literature [27,29,60,[85][86][87][88], taking weight functions that vanish at s 0 (pinched weights), one is then further minimizing the numerical impact of the unwanted DV effects.
3 Different strategies to obtain α s (m 2 τ ) From the measured invariant-mass distribution of the final hadrons in τ decays, Ref. [39] extracted the spectral functions ρ J (s) shown in Fig. 1. Together with the experimental data points, the figure displays the naive parton-model expectations (horizontal green lines) and the predictions of (massless) perturbative QCD for α s (m 2 τ ) = 0.329 (blue lines). Resonance structures are clearly visible at low s values, especially the prominent ρ(2π) and a 1 (3π) resonance peaks, but as the invariant-mass increases they are soon diluted by the opening of high-multiplicity hadronic thresholds, leading to much smoother inclusive distributions, as expected from quark-hadron duality considerations [2]. The flattening of the spectral function is remarkably fast for the most inclusive V + A channel, where perturbative QCD seems to work even at quite low values of s ∼ 1.2 GeV 2 .
An exhaustive re-analysis of the α s (m 2 τ ) determination was performed in Ref. [40]. The aim was to carefully assess all significant sources of non-perturbative systematic uncertainty. Table 1 summarizes the most reliable determinations, obtained with the total V + A spectral function. Compatible results, although with larger uncertainties, can be extracted from the separate V and A distributions. The different rows in the table correspond to different choices of pinched weights, with very different sensitivities to non-perturbative effects: (m 2 τ ) from τ decay data, in the V + A channel [40].
In this section we summarize the key points.

ALEPH-like sets of weights
The theoretical framework described in section 2 implies that the weighted integrals A (n) J (s 0 ) depend on a large number of unknown parameters: If these parameters were allowed to take arbitrary values, without any physics justification, one could fit any given set of A (n) J (s 0 ) inputs, independently of whether they correspond to actual measurements or are just fake data. As in any power expansion, the series need to be truncated in order to have predictive power, and this entails some theoretical notion about the natural size of their coefficients.
Given the relatively good behaviour of the perturbative Adler series, we take a very conservative range K 5 = 275 ± 400 for the unknown fifth-order coefficient, and assume that higher-order corrections are encapsulated by this variation. Since the known fifth-order coefficient of the QCD β function has already a negligible numerical impact on the results, we can safely disregard the unknown contributions from β m≥6 . In order to estimate the perturbative uncertainty, we supplement the K 5 variation with the residual dependence on the renormalization scale within the interval µ 2 /(−s) ∈ (0.5, 2).
On the other hand, it is obvious that there is an energy regime where the most relevant power correction comes from the operator of lowest dimension, irrespectively from whether it enters suppressed or not by short-distance QCD loops. The corresponding truncated prescription would correspond to keeping just the lowest-dimension contribution, disregarding whether it involves O D,J , P D,J or both.
In the ALEPH-like fits one assumes that power corrections are small enough so that only the lowest-dimensional condensates O D,J can have some impact on the observables at s 0 = m 2 τ . Thus, one neglects all P D factors and only the O D,J contributions with dimension smaller than D cut are taken into account. The original ALEPH fit adopts the truncation prescription D cut = 10, i.e., the higher-dimensional corrections from O D≥10,J are neglected. Additionally, one assumes that DVs are negligible for double-pinched weight functions at the τ mass scale. In general, this is expected to be a safe assumption. The sizable fluctuations of the spectral functions observed in Figure 1, which are expected to go to zero exponentially at large values of s, already have a negligible numerical role for those integrated moments in a rather large s 0 interval.
The first row in Eq. (3.1) shows the five weights employed in the ALEPH analysis. Although the resulting fit quality is good, there is some arbitrariness in this specific choice of weights and in the adopted truncation. Therefore, one must test the stability of the results under variations of the weight factors and analyze the uncertainties associated with the truncation of the OPE. The impact on α s from neglected condensates of higher dimensions has been estimated including O 10,J in the fit and taking the difference as an additional uncertainty. As far as experimental errors do not increase too much, and barring accidental (or artificial) fine-tuning, the size of the α s variation gives a good estimator of the systematic uncertainty due to truncation. This leads to the determination of α s (m 2 τ ) shown in the first row of Table 1 for both perturbative prescriptions, FOPT and CIPT. The values obtained with the two prescriptions have been finally combined, adding quadratically half their difference as an additional systematic uncertainty.
The second and third rows in Table 1 show the results obtained with the two alternative sets of weightsω kl and ω (2,m) , defined in the second and third rows of Eq. (3.1). Apart from leading to further not redundant self-consistence tests for α s , each set of weights brings a different asset. The former eliminates the kinematic (1 + 2x) factor of the ALEPH weights, nullifying any possible contribution of O 16,J and slightly reducing the potential impact of DVs. The second removes the contribution from D = 4. The three sets of weights give fits of excellent quality in the more inclusive V + A channel. The fitted values for the power corrections are always small and the α s determination is very stable (see Table 1). The very same value of the strong coupling is obtained from different combinations of weights, with very different sensitivities to the vacuum condensates.

Complementary tests
The observation made in the previous paragraph led us to make further tests in Ref. [40]. The role of power corrections appears to be rather marginal at s 0 ∼ m 2 τ . This suggests that perturbation theory alone, i.e., Eq. (3.2) with all power corrections neglected, may give a good description of the data, so that similar α s values would be obtained from different weights. Table 2 shows the fitted values for α s (m 2 τ ) obtained from a single moment, neglecting all non-perturbative contributions. The twelve different results correspond to twelve different choices of weights: While these numbers cannot be used in the final determination of the strong coupling, they do provide a useful assessment of the neglected corrections because each weight has a different sensitivity to these effects. The table exhibits an amazing stability of the results, which in all cases are well within the error ranges of our determinations in Table 1, suggesting that the missing non-perturbative contributions are most likely small.   Table 2. Values of the strong coupling extracted from a single A ω V +A (s 0 ) moment with weights ω (1,m) (x) or ω (2,m) (x), 0 ≤ m ≤ 5, at s 0 = 2.8 GeV 2 and neglecting all non-perturbative corrections [40].  [40]. Figure 3 compares two experimental moments, for the vector, axial-vector and 1 2 (V +A) distributions, with their perturbative predictions, ignoring all non-perturbative contributions. Perturbation theory gives an identical prediction for the three distributions; its variation within the range α s (m 2 τ ) = 0.329 + 0.020 − 0.018 , in FOPT and CIPT, is indicated by the coloured bands. The left plot corresponds to the weight ω (0,0) (x) = 1, i.e., a direct integration of the measured spectral function without any weight. This moment does not receive any leading-order OPE power correction, but it is more exposed to violations of quark-hadron duality. The experimental curves show indeed a beautiful signal of duality violations: a clear oscillation of the V and A curves in opposite directions that cancels to a rather large extent in the total V + A moment. The V + A curve exhibits a surprisingly smooth behaviour, remaining within the 1σ CIPT band even at low values of s 0 ∼ 1 GeV. The V , A and 1 2 (V + A) experimental moments nicely join above 2.5 GeV 2 , so that one can no-longer identify any duality-violation signal.
The right plot in Figure 3 corresponds to the weight ω (2,0) (x) = (1 − x) 2 . It clearly shows that the double-pinch factor has eliminated the visible signal of duality violations. Wiggles are no-longer present in any of the three curves. At the same time, it exhibits the presence of a clear (D = 6) power correction, with opposite signs in the V and A moments, which matches the behaviour expected from the OPE. However, this correction seems to be tiny at s 0 ∼ m 2 τ because the V , A and 1 2 (V + A) experimental curves join above 2.2 GeV 2 and, moreover, remain within the 1σ perturbative bands. In the higher energy bins, the numerical size of DVs and power corrections gets then masked by the much larger perturbative uncertainties.  2 (V + A) (red) channels. The left plot corresponds to the weight ω (0,0) (x) = 1, and the right one to ω (2,0) (x) = (1−x) 2 . The orange and light-blue regions are the CIPT and FOPT perturbative predictions for α s (m 2 τ ) = 0.329 + 0.020 − 0.018 [40]. The blue horizontal lines at the bottom indicate the parton-model prediction.

Determinations based on the s 0 dependence
Fitting the s 0 dependence of a single A (2,m) (s 0 ) moment, above someŝ 0 ≥ 2.0 GeV 2 , one can also extract the values of α s (m 2 τ ), O 2(m+2),J and O 2(m+3),J . The sensitivity to power corrections is poor, as expected, but one finds a surprising stability in the extracted values of α s (m 2 τ ) at differentŝ 0 . The fourth line of Table 1 combines the information from three different moments (m = 0, 1, 2), adding as an additional theoretical error the fluctuations with the number of fitted bins. Notice that this determination of the strong coupling is much more sensitive to violations of quark-hadron duality because the s 0 dependence of consecutive bins feels the local structure of the spectral function. The agreement with the other determinations shown in the table confirms the small size of duality violations in the V + A distribution aboveŝ 0 . 3 Weights with an extra exponential suppression e −ax , with a > 0, are also interesting for determining α s . As shown in Eq. (2.17), they clearly reduce DVs. Moreover, for small values of a 0.5, their induced power corrections are suppressed by a numerical factor a D/2 (D/2)! and, therefore, are not going to be larger than the previously neglected P D,J contributions, leading in principle to a free gain with respect to the a = 0 case. 4 Taking into account that power corrections appeared to have a marginal role at s 0 ∼ m 2 τ , in Ref. [40] we opted for taking the weights ω (1,m) a , defined in the last row of Eq. (3.1). They provide a completely different sensitivity to non-perturbative corrections because their exponential suppression nullifies the higher s region, strongly reducing the violations of quark-hadron duality, at the price of being more exposed to OPE contributions of arbitrary dimensionality. Performing a pure perturbative analysis, the neglected power corrections should manifest as large instabilities of α s under variations of s 0 and a = 0; however, stable results are found for a broad range of values of s 0 and a, which again indicates small power corrections. The last line in Table 1 combines the information extracted from seven different moments with 0 ≤ m ≤ 6.
The excellent agreement among all determinations shown in Table 1, obtained with a broad variety of approaches that have very different sensitivities to non-perturbative corrections, demonstrates the small numerical impact of these contributions.

Duality-violation approach to the strong coupling
We have now all the ingredients needed to analyze the strategy advocated in Refs. [41][42][43][44][45] and assess its advantages and weaknesses. The basic quantity being investigated is the integral constructed with the simplest weight factor ω 0 (s) ≡ ω (0,0) (s) = 1. Since there is no weight, the leading power corrections O D,J do not contribute to this particular moment. However, it does receive contributions from the subleading P D,J terms in the OPE and, moreover, it is not protected against duality violations. If one neglects all P D,J contributions, the moment A ω 0 J (s 0 ) only depends on the strong coupling and the DV correction. A sensible approach would be going to the most inclusive channel, V + A, and use the fact that, practically in the whole range where perturbation 3 Instead of fitting all energy points at the same time and inflate uncertainties based on the fluctuations in αs, we could have opted for taking a set of points with larger energy separation, removing to some extent the sensitivity to those DV fluctuations. However, the result would be essentially equivalent, since then we would have eventually averaged over the arbitrary selection of energy points, using finally the same amount of experimental information. 4 In practice the further suppression of P4,J suggests taking prefactors that nullify O4,J , such as ω(x) = theory makes sense, the local DV fluctuations have a very subleading role in the integral, as explicitly shown by the red data points in Fig. 3. The left panel in this figure exhibits indeed a very smooth s 0 dependence of the moment A ω 0 V +A (s 0 ). Since the fluctuations are expected to go to zero very fast when s 0 increases, their size in a large-enough interval can give a conservative assessment of DVs. Essentially this corresponds to the determination given before in the fourth row of Table 1 that nicely agrees with the determinations based on pinched weights, which are also shown in the table.
One may further insist in analyzing the separate V and A channels where sizeable oscillations are visible in Fig. 3. In that case, given the flattening of the purple and green curves in the higher energy bins, it would still be possible to assume that near the τ mass, before experimental uncertainties become too large, DVs are small. Remarkably, it is obvious from the figure that one would obtain a value of the strong coupling very close to the V + A one. However, the robustness of this isolated assumption is weaker.
In order to have a better control on DV effects, one may assume a functional form for the spectral function, to be fitted from data, and use Eq. (2.17) to measure the size of the duality-violation contribution to A ω 0 J (s 0 ). Let us then assume that the functional ansatz provides a reasonable description of ∆ρ DV J (s) above some invariant massŝ 0 . The combination of an oscillatory function with a damping exponential is assumed to describe the fall-off of duality violations at very high energies [54]. The ansatz adopted in [41][42][43][44][45] corresponds to the choice This four-parameter functional form is theoretically well motivated, but it cannot be derived from first principles and nobody really knows above which value ofŝ 0 it could start to be a good approximation. We have added the global factor G J (s) in order to assess later the stability of the results under slight variations of the assumed parametrization. With all these assumptions, the DV ansatz parameters and the strong coupling can be extracted from a fit to the s 0 dependence of the experimental moment A ω 0 J (s 0 ) exp . The algorithmic procedure involves essentially the following simple steps: 1. The ansatz parameters are fitted, bin by bin, to the s 0 dependence of τ . This is mathematically equivalent to a direct fit of ρ J (s) (the derivative of the integral of the spectral function), 5 as demonstrated in Eq. (2.9) and appendix A.2.2. 3. Since the strong coupling is largely insensitive to the local spectral function and the small correlation of A ω 0 J (ŝ 0 ) and ρ J (s) at s >ŝ 0 plays a very marginal role, α s is mostly extracted from , assuming that this difference is well described by perturbative QCD at the scaleŝ 0 .
The dangers of this prescription are quite obvious. Since one needs to employ enough energy bins to fit the ansatz parameters, the strong coupling is finally fixed at a very low scaleŝ 0 ∼ (1.2 GeV) 2 where theoretical errors are large and perturbative QCD is suspect. Moreover, the subtracted duality-violation contribution is a rather sizeable integral that has been estimated in a model-dependent way, without any study of its possible variation under reasonable modifications of the ansatz. Thus, A ω 0 J (ŝ 0 ) pert is determined from a difference of two numbers and its resulting precision is limited by the actual theoretical control on ∆A ω 0 J (ŝ 0 ). Once the ansatz parameters and α s (ŝ 0 ) have been determined, one can use different weights to determine the OPE vacuum condensates from the moments A (n) J (ŝ 0 ) exp . This is the only additional information available because the whole experimental data aboveŝ 0 has been already used in the previous fit. Again, the duality-violation corrections ∆A

Truncated versus non-truncated OPE
A rather surprising argument, based on rejecting the truncation of the OPE, has been put forward in Refs. [48][49][50], aiming to criticize the more standard determination of α s discussed in section 3 and to advocate the alternative use of a specific model of duality violations in non-protected moments with the algorithmic DV procedure described above.
The starting point consists in assuming a too small value for the strong coupling, obtained from a very unstable DV fit to A ω 0 V (ŝ 0 ) with the default ansatz in Eqs. (4.2) and (4.3). A too small (or too large) strong coupling leads to very poor perturbative predictions for all the moments, when directly comparing them with data. An ad-hoc way of curing it, without correcting the input value of α s , is by adding as many arbitrary model parameters as observables.
In moments with pinched weight functions, duality violations are found to be very suppressed, independently of their exact shape. Therefore, ∆A ω J (ŝ 0 ) cannot compensate an incorrect value assumed for α s . The proposed solution advocated in Refs. [48][49][50] consists then in keeping all (an infinite number) higher-dimension O D,J coefficients, arguing that they are very large (divergent series), while at the same time all P D,J corrections are neglected. Clearly, the first statement is incompatible with the second approximation.
In any phenomenological application of the OPE, one needs to assume from the very beginning that there is an energy regime where the inverse-power expansion makes sense. Otherwise, the theoretical OPE description would be meaningless, irrespective of whether one truncates or not the series. For a given dimension D, neglecting the P D,J correction with respect to the corresponding O D,J contribution is reasonable because P D,J carries an additional α s suppression, so that typically |P D,J | ∼ 0.2 |O D,J |. However, this suppression is largely compensated by the much stronger power suppression of the higher-dimension condensate contributions. The validity of the OPE requires that where η k ∼ 0.2 −1/(2k) ≈ 1 (η 4 ∼ 1.22, η 6 ∼ 1.14, η 8 ∼ 1.11, η 10 ∼ 1.08, . . . ). Obviously, the assumption |P D,J | |O D+2k,J |/s k 0 does not make any sense. In fact one of the many assumptions made in Ref. [45] to obtain their advocated values of the strong coupling and the corresponding condensates consists in neglecting all P D,J contributions not at s 0 = m 2 τ , but at the much lower scaleŝ 0 < 1 2 m 2 τ , while arguing that the O D≥10 corrections are too large to be neglected at s 0 = m 2 τ . The impact of neglecting the former with respect to the latter scales as 0.2 · 2 D/2 . A more explicit calculation shows that numerical pre-factors slightly damp this effect, but not nearly enough. Using Eq. (2.15), the neglected P D,J contributions to the moments with representative weights ω 0 = 1 and ω τ are:   Table 3.
Independently of any consideration about duality violations, the large values claimed for the condensates make the whole procedure inconsistent. Let us note that, as shown in section 3, data do not indicate any signal of large condensate corrections at s 0 ∼ m 2 τ . On the contrary, all tests performed there exhibit a smooth dependence on s 0 , suggesting a very well-behaved OPE even at lower energy values around s 0 ∼ 1.5 GeV 2 . Moment P 6,V +A P 8,V +A P 10,V +A P 12,V +A P 14,V +A P 16,V +A 20% partonic  Table 3. Estimated size of the neglected |P D,V +A | power corrections atŝ 0 = 1.55 GeV 2 , for the hypothetical divergent condensates advocated in Refs. [45,48], compared with the size of the perturbative contribution used to extract α s .

Modelling duality violations with the default ansatz
In order to assess the actual uncertainties of the DV procedure described before, we will first perform several fits with the default ansatz of Ref. [45], using the same ALEPH data [39]. Afterwards, we will investigate the robustness of these results by exploring how much they can change with small modifications of the assumed ansatz. For simplicity, we adopt here the FOPT prescription when evaluating the perturbative series. Very similar conclusions can be obtained with CIPT.
Fitting the A ω 0 V (s 0 ) exp moment with the ansatz in Eqs. (4.2) and (4.3), one finds the results displayed in Table 4. Although the strong coupling is basically determined by the difference A ω 0 V (ŝ 0 ) exp − ∆A ω 0 V (ŝ 0 ), atŝ 0 = 1.55 GeV 2 , its numerical value has been evolved to the usual reference scale m 2 τ .
α Table 4. Once α s and the ansatz parameters have been fixed, one can easily extract corresponding values for the OPE vacuum condensates from the experimental moments A (n) V (ŝ 0 ) exp , subtracting first the estimated DV contribution ∆A (n) V (ŝ 0 ). 6 The fitted central values are given in Table 5. This table shows also an approximate estimate of the corresponding axial condensates, which is good enough for our test purposes. At the chosen default valuê s 0 = 1.55 GeV 2 , no competitive information about α s is obtained in the axial channel with the DV approach, even accepting all the assumptions (see Fig. 4). In a combined fit, α s is then going to be fixed by the vector channel. Thus, we have speeded up the numerical algorithm by taking the central value of α s obtained in the vector channel to then fit the axial parameters. This suffices to obtain the corresponding axial condensates following the same procedure as for the vector one. The values obtained for both the DV parameters and the condensates are a good approximation of the corresponding ones given in Refs. [45,48].
In order to have a better feeling on the numerical role of the DV corrections, we give in Table 6 Table 5.  Table 6. Separate values of the OPE and DV contributions to A ωn V (s 0 ), together with the fitted experimental moments, for the relevant weights ω n (x) = x n and 1 − x n , using the default ansatz. The splitting is shown at the scalesŝ 0 = 1.55 GeV 2 and s 0 = 2.8 GeV 2 . the vector distribution with weights x n and 1 − x n . The table shows also the corresponding contributions at s 0 = 2.8 GeV 2 , using the same fitted parameters. Within this model set-up, the DV contributions are found to be very suppressed when using pinched weight functions, as expected. Rather strikingly, even for the more unprotected x n weights, the predicted DV effects turn out to be only at the level of experimental uncertainties at s 0 = 2.8 GeV 2 .
Instead of fitting the default ansatz with the assumed default value ofŝ 0 , we can change the latter, since its default choice is a priori not justified. This exercise, already done in Ref. [40] for the vector channel, is displayed both for the vector and axial distributions in Fig. 4. The left panels show, as a function ofŝ 0 , the value of α s (m 2 τ ) extracted from a fit to all s 0 bins with s 0 ≥ŝ 0 , in the vector (top) and axial (bottom) channels. The right panels display the associated p-values of the different fits, which indicate a rather low statistical quality and a strongŝ 0 dependence, specially in the vector case where the largest p-value is around 5% only. 7 The default choiceŝ 0 = 1.55 GeV 2 corresponds to the lowest value of α s . This ad-hoc choice was adopted in Ref. [45] with the argument that it has the largest p-value, but it is difficult to justify from the behaviour observed in the figure. Applying the same somewhat arbitrary criteria in the axial channel, this is, finding a local maximum in the p-value (which is larger in the axial channel) with not-too-large uncertainties, one  One may argue that it makes more sense to pick the solution atŝ 0 = 1.55 GeV 2 , away from the ρ and a 1 peaks, as more likely, but looking at the fit results, we do not really have a very strong justification to prefer the former solution over the latter, as assumed in Ref. [45] without noting the possible axial solution.

Sensitivity to the assumed ansatz
In Ref. [40] we already showed that there is a strong dependence of the fitted results with the assumed functional form of the ansatz. Inserting in Eq. (4.2) a multiplicative factor G J (s) = s λ J (GeV 2 units) and repeating the fit to the vector distribution with the samê s 0 = 1.55 GeV 2 and different values of the power λ V in the interval λ V ∈ [0, 8], we observed a very significant correlation between the input value of λ V and the fitted result for α s . Moreover, the outcomes from these fits show a pattern that can be summarized through the following properties: 1. The fit quality, as measured by the p-value, increases with the power λ V .
2. The fitted value of α s increases when the fit quality (λ V ) increases, approaching the results in Table 1. 3. All models reproduce well ρ V (s) in the fitted region. However, the default value λ V = 0 implies a spectral function that strongly deviates from data at s <ŝ 0 . As λ V increases, the ansatz slightly approaches the data below the fitted region.
4. The size of the DV correction ∆A ω 0 V (ŝ 0 ) decreases as λ V increases.

5.
The fitted values of the vacuum condensates decrease in a very significant way when the fit quality (λ V ) increases.
For completeness, we compile the details of this analysis in appendix B. The chosen functional form G J (s) = s λ J is of course completely ad-hoc, as it was the original default choice G J (s) = 1, but it demonstrates that the fitted results strongly depend on the assumed spectral-function model and, therefore, are unreliable. The orthodox DV practitioners could still argue [48] that the power dependence s λ J does not seem to comply with their expectations for the asymptotic behaviour of the spectral function at very large values of the hadronic invariant mass [64]. However, the fitted region of s 0 values is not really asymptotic. One could use instead a functional form G J (s) = 1 − a J /s, which incorporates the expected leading inverse-power correction at s → ∞, with very similar results. For any set of fitted parameters {λ V , α s , δ V , γ V , α V , β V } one can easily find an alternative set {a V , α s , δ V , γ V , α V , β V } that provides an equally good fit to the spectral function in the fitted region and exhibits the same strong correlation between the fitted value of the strong coupling and the remaining ad-hoc parameters.
A quick scan of possible ansatz variations reveals many possible solutions with very different behaviours. Let us just pick the following four illustrative examples (in GeV 2 units): The first one gives the fit with the highest p-value among the s λ V models analyzed in Ref. [40]. The second and third examples correspond to ansatz modifications giving, respectively, particularly larger and smaller values of the strong coupling, while having a functional form that satisfies the asymptotic behaviour assumed in Ref. [64]. The last example exhibits another possible solution, fully allowed by data, with the default choice G V (s) = 1 but selecting a higher value forŝ 0 . While at largerŝ 0 one may not be able to give a unique precise solution for the other ansatz parameters, this fact does not make  Table 7. Fitted values of the spectral function ansatz parameters and α them any less likely, as shown by giving the corresponding p-value. Let us note that if one assumed the default V solution in Eq. (4.8) to be the correct one, this would actually be the case for the axial channel, where one would need to rule out the apparent local solution atŝ A 0 = 1.30 GeV 2 and argue that the physical one must be at largerŝ A 0 , where data does not shed much light about the corresponding value of α s .
Let us start by giving in Table 7 the central values of the fitted DV parameters for the four different ansatz modifications, together with the reference values found before with the default choice. The four selected examples result in higher p-values than the default fit. These model variations imply changes of about ±10% in the fitted value of α s , with respect to the default set-up, which may serve as an approximate assessment of the uncertainty inherent to a specific model choice. The corresponding predictions for the vector spectral function are compared with the experimental data in Fig. 5, which exhibits their completely different behaviour outside the fitted region. This explains the sizeable splitting of their associated α s determinations. One can also observe that below the assumedŝ 0 points the convergence of the data to the DV models is actually worse than the convergence of the data to the OPE itself at the alternative reference point s 0 = m 2 τ . The corresponding vacuum condensates, calculated in the same way as before, 8 are given in Tables 8 and 9, for the vector and axial channels, respectively. From the tables, we observe how minor modifications in the ansatz can change the condensate values by several orders of magnitude. This instability is very easy to understand because, once α s and the ansatz parameters get fixed, the vacuum condensates are enforced to re-absorb all the perturbative (through a modified α s ) and DV deformations introduced by the different models, in order to approach the data. Tables 10 and 11 show the separate OPE and DV contributions to the moments with weight functions ω N (x) = x N andω N (x) = 1 − x N , respectively, for the four different ansatz variations and the default set-up, in the vector channel. The splitting of the two contributions is given both atŝ 0 and at s 0 = 2.8 GeV 2 . Atŝ 0 , the relative size of the DV corrections strongly depends on the assumed ansatz parametrization, specially for the unprotected x N weights, which gets translated into the large variations observed before in  the fitted values of the strong coupling and vacuum condensates. However, in all models, one observes again that pinched weights indeed suppress DVs at s 0 ∼ m 2 τ , and that one pinch (i.e. one single zero at s 0 ) is enough to put them below the experimental uncertainties in most cases.

Assessing the size of DV uncertainties in the V + A channel
Any determination of the strong coupling is affected by systematic uncertainties, originating in those effects which are not yet under full theoretical control, such as continuous extrapolation (in discretized computations), truncation of perturbation theory and/or the OPE, hadronization, duality violations, etc. They need to be estimated in a proper way, trying to avoid both naive underestimates and pessimistic overestimates.
Given a deviated strong coupling value as input, one can test how much one would need to inflate the initially assigned systematic errors to accommodate such a deviation. If a systematic uncertainty is well-estimated one should expect that the inflation leads to improbable scenarios, such as effective parameters acquiring values orders of magnitude off or crazy bumps in otherwise expected smooth functions. Otherwise the suggested inflation may be justified.
In our case, we can take the values of the strong coupling, α FOPT s (m 2 τ ) = (0.26 − 0.32), emerging from the different (vector) DV scenarios discussed in the previous section, together with their corresponding modelling of the spectral function, and check whether those values which deviate from the ones given in Table 1 lead indeed to solutions that do not make much sense.
The V + A spectral functions predicted by the different ansatz variations are compared with the data in Fig. 6. A discouraging feature for the use of all these models becomes evident. Their convergence to the data below the assumed pointŝ 0 is actually much worse than the convergence of the OPE itself around the reference value m 2 τ (and actually at any point within the plot region). Since the lack of an exact convergence of the OPE to the data was the original motivation to introduce DV corrections, one may wonder whether the poor behaviour exhibited by the assumed ansatzs justifies at all this modelling of duality violations. The same caveat can be observed with the extrapolation of the DV ansatzs at higher values of the hadronic invariant mass. In fact, both the default model and the variation 3 imply a rather implausible shape, with local DVs above m 2 τ considerably larger than even the one corresponding to the peak of the first axial resonance a 1 (1260). Taking into account the behaviour of the experimental spectral function in practically all the measured energy range and the large number of hadronic channels already opened at this energy, the additional bumps/dips predicted by these two models look rather unlikely. Looking back into Table 7, we realize that these are precisely the two models leading to too low values of the strong coupling.
This unphysical behaviour becomes more evident when we display the corresponding values of ∆A ω 0 V +A (s 0 ). This is done in Fig. 7, where the predicted DV contributions of the different models are compared with the corresponding "experimental" shapes of these quantities, i.e., with different behaviour is observed outside this region. To better visualize the implied patterns, we have ordered the different panels attending to the corresponding deviation of the strong coupling from the results given in section 3 (from larger to smaller deviation). The more α s deviates from the quoted uncertainties in Table 1, the more absurd is the shape displayed by the function ∆A ω 0 V +A (s 0 ). Obviously, the two Heaviside-like scenarios at the top of the figure (variation 3 and default ansatz) are very unlikely. They would imply a huge DV at m 2 τ , not required by any experimental fact, that needs to fall down abruptly to zero in order to be consistent with asymptotic freedom. We find natural to leave them outside the quoted uncertainties without any need of guessing what is the exact shape of the spectral function, which is beyond theoretical control.
However, a Heaviside-like convergence of ∆A ω 0 V +A (s 0 ) would not be enough to take α s outside our determination, since this kind of DV behaviour becomes very suppressed when using pinched weight functions. In this case, a deviated value of α s can only be compensated with huge fine-tuned nonperturbative condensates, as shown in Table 12 for two representative double pinched moments with the five ansatz set-ups. Neglecting the P J corrections, these A ω (2,n) V +A (s 0 ) moments 9 only receive OPE contributions from O 2(n+2),V +A and O 2(n+3),V +A . Both the default ansatz and the variation 3 need to incorporate huge OPE corrections at s 0 = m 2 τ in order to restore agreement with the data. As we argued above, this OPE scenario has no regime of validity or physical meaning and should be discarded. Indeed, for these two ansatzs the corresponding fine-tuning becomes totally unreliable if we go to the lower scaleŝ 0 , where the OPE was assumed to be valid. This is demonstrated in Table 13, which exhibits a completely crazy behaviour with individual OPE contributions much larger than the total perturbative correction. The OPE does not 9 Let us note that A ω (2,1) V +A (s0) is the moment associated to the tau decay width, Rτ .  Table 1. make any sense in these two scenarios. The other three DV ansatzs (variations 1, 2 and 4) do not exhibit any of these pathologies. They show an acceptable s 0 behaviour in Fig. 7, falling down smoothly at large values of the hadronic invariant mass, as expected. Moreover, their corresponding OPE contributions in Tables 12 and 13 have a reasonable size, consistent with the implicit assumption of negligible P J corrections near the τ mass. Not surprisingly, the values of α s implied by these three scenarios are in excellent agreement with our more solid determinations with pinched weights presented in section 3. In the three cases, α s lies within our estimated 1σ interval, showing that systematic uncertainties were indeed correctly assessed in Ref. [40].

Summary
We have addressed in a quantitative way the role of violations of quark-hadron duality in low-energy determinations of the strong coupling. This type of effects are unavoidably present in any hadronic observable, preventing an exact (infinite accuracy) theoretical description. Assuming confinement, inclusive observables provide the best possible playground to make precision physics in QCD, using the powerful OPE techniques. However, even there, small DV corrections show up, owing to the different threshold behaviour (multi-hadron versus multi-parton) of the two dual descriptions of the QCD spectrum.
In the absence of a rigorous understanding of confinement, one usually tries to minimize the DV contributions in order to achieve the best possible phenomenological accuracy. This can be done by working at large-enough energies and/or by smearing the observable cross sections over a suitable energy range. This second approach is compulsory at low and intermediate energies, where precise QCD predictions can only be made for integrated moments of the spectral hadronic distributions.
The vector and axial-vector spectral functions extracted from the invariant-mass distribution of the final hadrons in τ decays have made possible to perform accurate determinations of the strong coupling with a N 3 LO accuracy. We have reviewed the present status in sections 2 and 3, where the reasons why a high sensitivity to α s is obtained have been explained in detail. An important aspect of this nowadays classical determination is the strong suppression of DV contributions in spectral moments with pinched weights. This was actually one of the very first considerations made in the pioneering papers suggesting to extract α s from the observable R τ and related pinched moments [26,27,29].
In recent years, a different strategy has been suggested, advocating to model the oscillations observed in the experimental spectral functions with phenomenological ansatzs, and use them to quantify the DV corrections to non-protected (not pinched) moments. These ansatzs are elegantly motivated, but one should keep in mind that they correspond to particular hadronization models. An obvious question then arises, concerning how much the value of the strong coupling obtained with this procedure depends on the specific functional form assumed for the adopted ansatz. The clarification of this important question has been the main motivation of this work.
An exhaustive analysis of the DV-ansatz approach to α s has been presented in section 4. We have anatomized the employed algorithm, in order to make its implicit assumptions as transparent as possible. This has allowed us to show that all experimental data in the interval (ŝ 0 , m 2 τ ] are actually used to fit the parameters modelling the spectral function, while the wanted QCD information is mostly obtained from moments computed at the lowest energy pointŝ 0 . After subtracting the corresponding DV corrections, the strong coupling is finally extracted from A ω 0 J (ŝ 0 ) and the moments A (n) J (ŝ 0 ) provide the power corrections. This introduces two evident caveats: 1) the subtracted DV contributions are model dependent, since they have been computed with the particular ansatz that has been assumed, and 2) atŝ 0 ∼ (1.2 GeV) 2 the perturbative uncertainties are quite large and the unknown power corrections are unavoidably enhanced.
Using different functional forms for the hadronic ansatz, we have exhibited the very large sensitivity of the fitted value of α s on the assumed parametrization. To simplify the discussion, we have focused on four particular model variations, all of them having a better statistical quality (p-value) than the default model originally adopted in Ref. [45]. As shown in Table 7, these model variations imply changes of up to ±10% in the fitted value of α s . 10 Within any given model, the power corrections need to be adjusted to compensate a slightly incorrect value of α s (plus the model-dependent DV contributions), in order to reproduce the corresponding experimental moment A (n) J (ŝ 0 ). Therefore, the spread of α s values enforces a much larger spread of fitted OPE corrections, shown in Tables 8 and 9, indicating a dangerous loss of theoretical control. The strong correlation between the assumed functional form of the hadronic ansatz and the fitted values of α s and the different condensates shows that the resulting parameters constitute at best an effective model description with unclear relation to QCD. Figures 5 and 6 provide some enlightenment on what is actually happening with these fits. Although all analyzed models describe well the hadronic spectral function in the fitted region (ŝ 0 , m 2 τ ], they exhibit a quite different behaviour outside it. Belowŝ 0 the models deviate dramatically from data. Above m 2 τ , those models giving too low values of α s (and huge condensates) generate very large oscillations that seem unphysical. This is better appreciated in the V + A distribution, given in Fig. 6, where one can observe the implausible shape of these bumps/dips with local oscillations above m 2 τ that are larger in amplitude than the a 1 (1260) resonance. As explicitly shown in Fig. 7, these models generate a quite pathological s 0 dependence of the DV contribution ∆A ω 0 V +A (s 0 ), since this correction needs to be very large at s 0 ∼ m 2 τ to accommodate the associated α s value and, at the same time, must fall down very abrutly to match the expected asymptotic behaviour. The DV ansatzs that do not exhibit these pathologies turn out to generate fitted condensates of more reasonable size and values of α s in excellent agreement with the standard determination in Table 1.
The gained understanding on violations of quark-hadron duality has allowed us to go one step further and analyze the possible impact of this type of corrections in the standard determination of α s (m 2 τ ) with pinched moments. Tables 11 and 12 compare the predicted size of the DV contributions to different pinched moments with the corresponding OPE contributions and with the experimental values, around the scale m 2 τ . At this scale the estimated DV contributions turn out to be tiny in all cases, being much smaller than the OPE uncertainties, and always remaining below the experimental errors (except for the pathological variation 3). For the default ansatz assumed in Ref. [45], the obtained DV effects are completely negligible.
Taking all this into account, we conclude that systematic uncertainties have been correctly assessed in the standard determination of α s (m 2 τ ) reviewed in section 3. The dominant errors originate in perturbation theory itself. Therefore, at present, the τ decay data imply the values of the strong coupling summarized in Table 1

A Relations between observables and equivalent fits
In this appendix, we give trivial relations between observables, aiming to expose some redundancies in previous analyses with the aim of sorting the amount of meaningful information that one can extract from different fits.

A.1 Some generic fit properties
In a fit we typically have a set of n experimental points p i (i = 1, · · · n) with an associated covariance matrix V ij . For every p i we have a theoretical prediction t i (θ j ) that depends on m parameters θ j . Essentially, the result of a fit is supposed to give us the θ j values that best match the predictions t i to the measurements p i . The fit method should be invariant under linear transformations of the data points. For any invertible known matrix A, the fit should give the same θ j independently on whether we fit (p i , t i ) or (p i ,t i ) ≡ A ij (p j , t j ), whose associated covariance matrix isṼ = AV A T . This is trivially realized with the χ 2 (θ j ) function, Notice that the χ 2 function is ill-defined when the covariance matrix is singular. This can occur when the same data pointp i has been introduced twice. Since adding many times the same data point to a fit cannot change the fit result, the solution is straightforward: remove the redundancies.
Another condition that any fit should satisfy is the fact that adding a new data point (t n+1 , p n+1 ) dependent on an extra unknown parameter θ m+1 does not give us any information on the previous θ i or in the agreement of the theory with data. Indeed, when minimizing the χ 2 , the new parameter θ m+1 will simply adapt its value to exactly match t n+1 with p n+1 , leaving the χ 2 unmodified. 11

A.2 Explicit redundancies in several approximations
When dealing with experimental distributions, such as the ones from ALEPH [39], we work with a discrete spectrum. For a set of consecutive energy-squared values s i , which are the central energy points of bins with width∆ i , ending ats i ≡ s i +∆ i 2 , we have the measured spectral function ρ i ≡ ρ(s i ). Correlations among the different data points can be large and need to be taken into account.
The associated discrete integrals of Eq. (2.7) are simply given by where ∆ i ≡∆ i /s j . If we stick to a single energy points j , we can fit A (n) (s j ) for several monomial functions or combinations of them.

A.2.1 ALEPH-like fits
In the ALEPH-like fits one assumes that power corrections are small enough so that only the lowest-dimensional condensates O D have any impact on the observables. Thus, one neglects both the lower-dimensional P The small values obtained for the condensates give some illuminating information: the deviations from the purely perturbative predictions are very small for all moments. Nonetheless, if we are only interested in the value of α s , we can take into account the previous discussion in subsection A.1 and isolate the two independent linear combinations that, with our truncation choice, only depend on α s : . By construction, the fitted value of α s will be exactly the same as in the full fit, since the three additional points in (A.4) depend on three completely unknown parameters and, as remarked before, adding as many free parameters as data points does not give us any information about the fit quality or the previous parameter, α s . Analogously, including O 10 in the fit is equivalent to only using ω 1 for the α s determination. Taking the difference of both results, which is related to the quality of the fit, is a good assessment of the neglected higher-order power corrections, which appear enhanced by large prefactors for the relevant weight functions.
The corresponding weights for the fit without the kinematic factor (1 + 2x) arê Finally, for the A (2,m) moments, one has the analogous combinations While they are not uncorrelated, it is rather clear that these tests are not redundant, as one can explicitly check by observing the larger instabilities of the fits in the separate V and A channels [40].

A.2.2 Fit to the s 0 dependence
On the other hand, if we stick to a monomial function and make a fit to the s 0 dependence of the moment A (n) (s 0 ), it corresponds in the discrete version to fitting As explicitly discussed before, the fit must be invariant under linear transformations of the data points. It is then trivial that the previous fit is necessarily equivalent to a fit to Fitting the s 0 -dependence of A (n) (s 0 ) is exactly the same as fitting A (n) (s 0 ) in the initial point, plus the spectral function: one can trivially reproduce one set of data points from the other without any theoretical input. Note how the continuum version of this equivalency is simply given by Eq. (2.9). The slope of the moments is given by the spectral function. It is at this stage where we can demonstrate that the five different fits to the s 0 dependence of the vector channel made in Refs. [45,50] trivially reduce to Eq. (A.13) for n = 0, not giving any new information on α s or in the validity of their theory assumptions, as incorrectly claimed in those references. Defining x ≡ s/s 0 , let us consider each fit separately: • Fit 1: ω 0 (x) = 1. This corresponds to the equivalence we have just shown, for n = 0: (A.14) • Fit 2: ω 1 (s) = 1, ω 2 (s) = 1 − x 2 , ω 3 (s) = (1 − x 2 )(1 + 2x), One has: A trivial linear transformation gives which again reduces to: Removing the repeated data points, which do not carry any information and can only distort the fit, one gets A (0) (s j in ), A (2) (s j in ), A (3) (s j in ), ρ j in +1 , . . . , ρ j end . (A. 18) Taking into account (see discussion above) that in the working condensate approximation A (2) and A (3) are adding as many data points (2) as completely unknown parameters, O 6 and O 8 , the fit for the remaining parameters and for the test of the theory is exactly equivalent to a fit without those two points, leading to A (0) (s j in ), ρ j in +1 , . . . , ρ j end . (A.19) • Fit 3: ω 1 (s) = 1, ω 2 (s) = 1 − x 2 . This is a trivial variation of the previous one. which is the same as Fit 2.
Thus, the comparison among the results for α s obtained from these five different fits constitute a tautological test.

A.3 Other tautological tests
The s 0 -dependence of different moments has been claimed to provide an excellent consistency test of the DV-ansatz approach [48]. However, once A ω J (ŝ 0 ) and the experimental spectral function have been fitted with the parameters of the assumed hadronic model, all moments get determined in the fitted region. For instance, with the weights ω n (x) = x n , one trivially has the exact mathematical identity valid for any value of s 0 >ŝ 0 . The consistency plots shown in Ref. [48] only display a range of s 0 values in the fitted region [ŝ 0 , m 2 τ ], where the agreement with data is guaranteed by Eq. (A.23), since both A ω J (ŝ 0 ) and ρ J (s) have been fitted to data. Any hadronic model would exhibit the same excellent agreement, provided that it fits well the data, independently of the numerical value of α s emerging from it. Therefore, this type of plots are only testing the statistical quality of the multi-parameter fit to the spectral function, and do not provide any information about the actual relation of the assumed ansatz with QCD.
In order to learn something about the ansatz itself, one should compare the model predictions with the data below the fitted region; however, such comparison is never shown. From Figs. 5, 6 and 8, it is evident that this exercise would exhibit a poor behaviour, instead of the claimed excellent performance of the DV model. Let us stress once again that, in contrast with the assumed approximate convergence to the OPE at s 0 = m 2 τ , used in the standard extraction of the strong coupling, the DV-ansatz approach relies on the (exact, since no uncertainty at all is assigned) validity of the hadronic model in the whole energy range fromŝ 0 = 1.55 GeV 2 to s 0 = m 2 τ . This type of plots has also been used as a mean to demonstrate hypothetical failures of the OPE, by zooming scales and not displaying any error bars for the theoretical curves, when being compared to data points that have not been explicitly fitted. One usually displays differences such as A ω J (m 2 τ ) − A ω J (s 0 ) or even double-differences, subtracting the corresponding experimental quantities, in order to magnify the claimed disagreement. As already discussed in section 3, in the standard determination of the strong coupling, the relevant power corrections turn out to be too small to be clearly identified at s 0 ∼ m 2 τ because they get masked by the much larger noise of the perturbative uncertainties. Therefore, the fitted values of the condensates have rather large errors, but their impact on α s is small and has been carefully (and conservatively) assessed. However, if one plots the s 0 dependence of A B Results from DV fits with G J (s) = s λ J Ref. [40] already analyzed the sensitivity of the DV fits to the vector distribution with the ansatz (4.2), using G J (s) = s λ J (GeV 2 units) and different values of λ V between zero and 8, while keeping the ad-hoc choiceŝ 0 = 1.55 GeV 2 . We reproduce in Table 14 the fitted values of the strong coupling and the four ansatz parameters, together with the p-values of each fit. These results exhibit a very strong correlation between the input value assumed for λ V and the output value of α s (m 2 τ ). The worse fit (p-value) corresponds to the default choice λ V = 0 and leads to the smallest α s . As λ V increases, the fit quality improves, while the strong coupling slowly approaches its reference value discussed in section 3. Fig. 8 compares the vector spectral function predicted by the different fitted ansatzs with the experimental data. All models reproduce well ρ V (s) in the fitted region of invariant masses (1.55 GeV 2 ≤ s ≤ m 2 τ ), but they fail badly below it. The worse behaviour is obtained with the default model (λ V = 0). When λ V increases, the predicted spectral function slightly approaches the data below the fitted range, while the ansatz parameters p-value (% ) 0 0.298 ± 0.010 3.6 ± 0.5 0.6 ± 0.3 −2.3 ± 0.9 4.3 ± 0.5 5.3 1 0.300 ± 0.012 3.3 ± 0.5 1.1 ± 0.3 −2.2 ± 1.0 4.2 ± 0.5 5.7 2 0.302 ± 0.011 2.9 ± 0.5 1.6 ± 0.3 −2.2 ± 0.9 4.2 ± 0.5 6.0 4 0.306 ± 0.013 2.3 ± 0.5 2.6 ± 0.3 −1.9 ± 0.9 4.1 ± 0.5 6.6 8 0.314 ± 0.015 1.0 ± 0.5 4.6 ± 0.3 −1.5 ± 1.1 3.9 ± 0.6 7.7 Table 14.  [40] adapt themselves to compensate the growing at high values of s with the net result of a smaller duality-violation correction. The large variation in the output value of α s obtained from the different fits gets obviously reflected in the fitted values of the power corrections that need to adapt themselves in order to reproduce the corresponding experimental moments A (n) V (ŝ 0 ) with a different α s . This is shown in Table 15, which compiles the values of the condensates O D≤16,V obtained with the different choices of λ V . The observed changes are indeed very large, and even the signs get modified in some cases. The absolute size of the condensates decreases in a very sizable way when λ V (and α s ) increases, except for O 16,V . However the most important result from this exercise is the very strong model dependence of the fitted parameters, which are void of any physical meaning.  Table 15. Fitted values of the OPE condensates in GeV units, with FOPT andŝ 0 = 1.55 GeV 2 , for different values of the power λ V .