Higher-order QCD corrections to hadronic $\tau$ decays from Pad\'e approximants

Perturbative QCD corrections to hadronic $\tau$ decays and $e^+e^-$ annihilation into hadrons below charm are obtained from the Adler function, which at present is known in the chiral limit to five-loop accuracy. Extractions of the strong coupling, $\alpha_s$, from these processes suffer from an ambiguity related to the treatment of unknown higher orders in the perturbative series. In this work, we exploit the method of Pad\'e approximants and its convergence theorems to extract information about higher-order corrections to the Adler function in a systematic way. First, the method is tested in the large-$\beta_0$ limit of QCD, where the perturbative series is known to all orders. We devise strategies to accelerate the convergence of the method employing renormalization scheme variations and the so-called D-log Pad\'e approximants. The success of these strategies can be understood in terms of the analytic structure of the series in the Borel plane. We then apply the method to full QCD and obtain reliable model-independent predictions for the higher-order coefficients of the Adler function. For the six-, seven-, and eight-loop coefficients we find $c_{5,1} = 277\pm 51$, $c_{6,1}=3460\pm 690$, and $c_{7,1}=(2.02\pm0.72)\times 10^4$, respectively, with errors to be understood as lower and upper bounds. Our model-independent reconstruction of the perturbative QCD corrections to the $\tau$ hadronic width strongly favours the use of fixed-order perturbation theory (FOPT) for the renormalization-scale setting.


Introduction
The precise determination of the strong coupling, α s , is a key ingredient for calculations of all processes involving perturbative Quantum Chromodynamics (QCD) and represents a fundamental test of the internal consistency of the Standard Model. The value of α s , together with the mass of the top quark, plays, for example, a crucial role in the fate of the Standard Model vacuum [1]. The extraction of α s from hadronic τ decays [2][3][4][5] (and also from e + e − → (hadrons) below charm [6,7]) is of special interest for two reasons. First, because it is done at relatively low energies, close to the limit of validity of perturbative QCD. Therefore, the evolution of α s from the τ mass scale to the Z mass scale represents one of the most non-trivial tests of asymptotic freedom [8] as predicted by the QCD β-function [9][10][11][12]. Second, this determination of α s (m Z ) is competitive, since the running reduces the size of the relative error.
However, theoretical uncertainties still affect the determination of α s from these processes. At and around the τ mass, perturbative QCD is still valid, but non-perturbative effects become non-negligible. These effects are smaller by a factor of about ten when compared to the perturbative QCD contribution, but must be taken into account carefully [3,13]. They are encoded in the condensates of the Operator Product Expansion (OPE) and the related contributions from violations of quark-hadron duality -or simply duality violations (DVs) [14][15][16][17].
Another important source of uncertainty stems from the renormalization-scale setting in the perturbative contribution. Theoretically, the decay τ → (hadrons) + ν τ is expressed in terms of a weighted integral of the hadronic spectral functions that runs over the hadronic invariant mass squared from threshold up to m 2 τ . Since perturbative QCD cannot be trusted at low energies, one resorts to Finite Energy Sum Rules (FESRs) to relate this integral to an integral along a closed contour in the complex plane with |s| = m 2 τ . In this process, a procedure must be adopted to set the renormalization scale. The two most commonly employed procedures are known as Fixed Order Perturbation Theory (FOPT) [18] and Contour Improved Perturbation Theory (CIPT) [19,20] (they are discussed in more detail below). The two represent different ways of treating the unknown higher orders in perturbation theory, and lead to different perturbative series and, therefore, to different values of α s . This remains true in the analogous extraction of α s from e + e − → (hadrons) below charm although, numerically, the difference between the two procedures is smaller in that case [7].
The elimination of this ambiguity is inherently difficult because it requires knowledge about higher orders of the perturbative expansion. At present, the perturbative QCD expansion of the Adler function in the chiral limit is known up to α 4 s thanks to the five-loop computation of Ref. [21,22] and it is unlikely that the result at six loops will be available anytime soon [23]. In the absence of exact calculations for the higherorder coefficients, one must tackle this problem with methods that allow for a partial reconstruction of the series based only on the available information.
The general structure of the perturbative series is assumed to be known. It is an asymptotic series (therefore divergent) with coefficients that grow factorially. The divergent behaviour of the series is governed by renormalons: singularities along the real axis of the Borel transformed series [24]. The position of these singularities is known since they are related to the dimension of operators that participate in the OPE of the relevant correlator. The exponents of the singularities are related to the anomalous dimension of the same operators and can, in principle, be calculated. On the other hand, nothing, essentially, is known about the residues of the singularities.
This partial knowledge has been used to construct realistic representations of the full series by approximating its Borel transform, which has an infinite tower of singularities, by a small number of dominant ones [18,25]. These models for the Borel transformed series are, in some cases, a type of rational or Padé approximant. 1 Motivated by this observation, in this paper we investigate, systematically, the use of Padé theory [26][27][28][29] to reconstruct the Adler function, which governs hadronic τ decays and the cross section of e + e − → (hadrons). One of the main advantages of the use of rational approximants, as compared to the so-called renormalon models of Refs. [18,25], is that they can be made model independent. Additionally, in some well defined cases, theorems guarantee the convergence of a sequence of approximants to the function of interest.
In the past, rational approximants have already been used in the context of τ decays [30]. In particular, the observation that their convergence can be improved when one uses the Borel transformed series, as opposed to the series in α s , was already made. (This procedure is sometimes referred to as "Padé-Borel method" [31].) At that time, however, the perturbative series was known to one order less, only up to α 3 s . Moreover, the connection with renormalons in applications to τ decays was not made explicitly 2 . Here, this connection is established and we are able to use different types of Padé approximants (such as partial Padé approximants) thanks to the available knowledge about the renormalon singularities.
To validate our approach, before applying Padé approximants (PAs) to full QCD, we will work within the large-β 0 limit. In this limit, one obtains all the corrections with highest power of N f at every given α s order in the perturbative expansion (with N f being the number of light-quark flavours). The application of this procedure to τ decays generates an asymptotic series in α s that is known to all orders, and the result for the Borel transformed Adler functionn can be written in a compact form [33,34]. This Borel transform is a meromorphic function in the complex plane: it has a finite radius of convergence and an infinite number of renormalon poles along the real axis, but no branch cut. Therefore, the theory of PAs to meromorphic functions, and in particular Pommerenke's theorem, apply [26,35]. Apart from the standard PAs, we will consider several strategies for accelerating the convergence of the approximation. First, we will exploit the renormalization scheme dependence of the perturbative series, following Ref. [36], to optimize the convergence of the Padé approximants to exactly known Adler function. Additionally, we will consider partial Padé approximants (which exploit the available knowledge about the renormalon singularities). We will also employ D-log Padé approximants [26] which can also be very effective in approximating functions with branch cuts. Finally, we investigate the application of the different PAs to the FOPT expansion of the QCD corrections to the τ hadronic width. This series has a much simpler analytic structure in the Borel plane, which leads to coefficients that follow a more regular pattern, and is more amenable to approximation by rational functions. From these approximants one can easily perform an indirect reconstruction of the Adler function that is very reliable and requires little information.
The systematic study performed in large-β 0 and the lessons we learn from this limit are used as the basis for the QCD analysis. In the case of full QCD, the structure of the Borel transformed Adler function is more involved, since the poles become branch cuts [24]. From renormalization scheme variations we find indications that in QCD the leading UV renormalon is suppressed with respect to large-β 0 and, as a consequence, the sign alternation of the series is probably postponed. We will then show that, as in large-β 0 , it is advantageous to consider Padé approximants to the FOPT expansion of the corrections to the τ hadronic width. From these approximants, we are able to obtain reliable model-independent predictions for the higher-order coefficients of the Adler function, together with an estimate of their uncertainty, and extract an estimate for the ressumed value of the perturbative QCD corrections to hadronic τ decays. We are also in a position to discuss the renormalization group improvement of the series and we show, from our reconstruction of high orders, that FOPT is strongly favoured in QCD. The systematic use of PAs lead to results that are model independent and that have significantly smaller errors than results obtained from other methods.
Our work is organized as follows. In Sec. 2 we describe the essentials about the QCD description of hadronic τ decays and, in Sec. 3, we collect the main facts about Padé theory relevant to this work. Then, in Sec. 4, we apply the Padé approximants to the large-β 0 limit of QCD. The results in full QCD are presented in Sec. 5 and the conclusions are given in Sec. 6.

Perturbative QCD in hadronic τ decays
In the study of perturbative QCD corrections to hadronic τ decays and e + e − → (hadrons) below charm the central object is the Adler function in the chiral limit (defined below). From the knowledge of its expansion one can derive the corrections to the τ hadronic width [18] as well as the perturbative expansion of R(s) for e + e − → (hadrons) below charm [7]. In the spirit of being self-contained, we review here the main aspects of the theoretical description of hadronic τ decays in QCD. (We refer to Refs. [2,18,40] for further details.) Although we frame our discussion in the context of τ decays the application to e + e − → (hadrons) is straightforward and is, essentially, a matter of normalization to reflect the fact that the current in that case is the electromagnetic one. For the details regarding this normalization we refer to Ref. [7].
The decay rate of τ → (hadrons) + ν τ can be separated experimentally into three components: a vector and an axial-vector, due to decays mediated by the light-quark ud current, and strange contributions, arising from theūs current. These decay rates, normalized to the width of τ → eν e ν τ , are denoted R τ,V , R τ,A , and R τ,S , respectively. When extracting α s it is convenient to work only with light quarks, because corrections proportional to the mass of the quarks can then be safely neglected. Here, we restrict ourselves to the non-strange channels, precisely for this reason. Then, the different corrections to the partonic result can be parametrized as where S EW and δ EW are small electroweak corrections and V ud the CKM matrix element; the unity, in between square brackets, is simply the partonic result. The first correction, δ (0) , is the perturbative QCD part, which is the dominant contribution (∼ 20%). Nonperturbative contributions, encoded in δ NP , are significantly smaller and contain both OPE condensates and DVs. In this work, we focus on the perturbative QCD part, δ (0) , that we discuss in more detail below. The relevant quantity that governs R τ,V /A are the correlators where |Ω represents the physical vacuum and the currents are J µ V /(A) (x) = (ūγ µ (γ 5 )d)(x). These correlators can be decomposed into transverse, Π V /A (s), parts in the usual way (with s = p 2 ). The decay rate can be expressed in terms of integrals over the spectral functions, 1 π ImΠ J V /A (s), that run from s = 0 to s = m 2 τ [2,18]. These integrals, on the theory side, are problematic because perturbative QCD is not valid at low energies. To circumvent this problem, one resorts to a FESR that exploits the analyticity properties of the correlators. The quantities R τ,V /A can then be expressed as an integral in a closed contour in the complex plane with fixed |s| = m 2 τ . Explicitly, for the perturbative contribution, one finds [18] with x = s/m 2 τ and where the weight function W (x), determined by phase space, is pert is the perturbative contribution to the reduced Adler function defined as where the Adler function itself, D (1+0) , is obtained as the logarithmic derivative of the combination Π (1+0) (s) as The Adler function is a physical object in the sense that it does not contain subtraction constants that depend on the renormalization conventions. This function is central to our work. The perturbative expansion of D starts at O(α s ) and can be cast as where L = log(−s/µ 2 ) and a µ = α s (µ)/π. In this expansion, the only independent coefficients are the c n,1 ; the others can be obtained imposing renormalization group (RG) invariance, and are expressed in terms of the c n,1 and β-function coefficients [18,41]. The logarithms can be summed with the scale choice µ 2 = −s ≡ Q 2 giving where r n = c n+1,1 /π n+1 . With this definition, the perturbative expansion of the reduced Adler function with the choice µ 2 = Q 2 then reads (for N f = 3, MS scheme) 4 D(Q 2 ) = a Q + 1.640 a 2 Q + 6.371 a 3 Q + 49.08 a 4 Q + · · · , from which the numerical values of the known independent coefficients c n,1 of Eq. (7) can be immediately read off. The perturbative series of Eq. (6) is divergent and one assumes that it must be an asymptotic expansion to the true value of the function being expanded [24]. To study the perturbative contribution to the Adler function, and in particular its renormalon content, it is therefore convenient to work with the Borel transformed series, which can have a finite radius of convergence, defined as The original expansion, in turn, can be understood as an asymptotic series to the inverse Borel transform provided that the integral exists. The last equation defines the Borel sum of the asymptotic series. The divergence of the original series, D, is translated into singularities in the t variable. Two types can be distinguished: ultraviolet (UV) and infrared (IR) renormalons. The UV renormalons lie on the negative real axis and contribute with sign alternating coefficients. IR renormalons are singularities on the positive real axis which contribute with fixed sign coefficients. The latter obstruct the integration in Eq. (10) and generate an ambiguity in the inverse Borel transform which is expected to cancel against power corrections of the OPE. The position of the singularities in the t plane can be determined with general renormalization group (RG) arguments. They appear at positive and negative integer values of the variable u ≡ β 1 t 2π (except for u = 1), where β 1 is the leading coefficient of the QCD β-function. 5 The UV renormalon at u = −1, being the closest to the origin, dominates the large order behaviour of the series, which must, therefore, be sign alternating at higher orders. As seen in Eq. (8), this sign alternation is still not apparent in the first four coefficients of the QCD expansion in the MS scheme, which are known exactly.
Finally, to obtain the perturbative corrections to R τ,V /A one needs to perform the integral in Eq. (3). In the process, one needs to adopt a procedure in order to set the scale µ, which enters, implicitly, through Eq. (6). A running scale, µ 2 = Q 2 , as done in Eq. (7), gives rise to the aforementioned Contour-Improved Perturbation Theory (CIPT), where the running of α s along the contour is resummed to all orders. In this case, δ (0) can be written as Another option is to employ a fixed scale µ 2 = m 2 τ , which gives rise to Fixed Order Perturbation Theory 6 . Then, because α s is evaluated at a fixed scale, it can be taken outside the contour integrals, which are performed over the logarithms that appear in Eq. (6) as FO can also be written as an expansion in the coupling where the coefficients d n depend then on c n,1 , on the β-function coefficients, and on the integrals J FO n . In QCD, this expansion reads, for N f = 3 and in the MS scheme, where we give the numerical result of the known contributions to the first unknown coefficient. 5 We define the QCD β-function as in Ref. [18] In τ decays, using δ (0) CI to extract α s (m 2 τ ) one obtains results about 5% larger than those obtained from δ (0) FO [3]. (This difference is reduced to about 2% when α s (m 2 τ ) is extracted from e + e − → (hadrons) [7].) The elimination of this ambiguity would therefore contribute to the extraction of α s around m 2 τ with smaller uncertainties.

Elements of Padé theory
Let us consider a function f (z) that assumes a series expansion in the complex plane around z = 0 A Padé approximant (PA) to f (z) [26], denoted P M N (z), is defined as the ratio of two polynomials in the variable z of order M and N , Q M (z) and R N (z), respectively, with the definition R N (0) = 1. This approximant is said to have a "contact" of order M + N with the expansion of the function f (z) around the origin of the complex plane: the expansion of P M N (z) around the origin is the same as that of f (z) for the first M + N + 1 coefficients From the reexpansion of the approximant P M N (z) one can read off an estimate for the coefficient f M +N +1 , the first that is not used as input [28]. Estimates of this type will be of special interest in this work.
The successful use of Padé approximants to obtain quantitative results about the function f (z) requires only a qualitative knowledge about the analytic properties of the function. The PAs can also be used to perform a reconstruction of the singularity structure of f (z) from its Taylor expansion. Convergence theorems exist for the cases of analytic and single-valued functions with multipoles or essential singularities [26]. Even for functions that have branch points the PAs can be used, in many cases, successfully. In these cases, for increasing order of approximation, the poles of the PAs tend to accumulate along the branch cut, effectively mimicking the analytic structure of the function [26].
We will focus on sequences of Padé approximants with N = M + k, for a fixed value of k. For k = 0, the PAs P M M +k define a near diagonal sequence while the case k = 0 defines the diagonal sequence. Pommerenke's Theorem [35] then guarantees that a sequence P M M +k to a meromorphic function is convergent in any compact set of the complex plane, except in a set of zero area that includes the poles of the function f (z), where even the original function is not well defined. If there are other nuisance poles in the approximant, the theorem requires that they move away from the region as soon as M grows, or appear in combination with a nearby zero, which is called a defect or Froissart doublet [26]. In contrast, poles that are present in f (z) tend to be relatively stable as one increases the order M .
In this paper, most of the times, the role of the function f (z) is played by the Borel transform of the Adler function, defined in Eq. (9). A key feature of the Borel transform, as already discussed, is its singularities along the real axis, the renormalons. It will be of interest to us to study how this singularity structure is mimicked by the PAs. It is important to note that when f (z) is a general meromorphic function some of the poles (and residues) of the approximant P M N (z) may become complex, even though the original function has no complex poles. 7 Such poles cannot be identified with any of the renormalon singularities, but they do not prevent the use of P M N (z) to study the function away from these poles. In fact, in the process of approximating a function with an infinite number of poles by an approximant that contains only a handful of them, the appearance of these extraneous poles is expected to happen [27].
In the case at hand there is some available knowledge about the singularities of the functions being approximated, which are precisely the renormalon singularities of the Borel transformed Adler function. It may be desirable to construct approximants that exploit this knowledge. If a set of poles of the function are known, one can construct a so-called partial Padé approximants (PPA) [48] defined as .
The polynomial T K (z), of order K, is constructed such as to have K zeros at the exact location of the first K poles of f (z). The coefficients of the polynomials are again fixed by matching to the Taylor expansion of f (z), in the same way as for the PAs. Again, for a general meromorphic function, complex-conjugated poles may appear in the PPAs. The extreme case N = 0 results in a Padé-type approximant, an approximant with the whole denominator given in advanced, less expensive in terms of Taylor coefficients than a PA or a PPA. The approximation of functions with branch points and cuts -as is the case for the Borel transform of the Adler function in QCD -is more subtle. In this case, a possible strategy is the manipulation of the series to a form which is more amenable to the approximation by Padés. Let us consider the particular case of a function f (z) = A(z) (µ−z) γ + B(z) with a cut from µ to ∞ with exponent γ and a reminder B(z) with little structure (both A(z) and B(z) are to be analytic at z = µ). Following the method of Baker called D-log Padé approximant [26], we can form PAs not to f (z) but to which turns out to be a meromorfic function to which the convergence theorem applies. The use of appropriate Padé approximants to F (z) determines in an unbiased way both the pole position, z = µ, and the residue, −γ, which corresponds to the exponent of the cut of f (z). No assumption about neither µ nor γ is made; they are determined directly from the series coefficients. The approximation of F (z) by a PA yields an approximant for f (z) that is not necessarily a rational function. To be more specific, the Dlog-PA approximant to f (z) obtained from using P M N to approximate F (z), that we denote where P M N (z) = Q M (z) R N (z) is the aforementioned PA to F (z). Due to the derivative in Eq. (18), the constant f (0) is lost and must be reintroduced in order to properly 7 When the meromorphic function is of the Stieltjes type the poles will always be along the real axis. The functions we approximate in this work are not of this type. We will discuss this in more detail in the remainder. normalize the Dlog M N (z). In practice, the non-rational approximant Dlog M N (z) can yield a rich analytical structure, in particular the presence of branch cuts -not necessarily present in the function f (z) -is to be expected.
In case the branch point would be known in advanced, one can form what we will call partial D-log Padé approximants. They consist in forming Padé approximants to for an assumed value of µ, which would yield a prediction for γ by evaluation of the approximants around z = µ. The approximant to f (z) entailed by this procedure will be denoted Dlog M N (z; µ) and is given by One should remark that in Eqs. (20) no assumption is made about γ. The method was originally designed to be used in the presence of branch points, but if γ is an integer it can also work very well, as we show in Sec. 4.1.4. In summary, the Padé approximants P M N (z) can be viewed as an economic and completely model-independent procedure, since all the poles are left free and no analytical information about the singular structure of the function needs to be included. They are, however, expensive in terms of series coefficients. In order to fasten the convergence range, the use of PPAs P M N,K improves the results but requires knowledge about the singularities of the function f (z). Such singularities may be determine by PAs or by external information.
D-log Padé approximants, in turn, offer the possibility to exploit the Padé theory for functions with mutlipoles and branch cuts at the expense of losing, due to the required derivative, the first Taylor coefficient. Finally, the partial D-log Padé approximant such as Eq. (21), provides further improvement but requires knowledge about the position of the singular points. At the end of the day, for each case of interest, the PA practitioner shall decide for the best strategy. As we will show in the next section, a sequential study using the different aforementioned approximations is the optimal way to extract information about unknown Taylor coefficients and about the singular structure of the objective function.
4 Padé approximants in the large-β 0 limit A good laboratory for the strategy we present here is the so called large-β 0 limit of QCD. Results in this limit are obtained by first considering a large number of fermion flavours, N f , keeping α s N f ∼ 1. In this framework, the qq bubble corrections to the gluon propagator must be resummed to all orders. Using this dressed gluon propagator one can then compute all the corrections with highest power of N f at every α s order to a given QCD observable [24]. The results in large-β 0 are obtained by replacing the N f dependence by the leading QCD β-function coefficient (β 1 in our notation) which incorporates a set of non-abelian gluon-loop diagrams. Accordingly, the QCD β-function is truncated at its first term. 8 In this limit, the Borel transform of the reduced Adler function, defined in Eq. (9) can be written in a closed form as [24,33,34] where the scheme parameter C measures the departure from the MS, which corresponds to the choice C = 0. 9 In the conventions of Eqs. (7) and (9) we have u = β 1 t/(2π) with β 1 = 9 2 (for N f = 3). The result clearly exhibits the renormalon poles, both the IR, that lie along the positive real axis, and the UV ones, that appear on the negative real axis. They are all double poles, with the sole exception of the leading IR pole at u = 2, related to the gluon condensate, which is a simple one. This Borel transformed Adler function is a meromorphic function but it is important for the subsequent discussion to note that it is not of the Stieltjes type as can be proved by the computation of the determinantal necessary-conditions for a function to be of Stieltjes type [26] -and can be easily seen by the alternating sign of the different renormalon contributions. This already anticipates the presence of complex-conjugated poles in our approximants.
The Borel transform of the FOPT correction to the decay rate, B[δ (0) ], can be obtained inserting Eq. (22) in the expression of Eq. (3). The contour integral can be done using the one-loop logarithmic running of α s to give [18] The analytic struture of this last Borel transform is much simpler than that of B[ D](u). Now all the UV poles are simple poles, because of the zeros of sin(πu). For the same reason, the leading IR pole of B[ D](u), at u = 2, which is simple in large-β 0 , is cancelled in B[δ (0) ](u) -a result first pointed out in Ref. [38] for the Borel transformed spectral function. Our analysis with PAs benefits greatly from these cancellations since the Borel transformed function is now much less singular. 11 A simpler analytic structure can be much more easily mimicked by the PAs. We also note that the leading UV pole has a residue about ten times smaller than in the Adler function counterpart. This, together with an enhancement of the residue of the double pole at u = 3, postpones the sign alternation of the series and enlarges the range of convergence of the Taylor series. PAs constructed to the expansion of Eq. (23) benefit from these features of B[δ (0) ](u) and lead to smaller errors by virtue of Pommerenke's theorem, granting better coefficient's determination [26]. The coefficients c n,1 of the reduced Adler function can be reconstructed from the Borel transform by performing the expansion around u = 0 and using Eqs. (4) and (9). The first six coefficients of the Adler function in the large-β 0 limit, denoted D Lβ , read to be compared with their QCD counterparts given in Eq. (8). We observe that the sign alternation due to leading UV renormalon sets in at the sixth order (in the MS). These coefficients lead to the following large-β 0 FOPT expansion of δ (0) : to be compared with Eq. (14). Now the sign alternation of the coefficients is postponed and sets in only at the 9th order because of the suppression of the leading UV pole in Eq. (23). In comparison with the results in full QCD, the large-β 0 limit is a good approximation, in the case of the Adler function, only up to α 2 s . However, for δ (0) FO,Lβ this approximation is still good up to the last known term, i.e. α 4 s . The reason for this better agreement lies in the fact that these coefficients depend also on the β-function coefficients -which are largely dominated by β 1 in QCD -as well as on the integrals of Eq. (12).
An important difference between the a Q expansion of the Adler function and that of δ (0) , Eqs. (24) and (25) is that, in the former, the smallest term of the sum is reached already at the fourth order. This makes the asymptotic nature of the series very prominent. In FOPT, the series is much better behaved and each term is consistently smaller than the previous up to the 9th order. The FOPT series, therefore, behaves at intermediate orders almost as a convergent series -a fact that will be important in the remainder of the paper. (These features can be visualized in the results represented by solid lines in Figs. 1 and 2.) It can be shown analytically that in the coefficients d n of Eq. (13) there are cancelations between the Adler function coefficients and the remainder contributions [18]. These cancellations lead to the fact that FOPT is, in large-β 0 , superior to CIPT (which misses them). Aditionally, the cancellations suppress to some extent the divergent character of the series, which is postponed with respect to the Adler function.
A special feature of the large-β 0 result is the simple way in which the scheme dependence appears through the factor e (C+5/3)u in Eq. (22). It becomes clear that the residue of the renormalon poles is scheme dependent, while their position, related to the dimension of operators in the OPE, is not. Physical results must, of course, be scheme independent. However, the coupling α s is itself not physical, since it depends on conventions related to the renormalization procedure. Therefore, a perturbative expansion in α s is a scheme-dependent approximation to an (unknown) scheme-independent physical result.
In this context, the physical result is given by the Borel integral Eq. (10) in which the scheme dependence of the Borel transform is cancelled by the scheme dependence of the coupling α s , denoted by α C s , to make it explicit. Writing the Borel transform as the function b(u) is scheme independent and we have which exposes the scheme invariant combination This result allows us to write the coupling α C s ≡α in terms of the more usual MS coupling as Redefinitions of the QCD coupling of this type have been discussed in Refs. [36,37]. The result we employ here is a particular case of those of Ref. [36], with higher order β-function coefficients set to zero. Since the QCD β-function is dominated by β 1 , the qualitative behavior ofα s with C remains the same as in Ref. [36]: grosso modo, negative values correspond to largerα s whereas positive values of C are associated with smaller α s values.
Here, we will exploit the freedom of scheme choice in order to optimize the rational approximation of the Borel transformed Adler function. It is particularly important to note that the schemes with negative C values, therefore less perturbative, introduce a suppression of the IR pole residues. In these schemes, IR pole contributions are largely dominated by the first few poles, much more so than in schemes with C > 0. On the other hand, for C < 0 the UV poles are enhanced, and one expect the sign alternation of the series to show up at very low orders. For perturbative calculations, these schemes with largerα s values are essentially useless, but we will show that they are more amenable to a rational approximation as C < 0 suppresses the influence of the exponential term in Eq. (22) and results in a function with more pronounced isolated poles, easier to reproduce with a rational function than an exponential one.

MS scheme
We begin by using Padé approximants to study the perturbative expansion of the Adler function and of δ (0) in the MS scheme. Since in large-β 0 we know the exact result we are able to assess the quality of the approximation and refine the method that later we will apply to QCD. In the remainder of this section, we devise a strategy to extract as much information as possible about the series using rational approximants.
Let us first comment on the construction of Padé approximants directly to the series in α s /π, given of Eq. (24). In the case of the Adler function, the asymptotic nature of the series is very prominent since from the 5th term on asymptoticity has already set in. Forming Padé approximants to the Adler series in a Q requires many coefficients as input in order to allow for an acceptable description of higher orders. The Borel transformed Adler function, which suppresses the factorial growth of the coefficients fixes, at least partially, this behaviour and is much more amenable to the approximation by PAs. This has been noted already in Ref. [30] and we refrain from further discussing PAs constructed for the α s /π expansion of the Adler function. (PAs of this type will turn out to be useful, however, in the case of δ (0) , as discussed in Sec. 4

.2)
We turn now to Padé approximants formed to the Borel transformed Adler function. The first question that arises regards what Padé sequence(s) to use. In our conventions, the MS corresponds to C = 0, which means that the exact Borel transform diverges exponentially when u → ∞. Since we do not have a simple power-like behaviour for large u we do not attempt to fix the Padé sequence using this limit. Instead, we will investigate more than one sequence, keeping in mind that in QCD we have only the first four coefficients of the Adler function available.
Let us begin with the sequence P N +1 N (u). Since we are interested in the prediction of the behaviour of the series at higher orders, we start by studying the quality of the estimate of the first coefficient not used as input. Each P N +1 N needs 2N + 2 parameters to be constructed and we employ the first 2N + 2 coefficients of the Borel transformed Adler function. Then the Padé is reexpanded to predict the coefficient c 2N +3,1 . In the first non trivial case, that of P 2 1 (u), we need c 1,1 , c 2,1 , c 3,1 , and c 4,1 to fix the parameters and we find from which we extract the value 52.33 for c 5,1 to be compared with the exact value 787.8, given in Eq. (24) -clearly not an accurate prediction. For the FOPT series this leads to the value 165.3 for the fifth coefficient, to be compared with 900.8 in Eq. (25). The approximant P 2 1 (u) displays an effective pole at 4.271, which cannot be straightforwardly associated with any of the poles of the exact Borel transform. This is the aforementioned feature of low-order Padé approximants: they mimic the infinite tower of poles by the appearance of effective poles not present in the original function. In a Padé with several poles, only the first few, closest to the origin, can be identified with the poles of the original function and only in a hierarchical way [27]. 12 It is illustrative to consider the next Padé approximant in this sequence, P 3 2 (u), and compare the results. In this case, we find Now the first coefficient that can be forecast is c 7,1 for which we find the value 125,745, to be compared with the exact value 98,572.8. The 7th order coefficient of the FOPT series is also much better forecast: 59,456.6 to be compared with the exact coefficient 32,284.3. This approximant is able to reproduce the qualitative behaviour of the Adler function and the FOPT series at higher orders, mainly due to the fact that it exhibits a pole at u = −0.8024, which mimics the leading UV pole at u = −1, and a second pole mimicking the first IR pole, but slightly below the exact value at u = 2. The UV pole is enough to ensure the correct sign alternation in the higher order coefficients, as shown in Fig. 1 (throughout this paper we use α s (m τ ) = 0.3160, which corresponds to the most recent PDG recommendation evolved to the τ mass scale [49]). A clear feature starts to emerge already at this level. To get an appropriate approximation to the Adler function in the MS, at least two-pole approximants must be considered. This provides the balance between UV and IR renormalon contributions. A visual account of the quality of these approximants is given in Fig. 1 which displays the exact Adler function in large-β 0 and the result reconstructed from P 2 1 (u) and P 3 2 (u), whereas the results for δ (0) in FOPT and CIPT are shown in Fig. 2.
Using the same amount of information, we could consider the P N N +1 (u) sequence with the  series) so we should expect better c 5,1 and c 7,1 predictions for this sequence. We actually find c 5,1 = 1770 and c 7,1 = 102, 889 respectively, a better determination than their P N +1 N (u) counterparts. It is thus interesting to investigate systematically the convergence of the Padé sequence with respect to N . In order to quantify the quality of the prediction of coefficients not used as input we define the relative error as where c P n,1 is the coefficient extracted from the Padé approximant. If the PA sequence converges, the parameter σ rel should tend to zero as n grows. In Fig. 3(a) we show the behavior of the relative error of the estimate of the c 2N +3,1 coefficient for the sequence P N +1 N as a function of N . Being the Adler function in largeβ 0 a meromorphic function, the Pommerenke's theorem ensures the convergence for a larger set of PA sequences than the P N +1 N . To show this excellent global convergence pattern, we collect in Fig. 3(a) few of the closest-to-diagonal sequences, in particular the P N +1 N , P N N , P N N +1 , and P N N +3 . The results are qualitatively similar for all sequences: in all cases the convergence to the exact results happens fast as N grows. We observe in Fig. 3(a) that the improvement once N is increased by one unity is often of about one order of magnitude or more (notice the log scale of the plot). There are a few outliers where increasing N by one unity does not represent an improvement, or even makes the relative error larger. A close scrutiny of the particular approximants reveals why this is so. For meromorphic non-Stieltjes functions, it is a feature of the Pommerenke's theorem [26,27] that the convergence pattern can be altered by the presence of defects, transients poles almost cancelled by a close-by zero as it is clearly noticed after observing the convergence pattern of the pole positions for the P N N +1 sequence: P 1 2 (u) has poles at u = −0.56 and at u = 0.89; P 2 3 (u) at u = −0.86 together with a couple of complex-conjugated (CC) poles at u = 1.43 ± 0.46i -notice the stability of the UV pole which is driving the large-order behavior of the series (since it is always the closest to the origin)-then, P 3 4 (u) contains poles at u = −0.85, the CC poles at u = 1.40 ± 0.47i and an extraneous new pole closer to the origin at u = −0.3991. This last one, which would eventually spoil the convergence, is a new pole and it is actually canceled by a close-by zero at u = 0.3989 effectively reducing the order of this approximant to a P 2 3 . A similar feature is observed for the second and fifth elements of the P N N +3 sequence. In this last case, the cancellation among zero and pole is of the order of 10 −13 , i.e., the residue of the spurious pole is O(10 −13 ). Identifying these cancelations will be important in the obtention of the final results of this paper.
When N = 4, in all sequences the results are better by a few orders of magnitude compared to the first PA in each sequence. We remark that the convergence of some of these sequences was already studied in Ref. [30] and we confirm and extend their results.
In addition, PAs can be used as a way to resum the original series. In our case, the Borel integral can be performed with the reconstructed PAs using Eq. (10) in order to produce an estimate for the Borel sum of the series. The results for the Borel sum of the series also approach the true value when the order N is increased, as can be seen in Fig. 3(b). As expected, the approximation becomes increasingly better, but it is also worth noting that the error is significantly reduced only after a relatively large value of N is employed.
All in all, we arrive to the observation that PAs with larger values of N tend to have poles that can be identified with the leading renormalons -at least for the ones that are closer to the origin in a hierarchical way [27]-and this is enough to yield a good approximation to the series coefficients and the Borel integral. In all cases, however, realistic values of N (N = 1 for P N +1 N and P N N +1 and N = 0 for P N N +3 ) still do not provide a good approximation, as can be seen in Figs. 1, 2, 3(a) and 3(b). Since in full QCD only the first four coefficients of the Adler function are known it would be desirable to obtain a better approximation for these lower values of N . In the next section we discuss how scheme variations can be used to that end.

Scheme variations
The exact result for the Borel transformed Adler function in large-β 0 , Eq. (22), displays explicitly an important property: the residues of the renormalon poles are scheme dependent but their position is not. Here we exploit this feature in order to improve upon the results of the previous section.
One of the difficulties in using PAs with a small number of parameters is the fact that they must mimic an infinite tower of renormalon poles and their complicated interplay with a set of only a few poles. In the MS, in lower orders, the first UV and IR poles give sizeable contributions to the coefficients [18,25]. However, using schemes with negative values of C the Borel transform becomes much more dominated by the first UV pole, due to the factor of e (C+5/3)u , and the sign alternation should show up at very low orders. Because of this dominance, it becomes easier for a PA to reproduce such a series (we will elaborate on that below). Of course, negative values of C entail larger values ofα s [36], as per Eq. (29). These schemes are therefore bad for perturbative calculations but they are very useful, for example, to obtain estimates for the Borel sum of the series, since this is a scheme-independent result. Finally, the results for the coefficients with negative C values can be translated to the MS using the relation between the two couplings, and we will show that this leads to better results for the predicted higher order coefficients in MS.
For definiteness, we will work with C = −5/3 which cancels exactly the exponential in Eq. (22). In this scheme the central value of the coupling isα s (m τ ) = 0.5074. The expansion of the Adler function in the large-β 0 limit for C = −5/3 reads (N f = 3) We remark that, as expected, the sign alternation sets in much earlier, in this case it starts from the second coefficient. The exact result for the Adler function in the large-β 0 limit for C = −5/3 can be seen in Fig. 4. It becomes clear that asymptoticity sets in also earlier due to the much larger value of the expansion parameter,α s .
Let us consider again the Padé P 2 1 (u), but now in the scheme C = −5/3. In this new scheme we find Now, for the coefficient c 5,1 the result is 1423, to be compared with the exact value 1610 in Eq. (33), a relative error of only 12% -this is an improvement of about an order of magnitude with respect to the result in the MS. The result for c 6,1 , one order higher, is still very good: −18, 212, just a 12% relative error. Also, this P 2 1 (u) already displays a pole at −0.8790, close to the leading UV renormalon, and which reproduces the sign alternation of the coefficients. Finally, the Borel integral gives now 0.1416, much closer to the exact value (which is 0.1481 ± 0.0030i, shown in Fig. 3(b)). In Fig. 4, we see that the agreement with the exact Adler function is good even at higher orders. Clearly, P 2 1 (u) in this scheme provides a much more accurate prediction of the true function than its counterpart in the MS. The underlying reason is simple: without the exponential term, the position of the most prominent renormalon pole, the first UV, is much better determined in full agreement with the Pommerenke's theorem [26,27]. Furthermore, the suppression of the IR-pole residues simplify the interplay of the poles. Enlarging the sequence will only improve on the results.
Actually, for C = −5/3, the improvement for the P 3 2 (u) is only modest. We find whit a prediction of the 7th coefficient has now a relative error of 10% and pretty stable position of the pole closest to the origin: −0.8248. The Borel integral is again well predicted: 0.1394 + 0.0048i. For the next element in the sequence, P 4 3 (u), the prediction of the 9th coefficient has an error of mere 0.19%, about two orders of magnitude better than in the MS. Interestingly, for the first time, this PA has two poles close to u = −1, one at −1.1312 and a second at −0.9385 which mimics the fact that the leading UV pole is, actually, a double pole. It also has a pole at 1.788, rather close to the location of the first IR pole, which lies at u = 2. Finally, its Borel integral is also almost on top of the true one: 0.1476 ± 0.0084 i, the real part is off by only 0.3%.
Again, this excellent convergence pattern is not particular for this sequence. In Fig. 5(a) we show the relative error of the first predicted coefficient using the same four sequences of Fig. 3(a). The comparison with the results in the MS clearly shows the advantage of using less perturbative schemes. For instance, the results for P N +1 N for N = 1 are as good as those of the MS with N = 3, but require four parameters less. The results for the scheme-independent Borel sum predicted by these Padé sequences follow suit and are also significantly better than in the MS, as Fig. 5(b) shows. We should remark that for most of the Padés the results are even better if C is lowered below −5/3. 13 This means that the improvement in the results should not be directly attributed to the cancellation of the exponential term but rather to the value of the argument of that exponential. It is well-known [26] that the pole positions of diagonal and near-diagonal PA to an exponential function share three characteristics: the location corresponds with the sign of the exponential argument (positive argument, positive location in the complex plane with positive real part, and vice-versa), as soon as the PA order increases, the poles move further away from the origin, and finally for a given PA, the position of its poles is located within an area defined by its "order star" (cf. Ref. [26] for further definitions). These implies that as soon as the scheme is less perturbative, the PA poles responsible for the exponential term accumulate in the UV 13 For P 2 1 , e.g., C = −2 provides an excellent description of the Adler function up to order 9 with only 4 parameters. However, other negative values of C are somewhat arbitrary and the results must be checked for stability. Table 1: Coefficients of the Adler function in the large-β 0 approximation in the MS scheme. The first row gives the exact values. Darker rows show the results obtained from P 2 1 (u), P 1 2 (u) and P 3 2 (u) constructed in a scheme with C = −5/3 and later evolved to MS using Eq. (36). For comparison, we also display the results obtained through the use of these PAs constructed directly in the MS (second, fourth and sixth rows). region further away as the PA order increases, isolating the rest of the poles coming from the non-exponential term, in particular the dominant UV double pole. On the contrary, for more perturbative schemes, PA poles will accumulate in the IR region and shadow the UV pole. From this perspective it is then natural to expect better convergence with respect to series coefficients and the Borel integral for the less perturbative schemes. It is apparent as well that the C-scheme analysis sheds light on the analytical structure of the Adler function in a clear way.
Let us now translate the results obtained in the scheme with C = −5/3 to the MS. The expansion of Eq. (29) gives the following perturbative relation between the different schemesâ where on the r.h.s we have a Q in the MS and on the l.h.s.â Q ≡ a C Q . Applying this perturbative relation to an expansion such as Eq. (33) one is able to reconstruct the MS result from its counterpart in a scheme with a different C value. Performing this procedure for the results of P 2 1 (u), P 1 2 (u) and P 3 2 (u) constructed at C = −5/3 leads to much better predictions of the higher order coefficients, as Tab. 1 confirms. The predictions are far superior to those obtained when the PA is constructed directly in the MS.

Partial Padé Approximants
We have seen that Padé sequences appear to converge rather fast to the Borel transform of the Adler function, as expected following Pommerenke's convergence theorem. In realistic applications, however, where one has only the first four coefficients of the series it may be necessary to employ a method to accelerate the convergence such as the scheme variation we discussed previously. In this section we will show that using knowledge about the position of the renormalon singularities significantly accelerates the convergence and yields excellent results even for the lowest approximants. We then consider now partial Padé approximants (PPAs) as defined in Eq. (17).
Most of the effort done by the PAs in the previous sections was to locate the double pole at u = −1, and this had a cost of two series coefficients. It is then to expect that including the double pole in advanced should allow the approximants to unfold the subdominant renormalon poles. For the MS scheme, a P 2 1,2 (u) (imposing the double pole at u = −1), leads to a prediction of c 5,1 which is 60% off, while c 5,1 from P 1 2,2 (u) is 55% off. This represents a significant improvement with respect to the PAs. Results are much better for C = −5/3 where the predictions reach precision better than 3% for both. In both schemes, pole predictions suggest the subdominant renormalons to be located in the IR region. Using more series coefficients one identifies the u = 2 as the first IR renormalon.
As we have seen, the Borel transform of the Adler function has indeed an IR single pole at u = 2 and the next step towards improving precision would be to consider a P M N,3 (u) including such pole. In this case, and for the MS scheme, results improve since with a P 2 1,3 (u) one gets 30% of relative error on the c 5,1 determination, whereas 20% is reached with the P 1 2,3 (u). For the C = −5/3 scheme, the result is greatly improved since relative errors reach up to 1% and 5% respectively. In this scheme, even a Padé-type approximant with a fixed denominator at (u + 1) 2 (u − 2), a P M 0,3 (u), yields good results. In this sort of sequence, one can even study the convergence by looking systematically at the c 5,1 prediction for growing M . We find 35%, 7%, and 4% relative error for M = 1, 2, 3 respectively.
Unraveling the sub-subdominant renormalons is now more an art than a science since it is difficult to decide whether the second UV or the second IR should be considered. The decision comes from exploring systematically the residue of the predicted poles from previous approximants. We observe that they predict an IR (double) pole at around u = 3 with large residue. As such, it contributes largely to the series coefficients. The next step will be then to consider P M N,5 (u) including this double pole where the polynomial T 5 (u) of Eq. (17) is constructed such as to reproduce the first five poles of the Borel transformed Adler function We can then study near diagonal sequences akin to the ones we discussed in the previous section, e.g., P N +1 N,5 (u), P N N +1,5 (u), or P N N,5 (u). It is expected that these sequences should yield much better results since the perturbative series is dominated at intermediate and large orders by the poles closest to the origin.
We start with results for the sequence P N +1 N,5 (u) constructing the PPAs in the MS scheme. The relative error of the the coefficient c 5,1 obtained from P 2 1,5 (u) is 12 times smaller than the one from P 2 1 (u). The four coefficients used to fix the parameters of P 2 1,5 (u) provide enough information to obtain a good approximation even after asymptoticity has set in -the 10th coefficient is predicted with a relative error of only 20%. This reproduction of the series is achieved by generating a pole at 1.750. Results from P 3 2,5 (u) are so similar to the original function that they cannot be distinguished by eye. The approximant P 3 2,5 has a pair of complex poles at 2.854 ± 0.992 i which, again, appear due to the fact that the meromorphic function being approximated is not of the Stieltjes type.
The use of the scheme with C = −5/3 accelerates the convergence and now the results from P 2 1,5 (u) cannot be visually distinguished from the exact ones (we therefore refrain from showing them explicitly). Relative errors for the c 5,1 are of the order of 0.7% while for the tenth coefficient it is only about 2%. This excellent convergence is not unique for the P N +1 N,5 (u) sequence since for the subdiagonal one, P N N +1,5 (u), results are basically indistinguishable. This is so that even a Padé-type sequence P N 0,5 (u) converges nicely and the systematic prediction of the series coefficients with increasing power N yields a way to ascribe a theoretical or systematical error. P N 0,5 (u) with N = 2, 3 predict 797.8 and 792.3 respectively. Taking the difference among them as a way to estimate the error results into a prediction 792.3 ± 5.5 which nicely agrees with the true coefficient 787.8. Other results obtained from these approximants for FOPT and CIPT, as well as the Borel sum are as impressive as those for the Adler function and we refrain from quoting them explicitly.
At this point, two comments are in order. If, instead of considering double poles for the second IR pole and the leading UV pole in Eq. (37), we use simple poles the results are worsened. The coefficient c 5,1 obtained from P 2 1,3 (u) using T 3 (u) = (u + 1)(u − 2)(u − 3), changes from 847.9 to 594.1, to be compared with the exact value 787.8. The reason for this worsening is simple: fixed poles with wrong exponent forces the approximant to spend series coefficients to determine the correct exponents, with an associated loss of prediction power. This fact is nicely illustrated by P 2 1,3 (u) in the MS, which has (u + 1)(u − 2)(u − 3)(u − 3.06) as denominator, showing clearly that the Padé must reproduce the double-pole nature of the second IR renormalon. A scheme variation helps again in this issue since P 2 1,3 (u) for C = −5/3 has as a denominator (u + 1.04)(u + 1)(u − 2)(u − 3), with the extra pole very close to the first (and dominant) UV pole, emulating its double pole nature.
This shows that simply fixing poles at the correct location but with the wrong multiplicity does not represent necessarily an improvement over the situation where the poles are left free. An intermediate case where only the pole at u = 3 has an incorrect exponent does yield improved results compared to the ordinary Padé approximants. This is in agreement with the findings of Ref. [25] where it was shown that imposing the correct structure of the first two leading poles is sufficient to achieve a good description of the large-β 0 Adler function. Actually, in the language of Padé approximants employed here, both the "reference model" and the "alternative model" of Ref. [25] can be thought of as a P 6 0,5 (u), i.e., with full denominator fixed in advanced. The main observation that can be drawn here is that having information about the first two or three renormalon poles is largely sufficient to achieve an excellent reconstruction of the series even with only four coefficients available (this conclusion is in agreement with Ref. [25]). Notice that even though we use large denominators, we still allow the approximants to have free poles. This helps them to improve on the convergence as free poles accommodate in an effective way the rest of the renormalon contributions.

D-log Padé approximants
While information on pole positions yields a clear improvement in the acceleration of convergence, the precise knowledge on pole positions can be, in realistic situations, scarce. Moreover, knowing the multiplicity of poles is also important. Additionally, some functions present branch point singularities, which may render their approximation by Padés less effective. As discussed in the Introduction, in such situations, other strategies may be pursuit. In particular, the D-log Padé approximants of Eqs. (18) and (19) and their extension in Eqs. (20) and (21) can result optimal.
The original philosophy of the simplest D-log Padé approximant is to perform a PA not to f (z) but to F (z) = d dz log[f (z)]. Assuming the original function has a singularity at z = µ (a pole or a branch cut) the evaluation of the outcome at z = µ provides a way to extract, from the series coefficients, the multiplicity γ of the singularity at µ. Here, if γ is an integer or not becomes irrelevant which makes the D-log Padés particularly interesting to approximate functions with branch cuts. Of course, by unfolding the procedure one obtains the non-rational approximant Dlog M N (z) of Eq. (19) to the function f (z) which, after reexpansion, returns the series coefficients.
The simplest D-log approximant for the Borel transform of the Adler function, Dlog 1 0 (u), requires c 2,1 and c 3,1 and reads: After reexpanding, Dlog 1 0 (u) predicts both c 4,1 and higher coefficients and it does it rather accurately, taking into account the little information that was used. Still, no sign-alternation is observed. Going up to Dlog 2 0 (u), the prediction improves a lot and not only the sign-alternation is now reproduced, but also the relative error for the c 5,1 is around 40% (better for the next c 6,1 which is just 10% off). At the same order, a Dlog 1 1 (u) yields even better results, being only 16% off for c 5,1 , with the correct sign-alternation and a good prediction for c 6,1 as well, as can be seen in Fig. 6. Using this simplest scenario, with minimum information, the Dlog 1 1 (u) is the approximant that better predicts c 5,1 among the approximants presented in this work so far.
Information on known singularities can be straightforwardly used with the partial D-log Padés, Dlog M N (z; µ), of Eq. (20). The singularity closest to the origin in the Borel transform Adler function is the UV pole at u = −1. We then consider a PA to (1 + u) d du log[f (u)] instead, and construct the respective Dlog approximant to f (u). Even with the simplest Dlog 1 0 (u; −1) the sign-alternation of the series coefficient is recovered, a clear indication of an improvement on the series' reconstruction. The unfolded approximant reads which has a branch point at u = −1 but with a good prediction on the actual multiplicity of the first UV renormalon (which is a double pole in large-β 0 ). For the next order, two choices can be considered, the diagonal Dlog 1 1 (u; −1) and the Dlog 2 0 (u; −1). As it is clear from the definition in Eq. (20), F (z) is to a good approximation, a rational function. Then, the diagonal Dlog 1 1 (u; −1) is the optimal choice, specially if one is heading towards determining the multiplicity γ. In this case, the unfolded approximant reads and, upon expansion, predicts the series coefficients for the Borel transform with unprecedented precision for coefficients up to c 10,1 , with relative errors amounting to  mere 0.8%, 6%, 2%, 3%, 2%, 1%, 1% for the coefficients c 4,1 to c 10,1 in that order. The excellent results for this approximant can also be seen in Fig. 6. A key point in this extraordinary success is the excellent reproduction with very little information of the multiplicity of the first UV pole, as can be seen by the fact that γ = 1.96 for the µ = −1 in Eq. (40), while the other branch cut effectively emulates the tower of IR poles in the Borel transform. The result is so good that including c 4,1 to construct the Dlog 1 2 (u; −1) yields only a marginal improvement (predicting c 5,1 with a 5% of relative error, for example).
It is then clear the D-log Padés are able to go much beyond ordinary PAs in reproducing the main analytical features of the Borel transformed Adler function in large-β 0 . The success of this approximants can be understood by examining the function This shows that in the D-log Padés the function being approximated is strictly a rational function, with simple poles. The exponential function present in the Borel transform disappears and the approximants do not have to reproduce it. In a sense, the D-logs strategy realizes a situation similar to that of a scheme with C = −5/3 which cancels the exponential of the Borel transform but with the additional benefit that F (z) has only simple poles. Their success is, therefore, not so surprising. We can also expect that sequences of PAs that depart significantly from the diagonal will have a slower convergence to F (u) and will, accordingly, lead to a worse approximation of B[ D](u).
Finally, the inclusion of information about the position of more than one pole can be done in the context of D-log Padés by using a partial Padé approximant to F (z), with as many poles fixed as one desires. Using in the denominator of such a PPA T 7 (u) = (2 − u)(3 − u) 2 (1 + u) 2 (3 + u) 2 , and with c 2,1 , c 3,1 and c 4,1 as input one predicts the series's coefficients with precision better than 0.1%.
In conclusion, the D-log Padés are shown to be an effective way to improve the convergence of the approximation and gain information about higher orders. In some sense, they are better than the scheme transformations since they do not rely on a specific value of C. We turn now to the use of the techniques described here for the function δ

Padé approximants to δ (0)
We learn from the application of Padé approximants to the Borel transform of the Adler function that one of the difficulties is the disentanglement of the leading renormalons. The success of the use of scheme variations lies partially in this fact, since the method allows for an enhancement of the leading UV pole with respect to the leading IR pole at u = 2. The Borel transform of δ (0) , Eq. (23), on the other hand, does not have the pole at u = 2. The leading UV renormalon is therefore more isolated from the IR ones. It can be expected that the use of Padé approximants directly to this Borel transform should yield better and more stable results than in the case of the Adler function. In this section we exploit this route. Since general properties of the Padé approximants were discussed in the previous section, we will focus here on the practical problem of forecasting the unknown coefficients given that only the first four are known exactlyin order to simulate the case of real QCD.
Let us note that a rational approximant to δ (0) contains enough information to allow for a full reconstruction of the Adler function. The coefficients c n,1 can easily be read off from the FOPT expansion of δ (0) as Additionally, in large-β 0 , Eq. (23) can be used to extract the Borel transformed Adler function from that of δ (0) . We start here by applying Padé approximants directly to the series in α s /π, given by Eq. (25). As we have observed, the FOPT series in large-β 0 is rather well behaved and, at intermediate orders, its asymptotic nature is not visible yet. This is mapped into a simpler analytic structure in the Borel plane. It is therefore likely that in this case the approximation of the series by Padé approximants in a Q will lead to a better description than in the case of the Adler function. In Tab. 2, we display the Adler function coefficients that are obtained from the application of Padé approximants directly to the FOPT expansion of δ (0) . To simulate the situation of real QCD, in the first two rows, we attempt to forecast c 4,1 , while in the last three we use c 4,1 as input. The main Table 2: Adler function coefficients extracted from PAs P N M (a Q ) to the FOPT α s /π expansion of δ (0) in the large-β 0 limit. observation is that the results for the coefficients that are not used as input are quite good and stable. The sign alternation of the series is correctly predicted by all the Padés and good consistency between the results using three and four coefficients as input is achieved. The results are therefore quite remarkable. Let us examine one of the PAs in more detail. For P 2 1 , which uses only the first three coefficients as input, we find First, we see that the series obtained from such a Padé is convergent (since a τ ≈ 0.1) and does not reproduce the factorial growth of the coefficients. The pole that appears around a Q ≈ 0.18 is, however, rather stable -it does not seem to be transient in nature and is present in all the approximants to δ FO,Lβ . It may be worth noting that the pole is in the IR and corresponds to a scale of ∼ 650 MeV. It may, therefore, be related to IR physics but we refrain from further speculation regarding the nature of this pole. An estimate for the sum of the series can be obtained simply using a τ = 0.316/π in Eq. (43) to give P 2 1 (a τ ) = 0.2198. This result is also in very good agreement with the value obtained from the Borel integral of the exact large-β 0 result, which gives 0.2208 ± 0.0039. Even better results can be obtained from the PAs that use four coefficients as input, as shown in the last three rows of Tab. 2.
We now turn to PAs to the Borel transformed δ (0) . The results are improved when compared with PA to B[ D], although the improvement is far from spectacular. For c 5,1 we find the values 263.9, 603.6, and 1024 from P 1 2 , P 0 3 , and P 2 1 respectively, to be compared with 52.33, 560.9, and 1770.1 from the same PAs to B[ D]. Clearly, one must resort to the other methods discussed in the previous section in order to accelerate the convergence. Again, D-Log Padés turn out to be the optimal way to improve the convergence while remaining completely model independent. Their success can be understood from the a study of the function F (u) = d du log B[δ (0) ](u) , as is in the case of the Adler function. Here we find, retaining only the first term in the sum of Eq. (22),  The leading analytic structure of F (u) is now even simpler than for the Adler function. The poles at u = 0, u = 1, and u = 2 are exactly cancelled by the presence of π cot(πu) leaving only a leading UV pole at u = −1, an IR pole at u = 3 and a subleading IR pole at u = 4. 14 It is therefore expected that the D-log Padés should perform even better in the present case. We present results for the D-Log Padés applied to B[δ (0) ] in Tab. 3. These results also represent an improvement when compared to those of Sec. 4.1.4. The predictions for c 5,1 have a much smaller relative error, a factor of 2 to 5 times smaller than those obtained from the PAs to the Borel transformed Adler function. The sign alternation is well reproduced by the Padés with four coefficients used as input and their Borel integral provide excellent estimates for the true value of the series (we find, e.g., 0.2199 from DLog 1 1 (u)). However, one must note that the results from the D-Log Padés applied to B[δ (0) ] are less good than those of Tab. 2. For example, the coefficient c 4,1 is wrong by a factor of about two while in Tab. 2 it is only a few percent off. Nevertheless, the description of the Borel transformed δ (0) by D-Log Padés has the advantage that the factorial growth of the coefficients is reproduced and an asymptotic series is obtained, in line with the exact result.
We have checked that the results discussed in this section can be further improved by using information about the renormalons. For example, imposing the existence of the leading UV pole at u = −1 through the use of Dlog 1 1 (u; −1) leads to an almost exact reproduction of the series. We prefer, however, to remain as model independent as possible and we chose to focus here on the results obtained from the most model independent methods (PAs and D-log Padés). By using δ (0) and its Borel transformed, these model-independent methods lead to results as good as those obtained from the Adler function imposing information on the renormalons.
We close this section with a visual account of the results discussed here. In Fig. 7(a), we display the δ (0) FOPT and CIPT expansions obtained from the approximant P 2 2 (a Q ), for which we show the coefficients up to c 9,1 in the 5th row of Tab. 2. The main observation is that, up to the 9th order, the result is strikingly similar to the exact ones (see, e.g., the black lines in Figs. 6(b) and 6(c)). However, the FOPT series thus obtained is convergent, and after the 8th order its result stabilizes around the sum of the series, which cannot happen in the exact results. This does not prevent the description  In particular, the fact that FOPT is the best approximation in large-β 0 is well reproduced. In Fig. 7(b), we display results for Dlog 1 1 (u), for which the coefficients are given in the 5th row of Tab. 3. Again, visually, the results are almost identical to the exact ones. Now, both series are divergent, as is the case in the exact results, and this feature shows up after the 9th order, although the divergence, here, is more pronounced than in the exact large-β 0 case. In both cases, however, the approximants provide a good estimate for the first few unknown coefficients and give an excellent account of the series even though no information about the renormalons was used, in contrast with some of the results of Fig. 6(b) and 6(c)).

Summary and discussion
Here we summarize the main conclusions that can be drawn from the application of Padé theory to the results in the large-β 0 limit of QCD.
The application of the Padé approximants to the MS Borel transformed Adler function in the large-β 0 limit displays convergence. With six or seven coefficients one is able to reproduce very well all the essential aspects of the original series: its higher order coefficients, its Borel sum, and even the position of the dominant renormalon poles. There are, however, outliers and it can happen that the next Padé in a given sequence does not make the results better. The existence of these outliers is well understood in the theory of Padés and we have been able to show for specific examples why this happens. All in all, convergence is always observed once an even larger number of parameters is added. Results using only four coefficients of the series to construct the Padés -which corresponds to the number of available coefficients in QCD -are, however, far from spectacular. In the absence of more coefficients, one must exploit strategies to improve the approximation of the original function.
In the case of the large-β 0 , we have shown that using less perturbative schemes, which with our conventions corresponds to C < 0, one is able to construct better approximants with the same number of parameters. This improvement is due to the fact that the Borel transform in these schemes becomes largely dominated by the leading UV pole: one observes the sign alternation earlier and the Padés can easily reproduce the main features of the analytical structure of the series. The specific choice C = −5/3 has the additional advantage of removing the term e C+5/3 from the Borel transform, which makes it a strict rational function. As a by-product, we are able to devise a strategy to reveal the effects of the dominant UV pole. Constructing the series for C < 0 the sign-alternation of the coefficients sets in much earlier, already from the second coefficient.
Next, we have seen that the use of partial Padé approximants, where the available information about the renormalon poles can be exploited, leads to significant improvements. However, it is important to use this information with care, since imposing the existence of a pole but with wrong multiplicity forces the Padé to "spend" series coefficients fixing the analytic structure of the given pole. If the information of the leading UV pole is used correctly, this leads to a significant improvement on the results. Finally, we find that imposing the structure of the first two leading renormalons leads to an almost perfect reproduction of the Adler function. This is in line with the renormalon models of Ref. [25].
We then investigated the use of D-log Padés. These are an interesting alternative having in mind applications to QCD, since they do not require that the function be meromorphic. Their application in large-β 0 was very successful and we can safely conclude that, given the limited information available, they are the best way to accelerate convergence of the procedure. We highlight the use of partial D-log Padés where only the existence of first UV pole is used as input. The results of such partial D-logs are truly impressive and lead to an almost exact reproduction of the Adler function in large-β 0 with an almost model-independent method. We were able to explain their success in terms of the analytic structure of the Borel transformed Adler function.
We then turned to approximants constructed to δ (0) . First, we have shown that PAs to the FOPT series in α s /π, contrary to the case of the Adler function, do lead to a very good reconstruction of the series at intermediate orders. In the case of δ (0) , since the FOPT series is rather well behaved and regular up to the 10th order (with terms that are consistently smaller than their predecessor) the approximation obtained from these Padés is very reasonable. We have then shown that good results can also be obtained from D-log Padé approximants constructed to B[δ (0) ]. They are completely model independent and have the advantage that the factorial growth of the coefficients is automatically implemented. Both methods lead to good predictions for the true value of the series and are sufficient to conclude that FOPT is the best prescription in large-β 0 . Again, the success of the method can be explained by the simpler analytic structure of the Borel transformed δ (0) since it does not have the pole at u = 2 and all other poles become simple (with the exception of the ones at u = 3 and u = 4).
The systematic study performed in the large-β 0 limit leads to strategies for impressive determinations of the higher order coefficients and the Borel sum of the Adler function and δ (0) series. We were able to find methods to unravel dominant and subdominant poles, as well as to reorganize the available information in order to optimize the approximation by PAs and its variants. With these strategies at hand we can now perform a similar analysis in full QCD and present our final results.

The QCD case
In this section we will apply the techniques developed and tested in large-β 0 to the real case of QCD. Let us first remind what is known in this case. In perturbation theory, the Adler function expansion is known to five loops, hence to order α 4 s [21]. We would like to obtain predictions for the coefficients c 5,1 and higher. The first four coefficients of the Adler function in QCD, displayed in Eq. (8), already show a significant deviation from the large-β 0 results, although the coefficients of the δ (0) FOPT expansion, Eq. (14), are closer to the their large-β 0 counterparts. The differences between full QCD and the large-β 0 limit also show up in the Borel transform. In QCD, we know that the Borel transform is no longer a meromorfic function. Although the renormalon singularities remain at the same position, anomalous dimensions of the operators and higher-order β-function coefficients now change their nature from poles to branch points. What is more, at every branch point there are confluent singularities [18,24]. Experience shows that Padé approximants can be safely employed to approximate functions with branch cuts, but we no longer have convergence theorems to exploit, without further knowledge on the properties of these branch cuts.
Let us start with the simplest approximants: PAs constructed directly to the α s /π expansion of the Adler function. We have discussed that in large-β 0 this was not the optimal strategy. Given that we have now only the first four coefficients, we start by building the PAs that would "postdict" the last known term of the series, c 4,1 . From the approximants P 2 1 (a Q ) and P 1 2 (a Q ) we obtain the five-loop coefficients with 51% and 67% relative error, respectively. The coefficient c 5,1 is predicted to have quite low values 96.2 and 50.5. Our experience from large-β 0 already signals that this strategy is not optimal. If we construct approximants that now use c 4,1 as input we extract c 5,1 coefficients that can differ from the previously obtained by a factor of 9, indicating that the procedure is very unstable.
We then turn to the approximation of the Borel transformed Adler function, which proved to be more stable in large-β 0 . Again, we first try to obtain a "postdiction" of c 4,1 , using P 1 1 (u) and P 0 2 (u). The value of c 4,1 thus obtained has a relative error of about 26%, which is an improvement with respect to the previous method, and leads to predictions of c 5,1 ∼ 280. However, when we construct approximants that include the true value of c 4,1 as input, P 2 1 , P 1 2 , P 0 3 , there is no sign of stability. The predictions for c 5,1 and higher order coefficients change substantially which, from our experience in large-β 0 , signals that the approximants are not optimal. That only four coefficients is not enough for the usual Padé approximants to perform a stable prediction of the higher orders comes as no surprise, since this is what happens in large-β 0 even though the analytical structure of the Borel transform is simpler there. We must again resort to methods for the acceleration of the convergence of the procedure.
We start by investigating other renormalization schemes. Scheme variations in full QCD can also be performed using a single parameter C, through the generalization of Ref. [36] (to which we refer for further details). Based on the lessons from large-β 0 we rewrite the Adler function series in schemes with negative C, hence with larger values of the coupling. As an example, let us take again the value C = −5/3. The first four coefficients of the Adler function are now There are marked differences in the coefficients when compared with Eq. (33). In contrast to the large-β 0 case, the series in QCD with negative C = −5/3 does not display a systematic sign alternation. It seems, therefore, that the UV singularity is less dominant than in large-β 0 and the pattern of signs indicate a more complicated interplay between the poles. Possibly, the IR poles give a larger contribution to the QCD series relatively to the large-β 0 case. This is in line with the results of the models of Refs. [18,25]. In these models, the IR poles give larger contributions at intermediate orders and the systematic sign alternation sets in only at the 11th order. Since we are not able to unequivocally suppress the contribution of the IR poles, the strategy of using scheme variations to optimize the use of Padé approximants does not turn out optimal in QCD (although it corroborates, in part, the results of Refs. [18,25]).
We resort then to the use of D-log Padés to obtain approximants to B[ D](u). Their use in full QCD is also well motivated, since they are designed for functions that have branch cuts. The use of D-log Padés to postdict c 4,1 leads to values with relative error of 54% [Dlog 1 0 (u)] and 21% [Dlog 0 1 (u)] which again signals that the procedure is not optimal. The use of an additional coefficient as input does not change this picture significantly, since it leads to unstable results for higher-order coefficients, which is understandable again due to the presence of confluent singularities.
Based on our experience from the large-β 0 limit, one is the led to conclude that model-independent approximants constructed to the Borel transformed Adler function are not robust enough in QCD with only the first four coefficients available. In large-β 0 the knowledge about renormalon singularities could be used to optimize the predictions. Here, however, we face additional difficulties. First, the available knowledge is more scarce. Only the first few renormalons (the leading UV, and the two leading IR) have had their branch cut structure investigated in detail [18,55]. Second, the fact that now the renormalons become confluent singularities renders much more difficult the use of the available knowledge to devise optimized approximants. And, finally, it would be desirable to remain as model independent as possible. We therefore turn directly to the most successful model-independent strategy devised in large-β 0 : the use of the FOPT series for δ (0) .
In large-β 0 approximants constructed to δ FO and B[δ (0) ] resulted optimal. The perturbative series for δ (0) FO in large-β 0 and in QCD have rather similar coefficients. This means that the regularity of the series is preserved in QCD, which suggests that it can be well approximated by Padé approximants constructed directly to the series in α s /π. Furthermore, although Eq. (23) is strictly valid only in large-β 0 , because it relies on the one-loop running of the coupling, modifications to this result would be solely due to higher-orders in the α s evolution. We can therefore expect that a suppression of the leading IR singularity at u = 2, as well as a suppression of all the other renormalons except for the ones at u = 3 and u = 4, would survive in full QCD and render this Borel transform more amenable to approximation by rational functions.
We start with Padé approximants applied to the α s /π expansion of δ (0) FO . As before, we begin with a post-diction of c 4,1 using P 1 2 (a Q ) and P 2 1 (a Q ). The results for six higher-order coefficients obtained from these approximants are shown in Tab. 4. The relative error from the central values of c 4,1 is now ∼ 13%. This is quite remarkable when put into perspective since before the true value of c 4,1 was computed, a forecast of this coefficient using other methods and including additional information (taking into account known terms of order α 4 s N 3 f and α 4 s N 2 f ) yielded c 4,1 = 27 ± 16 [52][53][54], a central Inspecting the Padé approximants of the first two rows of Tab. 4 they reveal a pole around a Q = 0.1973, of similar nature to the one found in large-β 0 . Additionally, P 1 2 has a pole far from the origin at a Q = 7.25. This makes their expansion around a Q = 0 similar and their predictions turn out to be almost degenerate. Therefore, a stronger test for stability comes with the use of c 4,1 as input, to obtain c 5,1 and higher. One could construct, for example, the approximant P 2 2 . However, this approximant has a defect, in the sense discussed in Sec. 4.1.1, a pole and a zero that cancel almost exactly, effectively reducing the order of the approximant. Although its results are not completely inconsistent (e.g., c 5,1 turns out to be 242) we have shown that it is wise to avoid approximants of this type and we will discard P 2 2 . We turn then to the results obtained for P 3 1 and P 1 3 which are also shown in Tab. 4. Now, the forecasts of c 5,1 are 304.7 and 301.3 respectively. We can note a striking stability of the results for c 5,1 and c 6,1 ; even c 7,1 and c 8,1 are remarkably similar in all of the four approximants considered. The use of the PAs to sum the asymptotic series also leads to consistent result in all cases, as can be seen in the last column of Tab. 4. Our experience from large-β 0 indicates that this stability and the good prediction of c 4,1 strongly corroborate the robustness of the results. We have checked that the use of D-log Padé approximants is also very successful. We are, therefore, in a position to conclude that using PAs to δ (0) FO in QCD is at least as stable as in large-β 0 . We should then investigate the approximants constructed to its Borel transformed.
As in the previous section, the quality of the forecast of c 4,1 as well as stability arguments lead us to conclude that the D-log Padés are the optimal approximants to B[δ (0) ](u). Higher-order coefficients obtained from D-log Padés constructed to B[δ (0) ](u) in QCD are shown in Tab. 5. Now, the postdiction of the last known coefficient, c 4,1 , has a relative error of only about ∼ 6%, about half of what was obtained with Padés to the series in α s . Also, the stability of the results when using the exact value of c 4,1 as input is quite remarkable. The results for c 5,1 and c 6,1 are rather stable not only among the D-log Padés of Tab. 5 but also when compared with the results of Tab. 4. The approximant, Dlog 1 1 , not shown in Tab. 5, leads to slightly lower values for the coefficients (e.g., c 5,1 = 237), but even these apparent instability can again be understood in terms of a partial cancelation between a pole and a zero present in the P 1 1 used for its construction. We, therefore, consistently discard this approximant. It is also interesting to observe that all the D-log Padés of Tab. 5 predict that the sign alternation of the series starts at order 11. This is in agreement with the speculation we advanced, based on scheme variations, that the UV singularity in QCD is less prominent which should postpone the sign alternation with respect to large-β 0 where it sets in 0.2019 from c 6,1 on. Finally, the Borel sum of the series obtained from these D-log Padés is also very consistent (last column of Tab. 5).
The picture that emerges from the results of this section is that the use of δ FO and its Borel transform lead to the best model-independent approximants in QCD -as is the case in large-β 0 . The quality of the predictions of c 4,1 as well as the stability of the results among different approximants signal that we have managed to obtain a robust description of δ (0) and of the Adler function at higher orders. In the next section, we extract our final results and perform error estimates.

Final results and error estimates
In producing our final results, we will try to remain as conservative as possible. We extract our final estimates for the higher-order coefficients from the eight approximants of Tabs. 4 and 5 including, thus, those that have only three coefficients as input parameters. By doing so, we take advantage of Padés that belong to different sequences and can obtain a more reliable error estimate for our final coefficients. Since one of the most striking features of these results is their stability, we will not try to favour one approximant over another, even though one could try to inspect their analytic structure in detail with this goal in mind. Our final estimate of the coefficients and of the true value of δ (0) is obtained as the average of the eight results of Tabs. 4 and 5. To these averages we add an error equal to the maximum spread found between the coefficients obtained from two different approximants. This error should certainly not be interpreted in a statistical sense; it gives an interval where the value of the coefficient is expected to lie.
This procedure applied to the six-loop coefficient, c 5,1 , leads to which largely covers all the results obtained from our optimal approximants. Therefore, in a sense, our error estimate could even be considered as too conservative -even if much smaller than other estimates in the literature. For example, in Ref. [18] the estimate c 5,1 = 283 ± 142 is used, while in Ref. [52] one finds c 5,1 = 145 ± 100 (using only partial information about the five-loop coefficient). The value obtained from the principle of Fastest Apparent Convergence (FAC) in Ref. [21] is c 5,1 = 275, remarkably close to our final central value, given in Eq. (46). On the basis of what we know about the series coefficients, it seems extremely unlikely that the six-loop coefficient would not be within these bounds. c 9,1 c 10,1 c 11,1 c 12,1 (1.6 ± 1.4) × 10 6 (6.6 ± 3.2) × 10 7 (−5 ± 57) × 10 7 (2.1 ± 1.5) × 10 10 Results for coefficients c 6,1 and higher are given in Tab. 6. The final values for the Adler function coefficients are extracted with reasonable errors up to c 10,1 . One should remark that due to the α s suppression at these higher orders, an error that seems large in the coefficient does not translate into a very large uncertainty in the sum of the series. The situation changes only for c 11,1 . For this coefficient, six of the PAs of Tabs. 4 and 5 predict that the sign alternation sets in. However, two of the approximants do not, which leads to the huge error. Therefore, we find some indication that the sign alternation of the Adler function coefficients sets in at the eleventh order (in agreement with [18]). This agrees with our expectation that the UV singularity in QCD should be less dominant than in large-β 0 . This instability signals that our results cease to be fully reliable at the 11th order.
We apply the same procedure described above to obtain an estimate for the true value of the δ (0) using the results in the last columns of Tabs. 4 and 5. Using α s (m 2 τ ) = 0.316 ± 0.010 [49], this leads to δ (0) = 0.2050 ± 0.0067 ± 0.0130, (47) where the first error is the estimate from the spread of the PAs and the second error is due to the uncertainty in α s . This result can be compared with the Borel model of Ref. [18] which gives (with α s (m 2 τ ) = 0.316 ± 0.010) δ (0) BM = 0.2047 ± 0.0029 ± 0.0130 (from Ref. [18]), (48) and with the estimate based on the optimal C-scheme δ (0) C = 0.2047 ± 0.0034 ± 0.0133 (from Ref. [36]).
In Eq. (49) the first uncertainty is due to the truncation of the asymptotic series, while in Eq. (48) it arises from variations of the input value of the coefficient c 5,1 within their assumptions. In all cases the second uncertainty stems from α s . The striking similarity of these three results is very remarkable, since they are obtained from independent methods. In our case, the known series coefficients are the only input used to construct PAs and predict the behaviour of the series at higher orders. In the case of Ref. [18], the Adler function is modelled using the available knowledge about the leading renormalon singularities, while the result from the C-scheme is based on a renormalization scheme choice which uses optimally the available information in the spirit of an asymptotic series. The latter can also be considered as model independent since no assumption about higher orders or the renormalon structure of the Adler function is made. More is obtained, where the first error is again from higher orders and the second from α s . This result is also in very good agreement with ours. In Ref. [57], through the use C-scheme variations together with an optimal conformal mapping (CM) that relies on the location of the leading renormalons the value δ (0) CM = 0.2018 ± 0.0211 ± 0.0123 (from Ref. [57]) is found, in which the first uncertainty is from the truncation of the asymptotic series and the second from the strong coupling. This result is also nicely compatible with the others quoted in this section. With the coefficients of Tab. 6 we are finally in a position to plot, in Fig. 8, the perturbative expansions of δ (0) and compare them with the true value of the series obtained from Eq. (47). The bands in the perturbative expansions of Fig. 8 represent the uncertainty from the series coefficients, given in Tab. 6, while the band in the Borel sum of the series is the first error Eq. (47). The uncertainties we are able to obtain from the optimal Padé approximants allow us to conclude that FOPT is the favored renormalization-scale setting procedure in the case of full QCD. The CIPT series, even though it looks more stable around the fourth order, does not approach well the central value of the sum of the series. The recommendation that FOPT is the best procedure in QCD was advocated in Ref. [18] in the renormalon-model context. Here it is reobtained in a model-independent way.

Conclusions
In this work we have performed a systematic study of the Adler function and of the perturbative QCD correction to the hadronic τ decay width, δ (0) , using Padé approximants and its variants. We have used the large-β 0 limit of QCD, where the series are known to all orders in α s , as a laboratory to test our strategy. We were able to show that the method always works provided a large enough number of coefficients is known. Since in QCD only the first four have been calculated exactly, we have devised strategies with the aim of accelerating the convergence of the approximants. The success of these strategies can be understood in terms of the analytic structure of the Borel transformed series. The model independent acceleration methods simplify this structure either by suppressing the residue of some poles or by reducing their multiplicity.
A similar suppression of the poles is also found in the Borel transform of δ (0) which automatically leads to a more regular series that is more amenable to approximation by PAs. We have exploited this fact to show that, in large-β 0 , the PAs formed to the α s expansion of δ (0) and the D-log PAs constructed to its Borel transform B[δ (0) ] are the optimal model-independent way of extracting the higher-order coefficients of the Adler function.
We have then applied the same procedure to full QCD. The excellent "postdiction" of the coefficient c 4,1 , which is known since 2008 [21], as well as the striking stability of the results gives us confidence that the method also works in QCD. From PAs to the fixed-order expansion of δ (0) and D-log PAs to B[δ (0) ] we extract the final results of this paper, given in Tab. 6 and Eq. (47). These results allow us to reconstruct very reliably the perturbative expansions of δ (0) up to at least the tenth order. Finally, Fig. 8 shows that our model-independent reconstruction of the higher-order series coefficients favours FOPT as the best procedure to set the renormalization scale at and around the τ mass.
We should remark that our final results are similar to the model-dependent reconstruction of the series put forward in Ref. [18] and further discussed in Ref. [25]. This lends support to renormalon model used in these works, even though the use of PAs show that one should be careful when interpreting the parameters of such models. In the case of the renormalon models, the analytic structure is completely fixed, the only freedom is left to the residues. Therefore, we should expect that these are "effective residues", in the spirit of Padé-type approximants, and can only be compared with the true residues in a hierarchical way, since they must account for the infinite tower of poles that must be mimicked by the model.
Apart from providing reliable estimates for the higher-orders coefficients and indicating that FOPT is preferred, our results could be the basis for an α s extraction based on the Borel sum of δ (0) . The fact that the results of Eqs. (47)(48)(49)(50)(51) are so close suggest that it may be realistic to do so. We also intend to investigate further the analytic structure of B[δ (0) ] in QCD, since the non-trivial result of Eq. (23) plays a central role in our analysis. The simplicity and flexibility of the method here developed suggests it could also be used to further explore non-perturbative contributions in the context of α s determinations.