Fully-differential Top-Pair Production at a Lepton Collider: From Threshold to Continuum

We present an approach to predict exclusive $W^+bW^-\bar{b}$ production at lepton colliders that correctly describes the top-anti-top threshold as well as the continuum region. We incorporate $t\bar{t}$ form factors for the NLL threshold resummation derived in NRQCD into a factorized relativistic cross section using an extended double-pole approximation, which accounts for fixed-order QCD corrections to the top decays at NLO. This is combined with the full fixed-order QCD result at NLO for $W^+bW^-\bar{b}$ production to obtain predictions that are not only valid at threshold but smoothly transition to the continuum region. Our implementation is based on the Monte Carlo event generator WHIZARD and the code TOPPIK and allows to compute fully-differential threshold-resummed cross sections including the interference with non-resonant background processes. For the first time it is now possible to systematically study general differential observables at future lepton colliders involving the decay products of the top quarks at energies close to the pair production threshold and beyond.


Introduction
The top quark is the heaviest elementary particle in the Standard Model and thus provides a unique window to new physics models in which couplings are related to mass. Top quark physics is therefore one of the corner stones in the physics program of all future lepton colliders. One of the most important measurements is the scan of the top-anti-top resonance lineshape which will enable top mass and width measurements with unprecedented precision. The top pair production cross section near threshold and in the transition region to the continuum is also sensitive to the couplings of the top quark, like α s or the top Yukawa coupling. Reliable theory predictions in the threshold region crucially require the resummation of Coulomb singular (α s /v) n terms to all orders in perturbation theory, where here and throughout this paper v denotes the (effective) velocity of the top quarks in the centerof-mass (c.m.) frame. In the threshold region we have v ∼ α s ∼ 0.1, which requires a simultaneous expansion in both parameters. This indicates that bound-state effects become important despite the fact that the top quarks decay before they can form a would-be toponium state. Furthermore, in the threshold region the concept of on-shell top quark production loses its meaning and all kinematic configurations in the resonance region governed by the top width are equally important. We refer to this kinematic configuration as "resonant tt production". 1 The resummation of the singular velocity terms is performed within the effective theory nonrelativistic QCD (NRQCD) [1,2] by solving a Schrödingertype equation for the propagation of the top-anti-top system. Cross section calculations in the NRQCD framework are therefore always understood to include the resummation of the Coulomb singular terms to the stated order. In this context the "fixed-order" label is used to distinguish from a renormalization-group improved (RGI) calculation.
To be explicit concerning the nonrelativistic power counting in the threshold region, we write the normalized total cross section (R-Ratio) schematically as (α s ln v) i × 1 (LL); α s , v (NLL); α 2 s , α s v, v 2 (NNLL); . . . . (1.1) The overall factor of v emerges from the nonrelativistic phase space integration. In the respective fixed-order counting (i.e. N k LO instead of N k LL) powers of ln v are considered of order unity, so in the nonrelativistic power counting convention predictions at N k LL order fully include all N k LO contributions. So far, theory predictions for the tt threshold have mainly focused on the total production cross section. For the fixed-order NRQCD expansion the total cross section is known at NNLO for a long time [3,4], and since recently also to N 3 LO [5]. Besides the Coulomb singularities, close to threshold also logarithms of the velocity become sizable requiring the additional resummation of (α s ln v) n terms to all orders as indicated in Eq. (1.1). Modified versions of NRQCD, namely the potential NRQCD (pNRQCD) [6,7] and the velocity NRQCD (vNRQCD) [8][9][10] frameworks, allow the systematic resummation of these logarithmic terms via renormalization group (RG) equations, where both frameworks differ with respect to the separation of soft and ultrasoft modes. 2 Regarding RG improved predictions of tt threshold production the result of Ref. [11] is currently state-of-the-art combining NNLO fixed-order corrections with resummation of velocity logarithms at NNLL order. 3 For the threshold production of resonant top quarks the vector (S-wave) and axial-vector (P-wave) form factors differential in the top three momentum and the c.m. energy are available at NNLO and provided from the numerical code Toppik [13][14][15]. In the present work we will use Toppik in order to implement the Coulomb resummation in our calculation of off-shell top-pair production, as described below.
Once electroweak effects are included, the natural power counting scheme near threshold also accounts for the electromagnetic coupling α em and reads v ∼ α s ∼ √ α em ∼ 0.1.
Here the dominant effect is related to the top decay itself, which for the total inclusive cross section determined via the optical theorem enters the NRQCD prediction at LO simply via the shift √ s → √ s + iΓ t in the top propagators [16,17]. The width receives sizable QCD corrections, which must be taken into account at NLO. In addition there are non-resonant electroweak contributions to W + bW −b production, e.g. where at least one of the top propagators is non-resonant or absent. As was pointed out already in Ref. [18], these start contributing at NLO because instead of the overall nonrelativistic phase space factor v shown in Eq. (1.1) these contributions have at least one power of the electroweak coupling constant α em . In Refs. [18,19] the associated NNLO interference effects and a renormalization group method to resum large NLL phase space logarithms were provided 2 We will often refer to the nonrelativistic theory used to perform the threshold resummation as NRQCD in general. The results for the total cross section obtained in vNRQCD and pNRQCD agree at NLL. 3 A certain class of soft NNLL (mixing) terms starting at N 4 LO are still missing in this calculation. There is, however, good evidence that these are negligible compared to the fully known ultrasoft contributions and the residual uncertainty of the result [11,12].
accounting also for cuts on the top/anti-top invariant masses. The full set of non-resonant non-logarithmic corrections were given in the fixed-order expansion at NLO and NNLO in Ref. [20] and Ref. [21], respectively, (see also Refs. [22,23]) for the total W + bW −b cross section and including top/anti-top invariant mass cuts. The dominant purely hard electroweak corrections to resonant nonrelativistic top pair production are from QED initial state radiation enhanced by logarithms log(s/m 2 e ), which are treated in this work in the structure function approach. All remaining ones carry the overall nonrelativistic phase space factor v shown in Eq. (1.1) as well as at least one power of the electroweak coupling constant α em and therefore contribute only at NNLL level [24,25]. The impact of non-resonant contributions for measurements of the top quark mass was analyzed in Ref. [26] focusing on the effects of single top production which represents the most important non-resonant correction.
Adopting the nonrelativistic power counting, the full Born-level relativistic W + bW −b production cross section contains the complete non-resonant contributions starting at NLO. Its O(α s ) full QCD corrections contain the complete non-resonant contributions starting at NNLO. It should be noted that the above counting scheme accounting for electroweak effects applies in a small ∼ 10 GeV window around threshold, where the relation v ∼ α s 1 holds.
Above threshold, fixed-order full QCD corrections to the vector and axial-vector current contributions to (on-shell) tt production have been computed inclusively to N 3 LO [27,28] and differentially to NNLO [29,30]. The off-shell process (with W + bW −b as the final state), which is also defined below threshold, has been studied at NLO in Refs. [31][32][33]. Pure QCD fixed-order results without Coulomb resummation are, however, only reliable in the relativistic continuum sufficiently away from the tt threshold. In order to distinguish the relativistic QCD fixed-order power counting, which refers to powers of the strong coupling only, from the nonrelativistic power counting, from here on we always supplement the relativistic counting with the prefix "QCD". 4 Despite their sophistication the current results for the top pair production are not yet suitable to address a number of key issues relevant for a fully realistic assessments of the top quark related measurements at a future lepton collider. First of all, up to now a quantitative analysis addressing down to which c.m. energy threshold-resummed calculations can be neglected and pure fixed-order continuum results can be used reliably has been missing. Vice versa, it has not been shown how far one can trust a threshold-resummed prediction when increasing the c.m. energy away from the nonrelativistic limit. To approach these questions one can construct a matched computation for e + e − → W + bW −b that combines the threshold-resummed and the fixed-order continuum computations that is valid for all √ s and study its theoretical uncertainties. This is especially important for the proposed 380 GeV stage of the Compact Linear Collider (CLIC) proposal, where one can probe the threshold region due to photon radiation off the initial state, but also for the 350 GeV staging of the International Linear Collider (ILC) and the newly devised 350 GeV staging of CLIC. The convolution of QCD threshold predictions with initial-state radiation and realistic beam spectra have been already studied in Refs. [34,35]. However, these results are only reliable as long as the QCD threshold predictions in the convolutions are employed in a small ∼ 10 GeV window around threshold, where the assumption v ∼ α s 1 holds. Depending on the shape of the beam spectrum, this assumption can be questioned, and a fully matched prediction is therefore also quite desirable.
The second issue concerns the fact that theoretical predictions that provide a full description of the W + bW −b final state and in addition account for the top pair threshold dynamics have not been provided up to now. Currently theoretical examinations are known for the total inclusive cross section [3-5, 10, 11, 15, 36], the inclusive top three-momentum distribution [13,15], the top quark polarization [37,38] and for the dependence of the total cross section on cuts on the invariant mass m bW of the b-W system arising from the top quark decay [19,20,23,39]. However, despite the fact that the tt measurements at threshold can be fairly inclusive, the physical final states are still the leptonic, semi-leptonic and hadronic decay products. Thus, any inclusive measurement suffers from systematic uncertainties that result from extrapolating the measured cross section in the fiducial phase space to the full phase space. Here it is also important to have a full theoretical control of other W + bW −b production mechanisms such as single top production, which are characterized by kinematical configurations that differ substantially from resonant tt production [26]. These types of systematic uncertainties have not been explored systematically and are not accounted for in current experimental [34,35,40] analyses. They can only be addressed coherently by theoretical predictions that at the same time describe the full W + bW −b final state and account for the top pair threshold dynamics.
In this paper, we present an approach which allows to address the issues above by providing a matched computation for e + e − → W + bW −b correct to QCD-NLO and including threshold resummation with NLL precision. For observables that are inclusive concerning the top-anti-top double-resonant portion of the W + bW −b phase space we achieve QCD-NLO as well as NLL precision with respect to QCD effects. This applies to the total cross section, but also includes the total cross section with moderate acceptance cuts, e.g. related to the reconstructed invariant mass of the top quarks or the typical event selection procedures. To this end, we devise a master formula that matches the nonrelativistic computation in the threshold region with the relativistic fixed-order result in the continuum carefully avoiding any double counting of terms at QCD-NLO and NLL order. We also maintain all irreducible backgrounds of W + bW −b and the interference of resonant (i.e. involving a tt pair) and non-resonant (i.e. not involving a tt pair) W -b production in the threshold region, hereby reaching NNLO precision concerning these electroweak corrections in the threshold region at the differential level. We construct a manifestly gauge-invariant result by applying an extended double-pole approximation for the threshold-resummed top pair production form factors, which is also defined below the kinematical threshold. 5 To achieve a QCD-NLO description of the top quark decay for these threshold-resummed contributions, we include the corresponding QCD-NLO corrections in the on-shell approximation.
Concerning fully differential predictions in the threshold region our approach is limited concerning the treatment of real and virtual ultrasoft gluon radiation in the threshold region and concerning virtual potential-type longitudinal gluon exchange involving the final state b quarks. Both effects cancel at NLL order for the total inclusive cross section [42,43] and also when acceptance cuts are applied that do not resolve the top-anti-top doubleresonant portion of the W + bW −b phase space [19,20]. For many other types of distributions including top quark spin measurements, however, they give additional non-trivial NLL corrections which are typically at the level of up to 10%, see e.g. Ref. [44]. A coherent treatment of these NLL effects in the context of our matched setup is postponed to future work. Thus, at the fully differential level our current results are strictly valid at LL order in the threshold region. In any case, the first gluon emission at QCD-NLO order is by construction fully accounted for in our matched prediction for all kinematic regions, and we expect that a significant part of the NLL corrections for fully differential observables are therefore included. We also note that in the context of this work we do not account for the summation of phase space logarithms coming from top-anti-top unstable particle effects [18] and for Coulomb potential corrections due to photon exchange. The latter constitute the dominant electromagnetic corrections for the top-anti-top threshold dynamics contributing at NLL order and can be trivially implemented. The former come from phase space divergences in the nonrelativistic calculation and constitute NLL effects as well. Their treatment is also postponed to future work. With a completely differential and matched description for the threshold region at hand, many kinematic observables and distributions in the threshold region can be studied coherently for the first time. This may also allow to systematically access alternative methods to determine the top mass in the threshold region besides the paradigmatic totally inclusive energy scan. Furthermore, it is possible to study observables in the intermediate region between threshold and high-energy continuum. In this paper, we will discuss a small number of kinematic distributions at the exclusive level for the matched setup as a proof of principle, but defer specific phenomenological studies for top threshold measurements to future publications. Here, we concentrate on the presentation and validation of our method for matching fixed-order QCD and resummed threshold corrections.
We note that apart from subtracting terms related to double counting with respect to the QCD fixed-order and threshold-resummed computations, an essential ingredient for a fully matched computation that smoothly covers threshold, intermediate and continuum regions is that the terms resummed in the nonrelativistic part are switched off away from the threshold region. This is necessary since the resummed terms determined in the nonrelativistic expansion do not naturally transition into relativistic expressions. Since there is no unique way to implement the switch-off, the intermediate region between nonrelativistic and relativistic domains carries an additional theoretical uncertainty, which has to be estimated carefully. This uncertainty, however, decreases with the order of the nonrelativistic and relativistic expansions as long as both converge [45].
We also note that the computations presented in this work obviously do not have the -5 -highest precision currently available for the inclusive total cross section concerning the QCD corrections in the threshold region. On the other hand, our results are novel for the phenomenologically more realistic differential final states and especially in the intermediate region between threshold and continuum. It would be straightforward to augment our computations with K-factors such as K NNLL = σ NNLL /σ NLL to increase the precision for the inclusive total cross section. We also remark that there have been previous analyses that have implemented nonrelativistic resummations within relativistic calculations. In Refs. [46,47] NLL nonrelativistic resummations were embedded into a relativistic calculation of the Higgs energy spectrum for e + e − → ttH at NLO. These results were inclusive with respect to the top quark decays. In Ref. [48] nonrelativistic resummation form factors have been embedded into a factorized relativistic tree-level computation for e + e − → ttH → ¯ + X with the aim of examining the Higgs CP properties from the leptonic angular correlations. Finally, in Ref. [49], the authors incorporated tt threshold effects in a fully differential Monte Carlo (MC) simulation at hadron colliders which combined NLO threshold resummations with all resonant and non-resonant diagrams for W + bW −b production at tree-level by multiplying signal diagrams with nonrelativistic form factors. Their approach is similar in scope to ours as it provides a description in all kinematic regions. However, they accounted for QCD-NLO relativistic corrections only through K-factors in the tt invariant mass distribution (determined for on-shell top quarks) rather than by including the full set of QCD-NLO corrections for W + bW −b production. From a systematic point of view their work represents a consistent leading order treatment.
The outline of this paper is as follows. In Section 2, we review the NRQCD derivation of the S-and P-wave form factors that are the key ingredients in our implementation of the threshold resummation. The gauge-invariant embedding of these results within the relativistic setting in Whizard using an extended double-pole approximation for signal diagrams is discussed in Section 3. In Section 4, we verify that this implementation works as expected by comparing to known results. In Section 5 we discuss the necessary ingredients for the matching of the NLL resummations at threshold and the full QCD-NLO relativistic results for W + bW −b production, and we study inclusive and differential results in Section 6 and Section 7, respectively. We summarize our findings and give an outlook to further developments and opportunities in Section 8.

Threshold resummation
The basis of our approach to implement the threshold resummation in a MC event generator are the nonrelativistic S-and P-wave form factors. They describe resonant tt production through the vector and axial-vector currents in the presence of the bound state effects. The latter are related to the resummation of Coulomb singular (α s /v) n terms as well as the resummation of the velocity logarithms ∝ (α s ln v) n determined within the NRQCD framework. As the form factors are fully differential concerning the bound state dynamics, are by themselves gauge invariant and do not depend on issues related to the final state -6 -particles originating from the decays, they represent suitable building blocks for a MC implementation.
In this section we review some details on the derivation of these form factors at NLL order using the vNRQCD [8][9][10] framework. This forms the basis for the theoretical setup used for their implementation in Whizard. We also briefly review the standard calculation of the total inclusive threshold cross section using the optical theorem for later reference. For brevity, throughout this section we denote the top pole mass by m ≡ m t .

Form factors in vNRQCD
The effective field theory (EFT) vNRQCD involves soft, ultrasoft and potential degrees of freedom whose dynamical effects are correlated via the nonrelativistic heavy quark dispersion relation E ∼ p 2 /m which relates the ultrasoft and soft energy and momentum scales. The typical four-momenta (q 0 , q) of the soft, ultrasoft and potential degrees of freedom scale like (mv, mv) , (mv 2 , mv 2 ) and (mv 2 , mv), respectively. Hard fluctuations with momenta (q 0 , q) ∼ (m, m) and non-resonant modes, e.g. with momenta (q 0 , q) ∼ (mv, mv 2 ), are integrated out. The vNRQCD Lagrangian [8] reads is the (color-singlet) Coulomb potential and we only display the terms relevant to describe the dynamics of a heavy quark pair in the color singlet configuration at NLO. We have suppressed higher order terms in the v expansion as well as color indices. The vNRQCD field operators ψ p (x) and χ p (x) annihilate the heavy quark and anti-quark, respectively. The (discrete) label p denotes their soft three-momentum, while their position argument x (which is suppressed in Eq. (2.1)) is the Fourier conjugate to their ultrasoft four-momentum components. All momenta are defined in the center-of-mass frame of the heavy quark pair. At NLL (and NLO) the effects of soft interactions encoded in L soft can effectively be accounted for by adding a term [50] to the coefficient of the Coulomb potential operator in Eq. (2.1): Here and in the following, µ S denotes the soft renormalization scale. For unstable top quarks the inclusive effects of the decay width are taken into account by adding to the Lagrangian in Eq. (2.1), which is sufficient to determine the form factors at NLO. In the EFT, Γ t is formally an input variable and can be set to a measured value or to the width computed in the Standard Model (SM) at the desired order. In our MC implementation, the width terms contribute to the virtual unstable top and anti-top quark lines contained in the resummed diagrams, while we describe the top decay fully differentially as discussed in Section 3. For consistency it is therefore important that the approximation and the parameters used for the width Γ t are equivalent to those used for the differential computation of the decays. In other words, Γ t must agree with the total width computed with our program for the process t → W + b to obtain the correct normalization of the total cross section. Besides the operators in the vNRQCD Lagrangian we require nonrelativistic currents that produce the heavy quark pair. They couple to the virtual photon or Z boson in the schannel tt production process and are considered external from the EFT point of view. For the purpose of this paper we need the leading vector (S-wave) and axial-vector (P-wave) currents in the v expansion. They are defined by respectively.
We can now form the operator matrix elements where the first prefactor comes from the average over color and spin and where the soft momentum variables p and p are now continuous by including the residual (ultrasoft) three-momenta of the quark fields. The functionsG 1/3 in Eqs. (2.7) and (2.8) represent the S-and P-wave Green functions, respectively, of the Schrödinger equation for the quarkanti-quark system in momentum space [15,51]. They describe the propagation of a heavy S/P-wave quark-anti-quark bound state with nonrelativistic energy The relative three-momentum between the quarks is 2p in the initial and 2p in the final state. Integrating over p is equivalent to producing the heavy quark and anti-quark at the same space-time point (at zero distance), i.e. from a local current as illustrated in Figure 1. Accordingly we define the amputated S/P-wave vertex functions with the free top-anti-top propagator (i.e. without potential interactions, V c = 0) given by (2.12) The inverse of G f (E, k) in the definition of the amputated vertex functions removes the contribution of the resonant external legs of the diagrams in Figure 1, such thatS(E, p) = 1 + O(α s ) andP (E, p) = 1 + O(α s ), respectively. 6 The vertex functions in Eqs. (2.10) and (2.11) are key ingredients to our vector and axial-vector form factors and we will need them to NLO of their nonrelativistic expansion. Beyond that order, at NNLO and beyond, one also has to take into account subleading currents, e.g. describing D-wave or v 2 -suppressed S-wave effects. The vNRQCD equations of motions for the amputated vertex functions take the form of Lippmann-Schwinger equations [15]: (2.14) Solving these equations resums the contributions from the Coulomb interaction diagrams and the associated Coulomb singularities as illustrated in Figure 1 to all orders.  (2.14) can be separated. The three-dimensional integral equations thus effectively become one-dimensional, with an 6 Here we have introduced the notationS andP for the amputated vertex functions to distinguish them from the non-amputated vertex functions S and P of Ref. [15] that are defined without the factor of G f (E, k) −1 .
-9 -additional non-trivial angular integration remaining in the kernel, which can be performed analytically. For the numerical solution, the one-dimensional integral equations are then written in discretized form as sums over a suitably chosen grid of momenta. This transforms the two integral equations into two systems of linear equations which can be solved directly by matrix inversion methods. The manifest (but integrable) singularities at momenta p = k in Eqs. (2.13) and (2.14) are avoided by subtracting and adding backS(E, k), and respectivelyP (E, k), under the integral. The resulting additional integrals on the RHS can be calculated analytically, while the subtraction removes the manifest singularities in the integrand/summand. In Toppik, to improve efficiency and accuracy, momentum grids have been chosen based on Gauss-Legendre integration methods. To achieve the required accuracy ( 1%), a grid of typically 600 points is then sufficient. It should be noted that this method is extremely robust in the case of the S-wave, where, even in the case of the long-range Coulomb interaction, the convergence properties ofS(E, k) guarantee a finite integral of the momentum distribution |S(E, k)| 2 even for unphysically large, nonrelativistic momenta up to infinity. In contrast, the corresponding integral over the P-wave momentum distribution |P (E, k)| 2 is ill-defined without a cut-off. While the solution of Eq. (2.14) as described is still possible, the worse convergence behavior of the P-wave as compared to the S-wave leads to numerical instabilities and hence a somewhat limited accuracy of the method as implemented in TOPPIK. The problems arise from the fact that in order to achieve a good accuracy for the un-regularized vertex function, very large momenta must be sampled in the grid. These in turn lead to instabilities at very small momenta. In practice, as the behavior at small momenta is known, this problem can be fixed easily and the method remains powerful. As a cross check of the numerical solutions by Toppik and to estimate their accuracy, we have inserted the solutions into the RHSs of the Lippmann-Schwinger equations of Eqs. (2.13) and (2.14) and compared the result to the original results. We found that for our numerical P-wave vertex function, both sides of Eq. (2.14) agree within 1% for all relevant energy values. This numerical precision is sufficient for our purposes. For the S-wave the precision is better by roughly an order of magnitude.
In the vNRQCD framework the nonrelativistic vector and axial-vector current operators in Eqs. (2.5) and (2.6) are multiplied with the Wilson coefficients c 1 (h, ν) and c 3 (h, ν), respectively, which account for the summation of the velocity logarithms ∝ (α s ln v) n . They depend on the matching parameter h [11] and the vNRQCD renormalization parameter ν, the so-called subtraction velocity [8], see Section 2.4. The matching of the EFT currents to the full theory vector and axial-vector currents is performed at the (hard) matching scale µ H ≡ hm, i.e. at ν = 1 and for a choice of h close to unity. This gives at one loop These matching coefficients are by now in fact already known to O(α 3 s ) [52] and O(α 2 s ) [53], respectively. At LL order the anomalous dimensions of the currents vanish, such that their -10 -RG running at this order is trivial [8]: The NLL evolution of c 1 has been computed in Refs. [10,54]. Also the (dominant) NNLL contributions are known [55,56], see also Ref. [57]. The NLL result for c 3 can be found in Ref. [58]. For completeness we quote the NLL expressions for c 1,3 in Appendix A. Setting being the effective velocity of the heavy quarks, resums large velocity logarithms ∼ (α s ln v) n from the production process into the Wilson coefficients c 1,3 (h, ν) and the soft coupling α s (µ S ≡ hmν) in the vertex functions. 7 With the ingredients discussed before we can now define the vector and axial-vector form factors that we will use to implement the threshold (Coulomb and log) resummations in Whizard (see Section 3): The superscripts V , A stand for the vector (S-wave) and axial-vector (P-wave) current, respectively, and we have listed the renormalization scaling parameters h and ν as arguments to be explicit. 8 We stress that the only kinematic variables the form factors depend on are the modulus of the heavy quark three-momentum |p| and the total nonrelativistic energy of the quark-pair E.
The LL expressions read The precise definition of the mass parameter m in the renormalization scales µH = hm and µS = hmν is not important physically since h and ν are varied anyway. In Section 2.4 we will use the 1S mass M 1S t instead of the pole mass m in the parameterization of µH and µS for convenience. 8 At higher orders (beyond N 3 LO) there will also be contributions associated with other partial waves.
For later reference we also expand the NLL form factors to first order in α s : (2.24) The first two terms in these expressions are the expansions of Eqs. (2.21) and (2.22) and the last terms come from the matching coefficients in Eqs. (2.15) and (2.16), respectively.

Inclusive cross sections
For validation purposes we will often refer to the direct analytic computation of the tt threshold production cross section in NRQCD via the optical theorem and compare to its results. These predictions are only valid in a small √ s range of a few GeV around the threshold, but have already reached NNLL [11] (see also Refs. [36,51]) and N 3 LO [5] level.
Here we shall briefly review the vNRQCD NNLL calculation following Refs. [11,51]. The total inclusive cross section for tt production can be written as where the prefactors f v and f a account for tree-level γ and Z exchange and are given e.g. in Ref. [51]. In the full theory (SM), the vector/axial-vector R-ratios can be computed via with q = ( √ s, 0) and j v/a µ being the vector/axial-vector currents that produce a topanti-top pair. In the effective theory these currents are replaced by their nonrelativistic counterparts and up to NNLL we have Here the effective current correlators are defined by where the top-pair is produced and annihilated both at zero distance. Taking the imaginary part in Eq. (2.27) corresponds to cutting through these zero-distance correlators in all possible ways. The currents operators O p,1 and O p,3 are given in Eqs. (2.5) and (2.6) and O p,2 is the subleading S-wave current suppressed by two powers of v as given in Ref. [51]. The calculation of the A i correlators is detailed in Refs. [10,51,59] and based on Toppik [15] as far as the (Coulomb) contributions to A 1 at leading order in v are concerned. For NNLL precision of the total cross section the NNLL expression for the current coefficient c 1 [56] and the LL expressions for c 2 and c 3 [51] are required. In Ref. [19] this approach to compute the total cross section was extended to allow also for (moderate) cuts on the reconstructed top invariant masses. We will validate our threshold resummation implementation in Whizard against such inclusive threshold predictions with invariant mass cuts in Section 4. Concretely, we compute "analytic" 9 inclusive cross sections with top invariant mass cuts at (N)LL using the formula (2.29) As for the total inclusive cross section in Eq. (2.25) the axial-vector current (P-wave) first starts to contribute at NNLL. The form factor F A is therefore not required here. For our validation purposes, we only consider the top decay at LO, i.e. Γ t = Γ LO t . The integration in Eq. (2.29) is over the nonrelativistic invariant mass variables which represent the nonrelativistic expansion of the top and anti-top off-shellness variables p 2 1,2 − m 2 . Here, p 1,2 , E 1,2 and ±p are the four-momenta, the kinetic energies and threemomenta of the top and the anti-top, respectively. In general we have E 1,2 = p 2 2m since the resonant top and anti-top quarks can be off-shell. The integration region in (t 1 , t 2 )-space is given by The first inequality in Eq. (2.31) defines the phase space cut and the second one represents a kinematic constraint. In order to relate the Λ-cut on t 1,2 to the (usual) cut on the relativistic invariant masses, The ellipses stand for terms suppressed by additional powers of ∆ mt /M 1S t and M 1S t v 2 /∆ mt , while the mass parameter M 1S t ∼ m is defined in the next section.

Top mass parameter
It is well known that the pole mass parameter m t in QCD is plagued by an intrinsic renormalon ambiguity of O(Λ QCD ), see Ref. [60,61]. In order to avoid an unnecessary theory error one should therefore use a suitable renormalon-free short-distance mass parameter as input in calculations that strongly depend on the heavy quark mass. This is particularly important for tt threshold production, where the shape of the cross section (peak position), is very sensitive to the top mass.
In this work we employ the 1S top quark mass scheme [15], where the top mass is defined as half of the mass of the fictitious 3 S 1 toponium ground state for stable top quarks. The pole mass can then be expressed order by order as a function of the 1S mass parameter M 1S t . At (N)LL + QCD-(N)LO the relation we need is [51] and For our (SM) process the color constants are C A = 3, C F = 4/3, T F = 1/2 and we use n f = 5 for the number of active fermion flavors throughout this work. We note that there is a mismatch in the order counting for short-distance masses suitable for the top-anti-top threshold and in the high energy continuum. This can be seen in Eq. (2.34) for the definition of the 1S mass, where ∆M starts with O(α 2 s ). This is in contrast to the definition of the MS mass which is suitable at high energies and starts to differ from the pole mass at O(α s ). So, using the 1S mass in the continuum at very high energies far above threshold is not appropriate. This problem can be resolved [45] by using a subtraction-scale dependent short-distance mass that interpolates between relativistic and nonrelativistic counting such as the MSR mass [62,63]. A full implementation of this approach in our MC framework is postponed to future work.

Renormalization and matching scales
As outlined in Section 2.1 the vNRQCD expressions for the form factors in Eqs. (2.19) and (2.20) depend on the hard matching scale µ H and the soft and ultrasoft renormalization scales µ S and µ US , respectively. 10 In Section 2.1 we have already introduced the matching parameter h and the subtraction velocity ν. The three matching/renormalization scales can be parametrized by the terms h and ν [11], respecting the natural correlation between soft and ultrasoft scales: By setting h ∼ 1 and ν ∼ |v| we make sure that large logs ∼ (ln v) n are resummed in the (N)LL form factors. Following [11], we define the (energy-dependent) default value for ν: . In Ref. [11], a detailed study of uncertainties of the total cross section (with invariant mass cuts) at (N)NLL has been performed. In the present work, we will adopt the same conventions for the combined scale variations. These will play a major role in assessing the theoretical uncertainties of our results, see Section 5.3. To this end, we define the renormalization parameter f by ν = f ν * .
The h variation is equivalent to a correlated variation of all three scales, while f variation only affects the soft and ultrasoft renormalization scales allowing the ultrasoft scale µ US to take values between 1/2 and 2 times its default value. These constraints maintain the natural scale hierachy, and the small offset (0.05) in Eq. (2.39) ensures that the renormalization scales always remain in the perturbative regime: µ S > µ US > 0.01 M 1S t . Within the form factor expressions the strong coupling constant is evaluated at the three different scales µ H , µ S and µ US . For convenience we therefore define (2.42) The hard coupling α H is determined from α s (M Z ), which is an input parameter in our calculation, using three-loop running (n f = 5). The soft coupling α S is then obtained by running down from µ H to µ S using one-and two-loop running for LL and NLL precision, respectively. Analogously, we use one-loop running for the LL ultrasoft coupling α US , which enters the NLL form factors via the NLL current coefficients c NLL 1,

Implementation in WHIZARD
In this section we discuss how to implement the threshold summation encoded in the (N)LL form factors in Eqs. (2.19) and (2.20) into the framework of an event generator in order to combine it with a fully differential fixed-order continuum calculation. For this goal we use the multi-purpose event generator Whizard [64], which is a universal MC generator for lepton and hadron colliders. It has its own completely general matrix element generator for tree-level matrix elements, O'Mega [65,66]. The usage of an adaptive multi-channel phase space generation [67] allows the integration and simulation of complex hard processes with up to ten particles in the final state. Color quantum numbers are handled via the color-flow algorithm [68]. Two different parton shower algorithms (k T -ordered and analytic) [69] are available, while hadronization and hadronic decays have to be simulated with interfaced external tools.
Whizard is particularly well suited for the simulation of linear collider events as it allows for completely arbitrary beam polarization, has fully-inclusive soft photon corrections to all orders Ref. [70,71] and hard-collinear photon corrections up to third order Ref. [72,73] implemented in terms of an initial-state (ISR) beam function, and allows for the simulation of classical beamstrahlung. The latter -originating from ultra-collimated electron and positron bunches -is simulated via the dedicated CIRCE [74] subpackages. For precision simulations at QCD-NLO, Whizard uses virtual amplitudes from external one-loop providers (OLP) like OpenLoops [75], GoSam [76,77], or Recola [78,79]. Whizard uses the FKS subtraction formalism [80,81] which is completely automatized, also allowing to preserve resonance masses in the subtraction following the algorithm in Ref. [82]. The QCD-NLO capabilities of Whizard have recently been demonstrated in the fully off-shell leptonic top decays with and without additional Higgs radiation in e + e − collisions [33], building on earlier calculations for LHC physics [83,84]. In fact, for QED corrections matching between fixed-order and the structure function in the initial state have been implemented in Ref. [85].
Besides its extensive capabilities for simulating precision SM processes, Whizard supports many extensions beyond the SM (like e.g. supersymmetry, composite Higgs, Little Higgs etc.) either by direct implementations or via its interface to tools operating at the Lagrangian level like FeynRules [86].

General remarks on the implementation of the form factors
We implement and study different variants of embedding the form factors of the last section in the relativistic computation. Let us first remark that we can in principle modify the vectort γ µ t and axial-vectort γ µ γ 5 t couplings to the A and Z fields directly in the tree-level matrix element by multiplying them with the corresponding nonrelativistic form factors. This is a straightforward modification, and we refer to it as the signal diagram method. As explained in more detail in Section 3.2, in Ref. [49] an improved version of the signal diagram method was adopted to implement the nonrelativistic S-wave form factor.
However, this modification manifestly breaks gauge invariance for resonant (and offshell) top production already for the naive insertion in the tree-level matrix elements due to gauge cancellations among the tt signal diagrams (resonant) and the rest of the diagrams (non-resonant) describing W + bW −b production. The problem can also be seen at the Lagrangian level. The term in the SM Lagrangian containing the covariant derivativet (L) i / D t (L) , which includes thet γ µ t (and thet γ µ γ 5 t) vertex, is invariant under local gauge variations t → e −iQtgθ(x) t. This is no longer the case when we naively insert the nonrelativistic form factors for the vector and axial-vector currents and thus modify the covariant derivative (unless the tops are on-shell and the derivative cancels the mass term by construction).
It is preferable to multiply the form factors to a gauge invariant quantity. Let us consider a schematic factorized ansatz for the S-matrix element of double-resonant top pair production: Here the form factor only enters the on-shell production matrix element M prod and QCD-NLO corrections to the decay can be computed separately. We note that in Ref. [48] a similar ansatz was employed to study the CP properties of the Higgs boson from lepton angular correlations in e + e − → ttH → ( νb)(¯ νb)H for √ s = 500 GeV. We discuss the exact realization of Eq. (3.1) in Section 3.3, after a more general discussion of the possible violations of gauge invariance in the treatment of unstable particles in Section 3.2.

Possible violations of gauge invariance
The treatment of unstable particles such as the top quark is technically challenging from a perturbative point of view. The Breit-Wigner distribution of their invariant mass in the resonance region is a result of resumming absorptive self-energy corrections to the propagator, called Dyson summation. This procedure mixes perturbative orders from the point of view of a strict Feynman diagrammatic expansion. As gauge invariance can only be guaranteed order by order, the associated Ward, Slavnov-Taylor and Nielsen identities may be violated if the summation is carried out naively as the off-shell particle self energies are in general not gauge-invariant. While these violations are associated with higher orders, they can be made arbitrarily large by applying extreme gauge transformations [87].
Theoretically the problem is resolved by considering a simultaneous expansion in the couplings and the off-shellness q 2 − m 2 in the resonance region set by the particle width and by Dyson summing only gauge-invariant portions of the self energy. As the approach requires an additional expansion, the predictions unavoidably acquire an additional scheme dependence that is, however, of higher order and therefore consistent with the perturbative expansion. A possible guiding principle to identify the gauge-invariant parts of the unstable particle self energy is the fact that the complex pole p 2 = µ 2 , where µ 2 = m 2 − imΓ, of the propagator of an unstable particle is a gauge-invariant quantity [88,89]. This property is the basis of two approaches that we will use in this work: the complex-mass scheme and the pole approximation.
For calculations that involve complete matrix elements (i.e. not the factorized parts constructed according to Eq. (3.1)), intermediate unstable particles are treated in the complex-mass scheme [90,91]. The idea behind the complex-mass scheme is to add and subtract the width Γ (defined through the complex pole) in the bare Lagrangian. While one of the terms is absorbed in the complex renormalized mass definition, the other one adopts the role of a complex counterterm that is treated perturbatively order-by-order. 11 This leads to a gauge invariant treatment of finite width effects, as Ward or Slavnov-Taylor identities are exactly respected, while maintaining perturbative unitarity [92]. The pole approximation is used for the parts of the calculation that are based on the factorization ansatz in Eq. (3.1) and will be discussed in Section 3.3.
Aside from problems associated with the implementation of the width close to a particle resonance, it is essential to treat the resonant signal and the non-resonant background diagrams in a coherent fashion such that the respective gauge cancellations between them can take place. In fact, the smallest strictly gauge-invariant subset that contains the tt signal diagrams for W + bW −b production consists of all diagrams. In the language of Ref. [93], such a subset is called a grove and can be systematically constructed. Modifications of the signal diagrams such as the simple attachment of a form factor will, therefore, in general spoil gauge invariance as we have already pointed out above. As we show in Section 3.5, at tree-level these gauge cancellations become gigantic at high energies. In the threshold region, the dominant gauge cancellation concerning the signal diagram method is associated to contributions originating from the matrix elements and phase space integrations related to off-shell top and anti-top quark decays. This can also been seen from the fact that the imaginary part of the off-shell top quark self energy is gauge-dependent. 12 In Ref. [49] this imaginary part was used to construct a global correction factor that allowed to define modified signal diagrams that lead to gauge-invariant results at least for observables that are inclusive concerning the top and anti-top quark decays. We note, however, that when QCD-NLO corrections are considered, the gauge cancellations are uncontrollable and large at any energy as they also affect the structure of the QCD infrared divergences. At this level the signal diagram method becomes practially meaningless. For this reason and in order to also achieve manifest gauge-invariance for observables that are differential concerning the top and anti-top quark decays we incorporated the nonrelativistic form factors within a factorized approach as explained in more detail below.
To conclude this section, we remark that for the full SM tree-level electroweak diagrams used in this work we employ unitary gauge, while for determination of the QCD-NLO corrections we adopt t'Hooft-Feynman gauge for the gluon lines.

Factorization in the Double-Pole Approximation (DPA)
Different approaches to treating unstable particles close to and above threshold have been compared in Ref. [94] for W W production. Above threshold, the differences between the approaches in unitary gauge have been found to be at the per cent level. We will discuss here three of these methods and refer to them in the following as the narrow-width approximation (NWA), the signal diagram (SD), and the pole approximation (PA). The 11 The approach can be generalized by adopting other gauge-invariant definitions of Γ which differ from the complex pole definition by higher order terms. 12 We note that the real part of the off-shell top quark self energy is gauge-dependent as well. However, this contribution is part of the renormalized hard electroweak corrections to tt production and constitutes NNLO matching corrections to the nonrelativistic current Wilson coefficients c1 and c3 in Eqs. (2.15) and (2.16), respectively.
-18 -NWA is based on the on-shell tt production cross section and results in the simple formula σ = σ prod · BR, where BR is the corresponding on-shell particle branching ratio. The NWA is gauge-invariant, allows for factorizable corrections but incorporates no off-shell behavior and defines no cross section below threshold. The latter property is inherited from the on-shell production cross section σ prod . The SD approach singles out the resonant signal diagrams and includes the decay width in the unstable particle propagators. In the SD approach the signal process can be evaluated with off-shell tops and is finite below threshold. However, it is neither gauge-invariant nor suitable for computing QCD-NLO corrections, as already pointed out in Section 3.2. In fact, we have encountered negative cross sections when taking QCD corrections to the signal diagrams far above threshold into account. This is yet again related to the electroweak gauge-dependence of the signal diagrams.
The pole expansion scheme [87,95] was one of the first schemes to enable gaugeinvariant computations for kinematics where unstable particles are described close to resonance. To this end, a Laurent expansion of the full scattering amplitude is performed in the unstable particle off-shellness around the unstable particle pole keeping the pole and a number of higher-order non-resonant terms that vanish on-shell. The location of the pole in the complex plane, the pole residue and the non-resonant terms are by construction gauge invariant. In the PA [96], one drops the (higher-order) non-resonant contributions as a first approximation for computational simplicity. Thus, in the PA the gauge-invariant denominators of the unstable particle (Breit-Wigner) propagators are equivalent to the SD approach, but the numerators are constructed to be gauge invariant as well. This is achieved by projecting the (anti)particle momenta onto a suitable on-shell configuration in the calculation of the matrix elements, which allows to express the numerators as (fermion) spin sums projecting onto the corresponding on-shell (anti)particle state. The property of gauge-invariance at the amplitude level then follows directly from the gauge-invariance of on-shell production and decay matrix elements. The required on-shell projections cannot be uniquely defined and introduce an ambiguity in the results, which is of the order of the neglected non-resonant terms, i.e. O(Γ/m) [97]. -19 -In our case, we have to deal with two top-quark resonances and thus have to use a double-pole approximation (DPA) [98,99]. Diagrammatically, we depict the factorized computation as shown in Figure 2. The factorized matrix element in this approximation can be written as where h t , ht are the polarizations of the top quark resonances and µ 2 t = m 2 t − im t Γ t is the complex top quark pole, respectively. In our work we choose the top polarization states to be helicity states. The term {p} denotes a set of momenta that have been projected on-shell (concerning the top and anti-top quarks) such thatp t 2 ,p 2 t = m 2 t , while {p} represents the set of physical momenta with p 2 t , p 2 t = m 2 t in general. So, in M fact physical off-shell top momenta are still used for the top propagator denominators (and the event output). The details of the projection procedure are discussed in Section 3.3.2. We note that if p t and pt are on-shell, we can use the fermion spin sums over top (u) and anti-top (v) spinors to show that Eq. (3.2) is identical to the signal diagram at tree-level. We have used this property to verify the correct implementation of the factorized matrix elements: Within the numerical precision of the (complex) amplitudes we found perfect agreement for a given on-shell phase-space point and given external helicity.
Considering the fully relativistic four-body phase space, which probes also all possible top and anti-top quark off-shell regions, we stress that Eq. (3.3) and therefore the equality of M fact and signal diagram does in general not hold. Nevertheless, as we also show in Section 3.5 in the absence of QCD-NLO corrections, computations with signal diagrams are numerically quite close to the ones with M fact at least in the threshold region. When accounting for QCD-NLO corrections, however, using signal diagrams leads to results that are far off the correct physical predictions, as already mentioned above. The crucial aspect of the DPA/PA is that factorizable and nonfactorizable corrections are separately gauge-invariant. The latter arise e.g from the crosstalk among the top, anti-top decay and production subprocesses via (ultrasoft) gluon exchange. Depending on the (inclusiveness of the) observable the factorizable corrections to the production and decay matrix elements are often dominant. We will neglect nonfactorizable corrections when implementing the threshold resummation for Whizard predictions in this work, see Section 3.6.1. In passing, we note that the nonfactorizable one-loop corrections due to soft photon exchange are universal and have been given analytically for any number of unstable particles in Ref. [100].
The DPA expression in Eq. (3.2) is our preferred setup to single out the tt resonant production part for the W + bW −b cross section and will also be referred to as factorized, on-shell evaluated. Off-shell evaluated would correspond to replacing {p} with {p} and is only used as a test in sections 3.5.1 and 3.5.2. To include the nonrelativistic S-and -20 -P-wave form factors in this approach, we multiply them with the corresponding factorized production matrix elements. Since the nonrelativistic form factors are gauge-invariant, the resulting W + bW −b amplitudes are still factorizable and gauge-invariant. In this setup it is then straightforward to also include (hard) QCD-NLO corrections to the on-shell decay, as described in Section 3.6.1.

Helicity correlations
Concerning helicity correlations, Eq. (3.2) can be considered as complete as possible. This implies that in general there are also interference terms in the top and anti-top density matrices that arise in the squared factorized matrix element, where h t = h t or ht = h t in the helicity basis. This expresses the fact that the top and antitop quark polarizations in general differ from the helicity eigenstates. These off-diagonal contributions in the spin density matrices are known to be sizable at threshold [14,38], but at the implementation level, they are currently not yet available in the one-loop provider OpenLoops, which we use to obtain the QCD-NLO virtual corrections for the top decay.
For the time being this is a limitation, which we intend to lift with future releases of OpenLoops. From Whizard, by default events at QCD-LO for the factorized processes are generated using the full spin correlations. But there is also the option to switch off the off-diagonal entries in the helicity basis completely, in accordance to the available terms from OpenLoops at QCD-NLO. For spin-independent observables the off-diagonal entries in the spin density matrices are irrelevant and the current implementations (also described in the next paragraph) provide the correct (and equivalent) results. For testing purposes in comparisons with previous spin-independent results, we thus define the following helicity approximation (HA), which covers the diagonal correlation entries (for h t = h t or ht = h t ), but neglects all off-diagonal entries (for h t = h t or ht = h t ): (3.5) The HA allows for predictions where in addition to the measurement of interest the top and anti-top helicities are measured as well. We may neglect correlations even further concerning comparisons with analytic results for the total cross section, obtained as described in Section 2.2, where predictions are made for spin-averaged states. To this end we apply an uncorrelated average to the decay matrix elements which we call the extra-helicity approximation (EHA): So, in the EHA all spin correlations between production and decay are removed entirely. We stress that for physical applications, in all the above approximations, it is implied that the external helicities of e + e − → W + bW −b are summed over, as we only investigate unpolarized configurations in this paper.

On-shell projection
A generic algorithm and formulae to obtain on-shell projected momenta for any number of resonances can be found in Ref. [100]. In our case, the expressions simplify tô where e t = p t / p t is the top flight-direction in the collision system determined from the original final state momenta of the process. Again, we denote on-shell projected momenta with a hat, and they fulfill by definition the on-shell conditionŝ Our definition of e t guarantees that the projection leaves final state momentum configurations that correspond to on-shell top quarks unchanged. It also maintains spatial correlations, which are important for example for the forward-backward asymmetry. Furthermore, it is crucial to retain the three-momentum directions for the interference terms of the factorized with the full amplitude in Section 5.1. We note that the projection in Eq. (3.7) cannot be applied literally in this form over the whole physical kinematic range. Below threshold, √ s < 2m t , it is meaningless as it would yield complex momenta. Thus, it is only defined for √ s > 2m t . The resummed computation, however, reaches its peak at √ s ≈ 2M 1S t < 2m t . Therefore, we define the extended DPA as follows: For √ s > 2m t , the extended DPA is identical to the normal DPA.
For √ s ≤ 2m t , we project to a set of momenta that correspond to √ s = 2m t + , where is a very small number that is introduced to avoid potential numerical instabilities in the matrix elements, that occur forp t = 0. The three-momentum direction e t is defined in the same way as above threshold, i.e. it is unchanged. This extended DPA transitions smoothly into the normal DPA at threshold and, crucially, provides finite and gauge-invariant results below threshold.
Having definedp t andpt, we also have to project the momenta of the decay products. Let us e.g. discuss the case of the top decay. At the Born level, this is a simple 1 → 2 decay with the well-known kinematics [100,101] in the frame wherep t = (m t , 0), called the top rest frame in the following. Analogous to the top three-momentum, we choosep W = p W e W , where e W is the actual direction of the W + in the top rest frame for the given kinematics (generated by Whizard). Note that this also ensures that the flight direction of the b quark is unchanged in the top rest frame. These momenta are then boosted back to the collision (lab) frame. The analogous procedure is applied to the anti-top decay products. The QCD-NLO case, involving the 1 → 3 decay with an additional gluon, is discussed in Section 3.6.

Input parameters
Here we list the SM quantities that enter our calculation as input parameters together with their default values we use throughout this work. The top 1S mass plays a special role in this work and we choose Unless stated otherwise the other default values are taken from Ref. [102]. Namely, we use for the vector boson, Higgs and bottom quark pole masses 13 , while the electron, the muon, as well the first two quark generations are treated massless. Besides that, we have the W /Z and Higgs boson widths entering the particle propagators in the complex-mass scheme. For the CKM matrix we assume the unit matrix, which for the most relevant element of our computation V tb is consistent with the measured value (1.021 ± 0.032 [102]). Furthermore, we set the initial value for the strong (running) coupling constant to [103] α s µ = m Z = 0.118 , (3.14) and fix the electromagnetic coupling to its value in the G µ -scheme: Using the precisely measured value for the Fermi constant, G µ = 1.166 378 7 × 10 −5 GeV −2 , absorbs the dominant electroweak corrections to the top decay. As in this work we treat electroweak effects at the LO level, it is advisable to use a scheme where the electroweak corrections to the top decay are small. This is the case for the G µ -scheme [104]. Of course, this choice is not fully capturing the production dynamics, where a √ s-dependent electromagnetic coupling α em µ = √ s , which absorbs oblique vacuum polarization corrections, is more appropriate [25,105]. Note, however, that the relative difference of α em,G = 1/132.233 to α em µ = 2M 1S t = 1/125.924 (which is roughly 5 %) is smaller than to the Thomson limit 13 As we work at LO in the electroweak couplings, we directly take the numerical values of the Breit-Wigner masses given in Ref. [102] as (real) pole masses and transfer them to the complex-mass scheme. The numerical differences are negligible for our purposes.
As noted in Section 3.2, we will use the complex-mass scheme for the QCD-LO and QCD-NLO SM (and top decay) matrix elements. That is, we introduce complex-valued renormalized masses which implies a complex-valued weak mixing angle We use Eq. (3.17) for explicit values of the weak mixing angle e.g. in charged and neutral current couplings. We will, however, refrain from using the complex-valued mixing angle to define α em and rather use the (real-valued) definition of Eq. (3.15). This allows us to use one consistent definition in all parts of our calculation: It is consistent with all calculations in this work being at leading order in the electroweak couplings, and retains the gauge-invariance properties of the complex-mass scheme.
With the above input we can translate the 1S top mass to the respective pole mass at a given precision according to Section 2.3 and Section 2.4. The relation between the two masses, Eq. (2.34), depends (also via α S ) on the renormalization parameters h and ν = f ν * with ν * as defined in Eq. (2.39), which in turn depends on the c.m. energy √ s. So, using our default scale choice h = f = 1 we obtain e.g. at threshold (thr) for √ s = 2M 1S t = 344 GeV: The (N)LL soft coupling α S , see Eq. (2.42), which enters these numbers, is evaluated as described in Section 2.4. Of course, the same √ s-dependent (N)LL top pole mass is used for the (N)LL form factors, the corresponding factorized on-shell production/decay (N)LO matrix elements, as well as the QCD-(N)LO fixed-order cross sections.
Also the QCD-(N)LO top decay width depends on the (N)LL top pole mass. For fixed renormalization parameters h and f it therefore also varies slightly with √ s and is (like the pole mass) automatically recomputed by Whizard whenever √ s changes. At NLO the top width also depends on the QCD coupling renormalization scale we use for the fixed-order QCD-NLO cross section computation, see Section 5. Similar to the hard NRQCD coupling α H in Section 2.4, we determine α F by three-loop RG running from µ = m Z to µ = µ F with n f = 5 active flavors. At threshold ( √ s = 2M 1S t = 344 GeV) we then end up with By evaluating fixed-order QCD matrix elements (in particular also for the top decay in the factorized computations) and the top decay width at the same perturbative order, we guarantee that t → W + b(+gluon) branching ratios obtained from the differential final state computations remain consistently equal to unity at QCD-(N)LO upon full integration over the decay phase space, as recently demonstrated in Ref. [33].

Validation of factorized approach
In this section we are going to validate the factorized setup for the threshold calculation. This includes checks concerning the high-energy behavior (in connection with potential inconsistencies related to gauge-dependent terms), the behavior of the factorized process around the threshold, the consistency of the complex phase in the different pieces of the interference terms, as well as the properties of the factorized amplitudes under charge conjugation and parity transformations. If not stated otherwise, all results shown in this subsection are obtained with a tree-level form factor of unity, LO top decays, and setting m t = M 1S t .

High-energy behavior
Comparison of tt descriptions, high √ s behavior The high-energy behavior of total cross sections can serve as a test for gauge-invariance. As known for example from W W production [106], the total cross section is expected to fall off with √ s in the high-energy limit due to perturbative unitarity. This may not be the case, if it contains unphysical gauge-dependent terms. In Figure 3, we compare different -25 -factorization approaches implemented in Whizard. The full W + bW −b LO cross section (red) serves as reference in this case, as it is gauge-invariant and valid below threshold.
One can see that while it agrees with the cross section for on-shell tt production (blue) around threshold up to ∼ 10 %, the tt curve falls off faster with energy and constitutes only 50 % of the full W + bW −b cross section at 2 TeV, in full agreement with the corresponding results shown in Ref. [26]. At high energies, the tt cross section is much smaller than the total cross section for the final state W + bW −b because the latter gets sizable contributions from single-top and non-tt-resonant processes which are missing in the tt cross section.
We furthermore see that using the gauge-dependent tt signal diagram only (green) instead of the sum over all diagrams for W + bW −b production leads to an unphysical rise of the cross section at high energies. The same holds for the cross section based on the factorized matrix element, Eq. (3.2), evaluated with off-shell momenta (orange), i.e. without the onshell projection of Section 3.3.2. The difference between the latter two descriptions, which both show unphysical behavior and should not be used due to unitarity violation, is related to the fact that Eq. (3.3) only holds on-shell. Finally, we have two descriptions that closely follow the tt curve for high energies: the factorized computation with on-shell momentum projections in the decay and off-shell momenta in the production matrix element (dashed cyan) and the one with on-shell momentum projections for both production and decays (dashed purple), which is our default approach. The similarity of the latter cross sections indicates that the unphysical high-energy behavior arises from using off-shell momenta for the decay matrix elements. This reconfirms the approach used in Ref. [49] to construct their global correction factor to eliminate the gauge-dependence of the signal diagrams.
While it is generally known that evaluating only signal diagrams leads to gaugedependent results, our numerical analysis shows the impact of the associated numerical effects. Given the large (unphysical) differences between the SD curve (green) and the full cross section (red) observed in Figure 3, we conclude that using signal diagrams, particular at TeV energies, leads to unreliable results and should be avoided in general.

Threshold region
The considerations in the last paragraph were instructive to understand the numerical impact of gauge dependence for the high-energy behavior. However, in this work we actually only rely on the factorized computation within the threshold region for the construction of our matched cross sections, see Section 5. In Figure 4 we therefore examine the different prescriptions once again zooming in a 100 GeV window around threshold. Obviously, the on-shell tt cross section (blue) is only finite above 2m t and therefore not discussed further in the following.
We see that 10 GeV above threshold and higher all other curves are quite close to each other and moreover agree with the full W + bW −b cross section (red) to within 4 % or better. This indicates that the relative effects from gauge-dependent terms are very small in this region and that the difference to the full W + bW −b cross section predominantly arises from non-signal diagrams, i.e. diagrams without the (virtual) tt pair. Below threshold, on the other hand, where all cross sections decrease strongly and a double on-shell configuration is kinematically not allowed anymore, the other approaches deviate significantly from the full -26 - Comparison of tt descriptions, around threshold W + bW −b cross section. Whereas the cross sections based on the signal diagram (green) and the off-shell evaluated factorized matrix elements (orange) fall off quickly far below 60 % of the full W + bW −b cross section, the factorized descriptions with on-shell projections for the decay (dashed cyan and purple) are also smaller than the full W + bW −b cross section but remain relatively close. We observe that the non-signal W + bW −b production diagrams in general yield positive contributions. The behavior of the results also indicates that the unphysical gauge-dependent off-shell effects are quite dramatic below threshold and should not be used for predictions. In this context, employing off-shell evaluation for the decay matrix elements represents a much bigger (unphysical) effect than for the production matrix element. In any case, the cross section computed in the proper (extended) DPA (dashed purple), which consistently applies on-shell projection for production and decay matrix elements, is closest to the full W + bW −b result and thus provides the best factorized approximation. From now on we will stick to this method of factorization and simply refer to it as the factorized approach.

Helicity correlations
In Figure 5, we show the effects of using different approximations concerning the top-antitop spin density matrix for the total cross section in the threshold region calculated from -27 - our factorized approach, as discussed in Section 3.3.1. The total cross section is displayed using the full spin correlations according to Eq. (3.4) (red), using only the diagonal spindensity entries in the helicity basis for the top and anti-top wave functions, called helicity approximation (HA), according to Eq. (3.5) (dashed blue), and using spin summation and averaging for the production and decay, respectively, called extra-helicity approximation (EHA) according to Eq. (3.6). Since all three approaches lead to equivalent results for spin-independent observables we expect that the results for the total cross section agree. We see that this is true at the level of the numerical precision of our analysis for the full spin correlation (red) and the HA results (dashed blue). On the other hand, for the EHA (dash dotted green) we find a notable relative difference to the other two approaches at the level of up to a few permille below threshold. The discrepancy is a numerical artefact of the combination of spin-averaging and on-shell projection within the current implementation. This is, however, irrelevant phenomenologically since below threshold the contributions from non-resonant W + bW −b production are much larger than this discrepancy, cf. Figure 4.
In any case, as our default, we use full spin correlations in all parts of our factorized calculations except for when QCD-NLO corrections to the top quark decays enter, where we use the HA for the reason explained in Section 3.3.1.
-28 -  The corresponding ratio is ill-defined at the two zero crossings, but is still shown to visualize quantitative differences, especially below threshold.
In Section 3.5.2 we have verified that the factorized (total) cross section in the DPA represents a good approximation for our purposes in the threshold region. The matched cross sections we construct in Section 5 will in addition include interference (IF) terms between factorized and full QCD-LO matrix elements. It is therefore necessary to also examine the relative complex phases of the involved matrix elements. In this subsection we compare inclusive (i.e. fully phase-space integrated) IF contributions among QCD-LO matrix elements obtained in the factorized DPA and signal diagram (SD) approach as well as from the full Born-level SM calculation. For example, the IF term between full and factorized matrix element reads 2 Re[M fact · M * full ] and is integrated over the total W + bW −b PS. By also considering M fact multiplied in addition with the corresponding nonrelativistic form factor, which contains a non-trivial QCD phase, we can fully examine the relative complex phase of this inclusive IF contribution. Moreover, we illustrate the effects of Lorentz boosting the momenta in the individual parts of the factorized amplitude. The factorized matrix elements studied here include full spin correlations between top production and decay, cf. Eq. (3.4).
In Figure 6 the inclusive IF terms are shown as a functions of √ s. In the left panel only QCD-LO matrix elements are evaluated, i.e. without accounting for the effects of the -29 -nonrelativistic form factors. The solid red line represents the full matrix element squared (times a factor of two) and serves as a reference curve. The signal-full IF term is represented by the solid blue line. This term is around 8% below the red curve above threshold and substantially smaller below threshold where the IF terms becomes small. The difference comes from the non-signal diagrams contained in the full amplitude. The factorized-full (dashed green) and factorized-signal (dashed orange) IF terms are practically equal and very close to the signal-full IF curve. This is perfectly consistent with our findings for the factorized and signal cross sections in Section 3.5.2, cf. Figure 4, and shows that (the implementation of) our factorized approach based on the DPA does not lead to additional large relative complex phases. The factorized total cross section can be cast into the EHA form in Eq. (3.6) without loss of generality, and thus factorizes into separately Lorentz-invariant and spin-independent parts associated with top production and decay. For the factorized total cross section we can therefore independently boost the momenta in the different parts even without transforming the spin-vectors (spinors) accordingly. This is, however, not the case on the amplitude level and in particular when spin correlations between the production and decay matrix elements are taken into account. This can be illustrated considering the resulting IF terms with the full (and unmodified) matrix element. For example, the effect of boosting all momenta and wave functions in the decay matrix elements to the top rest frame, while all momenta and wave functions in the production matrix elements remain in the lab frame, is demonstrated by the curves represented by the + symbols and labeled "factorized rest frame" in Figure 6. We find that the resulting inclusive IF contributions with the full (cyan crosses) and signal (purple crosses) matrix elements differ completely from the corresponding IF terms using the correctly factorized ("factorized boosted") matrix elements with all momenta and spins consistently in the same Lorentz frame. This is due to an additional unphysical and inconsistent relative complex phase that arises in the "factorized rest frame" amplitude. We stress that here we have only performed the inconsistent boost on the decay matrix element in order to illustrate the dramatic effect of an (admittedly extreme) incorrect implementation of the factorized matrix elements. As was also demonstrated in Ref. [85], maintaining the correct phase between production and decay matrix elements is essential.
In our matching procedure explained in detail in Section 5 we eventually multiply the factorized matrix elements with the corresponding nonrelativistic S-and P-wave form factors. Because the nonrelativistic form factors are sizable and strongly varying concerning modulus and complex phase within the threshold region this substantially modifies the behavior of the factorized amplitude close to threshold. In the right panel of Figure 6 we show the effect of multiplying the QCD-LO correctly factorized and signal matrix elements with the tt vector current with F LL V − 1. The '−1' subtraction is introduced in Section 5 to avoid double counting, and we therefore account for it here as well. We find that the correctly factorized-full (dashed red curve) and signal-full (solid blue curve) inclusive IF terms are very similar, because the respective QCD-LO amplitudes are very similar in the threshold region as already discussed above. In comparison, using the incorrect "factorized rest frame" amplitudes instead of the correctly factorized ones leads to completely incon-sistent results (green crosses). Note that P-wave contributions to the total cross section are suppressed by v 2 and the corresponding form factor is not taken into account here. However, the conclusions just drawn apply as well to the resummed and factorized matrix elements involving the axial-vector current.

P and CP behavior
The HA of the factorized process in the DPA shown in Eq. (3.5) allows to discuss the cross section for resonant tt production with different top/anti-top helicities. This furthermore serves to validate our factorized approximation with respect to P and CP transformations. For simplification, we disable at first the contribution of the s-channel Z exchange and only consider photon-initiated top-pair production. The interference between Z and photon exchange is well-studied and causes the forward-backward asymmetry in top-pair production at lepton colliders [107][108][109]. Disabling the Z exchange allows us to concentrate on (the validation of) the symmetry properties of the cross section under P transformations.
In the right panel, the absolute difference σ(−, +) − σ(+, −) is shown for W + bW −b production. Note that the absolute difference is slightly asymmetric around threshold. Above threshold it grows due to the constant relative error visible in the ratio plot and the growing cross section.
In the left panel of Figure 7, we show the distribution of the inclusive LO cross section as a function of √ s for the different top helicity configurations as well as the sum over all helicities obtained from W + bW −b production in the DPA and from on-shell tt production.
-31 -Above 360 GeV the W + bW −b and tt cross sections are fairly similar. Especially the respective ratios of mixed top helicities σ(+, −)/σ(−, +) and equal top helicities σ(+, +)/σ(−, −) are both very close to unity. The tt ratios are equal to one within the numerical (MC) integration uncertainties for all energies. As the tt cross section is given by the absolute square of the production matrix element, we thus find that |M prod | 2 (−, −) = |M prod | 2 (+, +) as well as |M prod | 2 (+, −) = |M prod | 2 (−, +). This is, of course, to be expected as the electromagnetic production of fermion pairs conserves parity (P). When accounting for the top decay, however, only the combination of charge conjugation and parity (CP) is a symmetry, due to the left-handed coupling. This implies for the DPA in the HA that So just from the CP properties, we cannot infer the correct behavior of the ratio of mixed helicities for W + bW −b production in the DPA. It is thus interesting to see that for high energies the ratio still approaches unity and P becomes approximately a good symmetry. In the right panel of Figure 7, we also show the absolute difference between the contributions with mixed helicities to W + bW −b production in the DPA. It is remarkably symmetric around threshold. In principle, one could work out the exact analytic result for this (Pviolating) difference, but this is beyond the scope of our validation. Finally, in Figure 8, we also show the effect of including the production channel via Z exchange. As expected from Eq. (3.20), equal top/anti-top helicities in tt and W + bW −b production still give equal results, as also shown in the upper ratio plot. However, now there is a larger P-violating difference between the two different mixed helicities for all energies that grows with energy. This difference is now also present for tt production. We observe that the Z exchange enhances mostly the (+, −) configuration, while the effect on the other helicity configurations is comparatively small. This is also visible in the lower ratio plot where the (+, −) configuration is compared to the (−, +) one. As for the case of pure photon exchange, the contributions from mixed helicities are larger than from equal helicities. This can be understood from the fact that for massless quark production, mixed helicities are the only contributing configurations due to the spin-1 intermediate gaugebosons. Hence, equal helicity contributions only arise due to the spin flip associated with the top quark mass and are therefore less likely to occur at high energies. Concerning the ratios, we see that for equal helicities they remain at unity. For mixed helicities the ratios for W + bW −b and tt production approach each other for c.m. energies above 360 GeV as we already observed for pure photon exchange.
-32 -  Figure 7, but now with the effect of Z exchange in the production process taken into account.

NLO QCD corrections with WHIZARD
Whizard can compute fixed-order QCD-NLO corrections to various processes in an automated manner. In a recent publication [33], some of us studied the QCD-NLO corrections to e + e − → tt in the relativistic continuum with fully off-shell top and gauge boson decays. This study also included a thorough validation against other MC generators that are able to compute this process. Thus, for fixed-order QCD corrections, we can build upon a well-tested framework.
Whizard at QCD-NLO uses the FKS subtraction scheme [80,81], both in the standard approach and the resonance-aware extension [82]. The FKS scheme uses a partition of the real phase-space into regions, where only one divergent configuration exists. In each of these regions, the divergence is regulated using plus-distributions. In the FKS scheme, we can make use of the optimized multi-channel phase-space generator for the underlying Born kinematics, as the full real phase-space factorizes into Born times radiation phase-space. Starting from the Born phase-space configuration, real kinematics are generated according to the specific kinematics of each singular region.

Whizard has been interfaced to the One-Loop providers (OLPs) GoSam [77] and
OpenLoops [75] with the BLHA interface [110], and to Recola [78,79] with a dedicated interface. In addition to virtual matrix elements, we can use these interfaces to obtain color-and spin-correlated, as well as tree-level matrix elements (as alternative to O'Mega). Events can be generated at fixed-order or using the Powheg matching scheme [111]. More details can be found in Ref. [112].

Setup of fixed-order corrections in WHIZARD
As motivated in Section 3.3, we will include the form factor in the tt production matrix element using the DPA. For the decay matrix elements we compute the full set of (relativistic) QCD-NLO corrections to the on-shell top quark decay using the momentum projections dicussed in Section 3.3.2. We can therefore achieve QCD-NLO precision for observables that probe the decay kinematics. Our approach entails that we use the NLO width in the matrix elements and form factors of the matched computation when we account for the QCD-NLO corrections in the decay. In addition, we also employ the full W + bW −b fixedorder results at QCD-LO and QCD-NLO. We will now introduce a convenient graphical notation for the various components of our matched calculation that will be useful for the discussion of the matching procedure in Section 5.
For the full W + bW −b process at QCD-LO, which includes all non-signal and background processes and their interference, we use the notation which represents the modulus square of the sum of all W + bW −b tree-level diagrams and where the phase space integrations are implied. Concerning QCD-NLO contributions, αs stands for the sum of all virtual gluon exchange one-loop diagrams for the W + bW −b final state, and a   b ≡ 2 Re a · b * . In this notation the QCD-NLO fixed-order cross section is represented as where in σ LO a consistent use of the QCD-NLO corrected parameters such as the top quark mass and width is implied. The last term stands for the real radiation corrections, and in this context all infrared cancellations between real and virtual corrections are guaranteed by the KLN theorem.
In the factorized DPA approach, we use for the QCD-LO cross section the notation applying our notations for the QCD-NLO fixed-order cross section described above also for the top and anti-top decays. The diagrams in Eq. (3.25) can individually carry IR divergences which, however, cancel in the sum. Note that in the factorized approach we omit the real final-state gluon interference diagrams of the kind which correlate top and anti-top decays, as well as virtual gluon corrections of the kind including diagrams involving e.g. gluon exchange between the final state b quark and the anti-top. As we already explained in Section 1, in the threshold region, which is where we employ the factorized approach, these kinds of corrections cancel at NLL order in the total cross section [42,43], or for acceptance cuts that do not resolve the tt double resonant portion of the W + bW −b phase space [19,20]. Such corrections, however, in general contribute to differential distributions at NLL order in the threshold region [44] and have, eventually, to be included in the context of a more sophisticated approach. Thus the approach we describe in this work provides differential predictions at LL order in the threshold region. We emphasize, however, that via the matching procedure we still include the full set of fixed-order relativistic QCD-NLO corrections associated with the corresponding diagrams with offshell top quarks in an exact way. At high energies we thus have fully differential predictions at QCD-NLO precision.

Modifications to standard FKS for factorized QCD-NLO
In the following, we review the three main modifications to the FKS subtraction needed to cope with our factorized QCD-NLO computations. We focus on top production but the -35 -statements also hold for analogous processes where the DPA to Born, Real, and Virtual contributions to unstable particle decays shall be computed at QCD-NLO. We note that if one determines QCD-NLO corrections to production and decay simultaneously, one encounters ambiguities in the real radiation computation. This is due to the difference in the invariant mass of the resonance in case the radiation, carrying momentum, occurs in the production or the decay stages. Thus, if one is only interested in fixed-order corrections, hybrid schemes have been devised, where the DPA is only applied to the virtual part [113].
In the threshold region, however, the dominant corrections to the production process are already encoded in the form factors. They contain the resummed Coulomb singular terms, as already discussed in Eq. (3.1), and we do not consider real corrections to the production as (ultra)soft radiation off the nonrelativistic top quarks is of higher order.
On-shell generation of the real phase-space Like the tree-level matrix element, the (decay) matrix elements with real gluon emission have to be evaluated using on-shell projected momenta. To generate this phase-space, we use the same mappings as in resonanceaware FKS. In this approach, the real emission is generated in such a way that the invariant mass of the respective resonance is kept at its Born value, which removes mismatches between the real matrix-element and its soft approximation. Thus, starting from an already on-shell projected Born momentum configuration, obtained as described in Section 3.3.2, we apply this mapping to obtain an also on-shell projected real phase-space point. Note that, to ensure correct subtraction of soft divergences, also the real-emission FKS variables ξ and y need to be computed in the on-shell projected Born system. We stress that the on-shell momenta only enter the matrix elements and their subtraction terms but not the phase-space Jacobian. For the latter as well as for event generation, the physical (and in general off-shell concerning the top quarks) phase-space is used, which is generated alongside the on-shell case.
Decay subtraction The IR divergences in the factorized real corrections all originate from the t → bW g matrix element. It consists of two Feynman diagrams. One in which the gluon is emitted from the top quark and another one in which it is emitted from the bottom quark. Divergences can only occur in emissions from particles with on-shell momenta and zero width. Therefore, in the full W + bW −b matrix element, emissions from internal top quarks do not yield divergences, as they are regularized by the width and the virtuality. However, in the factorized approach, the gluon emission from the top quark is a singular contribution as there is no top width insertion in the virtual top line that emerges after gluon radiation. It therefore needs to be treated by the FKS subtraction. We call this additional singular region a pseudo-ISR region because its underlying kinematics in the decaying top quark decay is similar to the case of QCD initial state radiation. This way, each singular pair index (b, g) and (b, g) is associated with a pseudo-ISR tuple (b, g) * and (b, g) * , in which the gluon radiation is emitted not from the bottom, but from the top quark. This implies that in the corresponding singular region, the FKS phase-space contribution Omission of interference terms In the real matrix element, we omit interference terms between gluon emissions from different top quark legs, cf. Eq. (3.26). In consequence, we remove these interference contributions from the color-correlated Born matrix element. The same reasoning applies to the corresponding virtual corrections and their subtractions. This means that the loop matrix elements we consider do not include diagrams with virtual gluon exchange between quarks associated with different top legs, cf. Eq. (3.27). The dominant contributions from these gluons are already included in the (Coulomb) resummed form factors. Therefore, also in the soft part of the virtual subtraction terms, we leave out all terms corresponding to gluon exchange between different top quark legs. The absence of these interference terms allows to split up the FKS regions into two disjoint subsets of singular pairs, as depicted in Table 1.

Validation for the inclusive cross section
In order to validate the Whizard implementation of the combination of our factorized matrix element approach and the nonrelativistic form factors, we compare in this section the numerical MC results for the inclusive cross section with available analytic calculations in the threshold region obtained as described in Section 2.2. To avoid contributions from unphysical phase space regions contained in the nonrelativistic analytic results (which come from the expansion in inverse powers of the top quark mass), we apply a cut ∆ mt on the invariant mass of the reconstructed top momenta of the form The difference in the implementation of the cut is one source of disagreement between the MC and the analytic results. In the threshold region (and for reasonably small cuts), this difference is, however, of higher order. For the purpose of validation, in the following we only discuss comparisons involving the dominant vector-current induced cross section and the correponding S-wave form factor. The axial-vector current together with its (Pwave) form factor only contributes beyond NLL to the inclusive cross sections we consider here.
We also want to reiterate that the results discussed in this section only involve the tt-double-resonant contributions contained in the factorized matrix element approach on the side of the MC calculation, which (near threshold) corresponds to the double-resonant nonrelativistic calculation in Eq. (2.29). We thus omit the contributions related to non-ttresonant W + bW −b production discussed in Refs. [19,20,114], but note that they are included in our final matched predictions through the full QCD-LO and QCD-NLO W + bW −b cross section calculations, as described in Section 5. Furthermore, in this validation section we use the EHA in Whizard and consistently treat the top decay at LO in all analytic and MC results.  Figure 9. Comparison of analytic results (blue) with the implementation in Whizard with the factorized (red) and the signal-diagram approach (green) for √ s = 350 GeV using a LL or NLL form factor. For better orientation, we indicate here and in Figure 10 the ±5 % range in the lower ratio plots with horizontal gray lines.

Reconstructed top invariant mass scans
In Figure 9, we show the cross section as a function of ∆ mt using the form factor at LL and NLL order for √ s = 350 GeV, which is about 6 GeV above the toponium peak position. As expected, the ratios of the factorized Whizard (red) and the analytic results (blue) are nearly independent of the used form factor, because in both calculations the top -38 - decay factorizes by construction. The differences for small ∆ mt originate from a different implementation of the cuts as discussed below. At √ s = 350 GeV, both approaches yield nearly the same results for all values of ∆ mt above 10 GeV. The maximal deviation is about one percent. For comparison, we have also shown the corresponding results based on the signal diagram (green), which we already discussed to be inconsistent. We see that it yields results that are too small by up to 5% for all values of ∆ mt . For ∆ mt below 10 GeV the analytic results fall off below the Whizard results, and the relative difference reaches ∼ 10% for cuts around 1 to 2 GeV. This disagreement is due to the approximate implementation of the invariant mass cuts, Eq. (4.1), in the analytic result. The latter uses an expansion in inverse powers of ∆ mt , see Eq. (2.33), and is therefore unreliable for tight invariant mass cuts. The relatively good agreement for very high cuts of 80 GeV and beyond shows that for energies above the threshold the numerical effects of the unphysical phase space regions contained in the nonrelativistic calculation are relatively small. In Figure 10 the analogous results are shown for √ s = 330 GeV, which is about 6 GeV below the toponium peak position. In this kinematic regime the inclusive cross section is already very small and the concept of tt-double-resonant W + bW −b production loses its meaning because it is not possible to have top and anti-top quarks on-shell at the same time. Here, off-shell effects and non-resonant processes are important. Indeed, also the MC integration of Whizard does not find double-resonant configurations within a 5 GeV window at this energy. The analytic calculation, which is based on the concept of determining tt-double-resonant phase space configurations by expansions in v and inverse powers of the invariant mass cut, cf. e.g. Eq. (4.1), is therefore not expected to provide a very precise description. This is confirmed in our comparison shown in Figure 10, where we see that the analytic calculation only works well in a narrow region of ∆ mt around 40 GeV, where the precise location of this region should be considered accidental. For lower cuts the analytic computation becomes unstable and shows an unphysical (though in absolute -39 -numbers tiny) rise for values below ∼ 15 GeV. At high values of ∆ mt the analytic prediction is substantially larger than the factorized calculation indicating that the relative size of the contributions from unphysical portions of the phase space in the nonrelativistic analytic calculation is particularly large for energies below threshold. The Whizard computation, on the other hand, correctly stabilizes for ∆ mt of 80 GeV and larger as the physically correct (relativistic) phase-space does not allow for larger invariant masses at this energy. We do not want to leave unmentioned that the signal-diagram calculation is, as expected from Section 3.5.2, completely unreliable at this energy. We also remark again that for all cases the ratios are independent of which approximation is used for the form factor.

Center-of-mass energy scans
In Figure 11, we show the inclusive cross section over √ s for fixed values of the invariant mass cut ∆ mt . Continuing on the considerations of the last section, we employ a moderate (∆ mt = 30 GeV) and a loose cut (∆ mt = 100 GeV). Cross sections for a tight cut (∆ mt = 15 GeV) are shown in Appendix B. As explained above, our nonrelativistic analytic computation is only applicable for moderate cuts. To also check the implemented scale variations with the constraints discussed in Section 2.4, we have produced four curves for each cross section corresponding to the corners of the h-f region defined by Eq. (2.41): The scale variation bands shown in Figure 11 correspond to the envelope of the four associated curves. We have checked that (for the inclusive cross sections based on the analytic calculation) this procedure usually gives a very good approximation to the scale variation bands one would obtain by scanning over the complete selected h-f region (as displayed in Fig. 2 of Ref. [11]). In addition to the results based on the LL and NLL form factors, we also show the corresponding inclusive cross sections using the O(α s ) expanded NLL (S-wave) form factor given in Eq. (2.23) and evaluated with α s = α H . Yet again, the ratios shown in the respective lower panels depend only mildly on the approximation used for the form factor for energies above the peak position. However, they have a rather strong dependence on ∆ mt below the peak region due to relativistic off-shell contributions, e.g. in the top and anti-top Breit-Wigner propagators, which are contained in the factorized Whizard calculation but missing in the analytic one, where the nonrelativistic approximation is employed. These off-shell effects have a much larger relative impact in the region below the peak position, where the cross section becomes small.
For ∆ mt = 30 GeV, we observe perfect agreement between the analytic computation and Whizard with the factorized approach within a window around threshold of at least 10 GeV. For ∆ mt = 100 GeV, this range is reduced significantly due to the unphysical relativistic off-shell contributions mentioned in the previous paragraph that arise for∆ mt > 30 GeV in the analytic calculation. Notably, the behavior above threshold is not strongly -40 -   Figure 11. Comparison of analytic results with the implementation in Whizard with the factorized and the signal-diagram approach for ∆ mt = 30 GeV (left panels) and ∆ mt = 100 GeV (right panels) using an expanded, LL or NLL form factor in the upper, middle, and lower row, respectively. The bands correspond to the envelope of the scale variations mentioned in the text.
-41 -affected but the ratio of analytic over factorized Whizard results falls off for c.m. energies above about 360 GeV with approximately the same slope for both invariant mass cuts. This is due to (not systematically controlled) higher-order relativistic effects (e.g. associated with the relativistic production current or the top quark propagators) in the difference of the factorized computation and the nonrelativistic analytic result. However, we emphasize that this is not problematic as in our fully matched calculations, as explained in Section 5, the factorized results do not contribute at these energies due to the switch-off procedure. In addition to the shown validation plots, we have also cross checked at individual phase-space points that the implementations of the expanded, LL and NLL form factors are consistent within the numerical precision. Overall, we have tested that our nonrelativistic form factors are correctly and consistently embedded in Whizard. The differences to the purely nonrelativistic analytic calculation according to Section 2.2 are understood and we can rely on the implementation in Whizard with the factorized approach. This yields reliable results for all ∆ mt values and in fact allows for fully differential predictions including threshold resummation.

Matching
In this section, we discuss our approach to combine (match) the nonrelativistic cross section based on factorized matrix elements with (N)LL threshold-resummed form factors and (N)LO top/anti-top quark decays into W + bW −b (called σ full NRQCD ) with the full fixed-order QCD-(N)LO cross section for W + bW −b production including all irreducible background processes and interferences (called σ FO ). Within the approximations explained already in previous sections, we maintain all relevant interference terms between full and nonrelativistic factorized matrix elements in σ full NRQCD and we keep terms beyond the corresponding order counting wherever suitable from a practical point of view. The essential point is that our matching procedure avoids any double counting of terms simultaneously contained in the two components at their respective order.
Before discussing the details of the matching procedure, let us first remind the reader that the resummed form factors are computed based on the assumption that v ∼ α s 1. In a matched computation, which shall provide a smooth description from the threshold region up to high energies, this counting becomes more and more inappropriate with increasing c.m. energy √ s until the point where it is no longer meaningful and provides wrong results. This means that the threshold resummations by themselves do not contain any natural mechanism to smoothly transition to the relativistic counting. To construct a matching approach that provides smooth predictions it is therefore mandatory to introduce a switch-off function f s . We note that the matching procedure devised in Ref. [49] did not involve a switch-off function because the threshold-resummed form factors were combined with the full QCD-LO matrix element for W + bW −b production only. They thus argued that their approach is strictly correct at leading order, and that, formally, the QCD corrections resummed in the nonrelativistic form factors constitute terms beyond this level of approximation.
-42 -In our matching procedure the switch-off function is unity in the threshold region, vanishes in the relativistic region where the nonrelativistic calculations cannot be trusted and is monotonically falling everywhere. The implementation that we are using is specified in Section 5.2. As the detailed shape of the switch-off function is not unique, the matching procedure entails an additional source of theoretical uncertainties, which we examine. We implement our switch-off function by multiplying it to the strong couplings that enter the form factors. In this way higher orders in α s resummed in the form factors are naturally stronger affected by the switch-off than lowers orders.
In order to avoid double counting of terms contained in σ FO and σ full NRQCD we also have to define an expanded nonrelativistic cross section, called σ expanded NRQCD , that is constructed like σ full NRQCD , but contains the form factors expanded in α s to the appropriate order: The explicit expressions for the expanded NLL form factors are given in Eqs. (2.23) and (2.24). At LL the required expanded form factors are trivially F exp V,LL = F exp A,LL = 1. With these basic ingredients our matching procedure can be formulated schematically by the master formula with α s evaluated at the hard (µ H ), firm (µ F ) (the name "firm" signifying a scale in the geometric mean between the soft and hard scale), soft (µ S ) and ultra-soft scales (µ US ) By subtracting the expanded cross section defined with the firm coupling α F , which is also used for σ FO , we remove contributions that are simultaneously contained in the relativistic and nonrelativistic cross sections. The firm scale µ F is defined as the geometric mean between the soft and the hard scales. In the threshold region it provides a hybrid scale optimized for the hard and low-energy nonrelativistic higher order corrections (in the v expansion) contained in σ FO and not accounted for in σ full NRQCD . Away from the threshold region µ F is very close to the hard scale, which is an appropriate choice for the fixed-order expansion. From the fixed-order point of view, α F is therefore a reasonable choice with a safe IR behavior. 14 For comparison, we discuss the numerical impact of choosing α F over α H in Section 6.1. Returning to the matching, to avoid double counting we define the expanded cross section at the firm scale, and we maintain the canonical scales in σ full NRQCD at threshold according to Section 2. The switch-off function f s guarantees that we obtain only σ FO in the high-energy continuum. The exact contributions in σ FO , σ full NRQCD and σ expanded NRQCD depend on the order of interest and are discussed in detail in the next subsection.
We emphasize again that the master formula shown in Eq. (5.1) is schematic in order to illustrate the underlying principles of our matching procedure. In practice, we implement the matching at the amplitude level as described in more detail below, which leads to a matching formula that is substantially more involved than Eq. (5.1).

Contributions in the matched cross section
Our matching procedure is set up such that the cancellation of IR divergences between real and virtual QCD-NLO corrections is maintained and that the dominant interference terms between nonrelativistic resummed and non-resonant or background contributions are accounted for. We discuss all contributions in a diagrammatic way using the notation introduced in Section 3.6.1, omitting phase-space factors and the exact choice of parameters (which follow the guidelines described before). We emphasize that all shown components represent gauge-invariant contributions at the amplitude level, cf. Section 3.2.
We start by discussing the matching of fixed-order QCD-LO and LL threshold-resummed cross sections. As already mentioned in Section 3.3, the LL threshold-resummed cross section is obtained in the DPA according to Eq. (3.2), where the production matrix elements are multiplied with the corresponding nonrelativistic LL form factors F given in Eqs. (2.21) and (2.22). To this end, we defineF ≡ F − 1, i.e. all terms contained inF are O(α s ) and thus of higher order from the relativistic fixed-order point of view. In our notation we can rewrite the LL threshold cross section as where the "1" term corresponds to tree-level tt production without treshold resummation. It is now straightforward to include the full QCD-LO cross section by replacing the "1" terms by the full QCD-LO amplitude. Since the QCD-LO contributions do not contain α s corrections, no double-counting occurs at this point. The expanded contribution that we have to handle according to the scheme of Eq. (5.1), is represented by the sum of the first (square) and the second (interference) term (with factor 1) in Eq. (5.3). The fully matched QCD-LO+LL cross section then reads where the scale settings and the switch-off function f s are implemented as described in Eq. (5.1) and we employ the top width Γ t at QCD-LO. We note that the matched result takes into account the interference corrections involving the LL threshold resummed and the full QCD-LO amplitudes. The latter includes all tt-double-resonant, single-top and background diagrams. These interference corrections constitute an important part of the NLL electroweak corrections at threshold and are important to achieve top mass measurements with uncertainties of better than 50 MeV. The use of the DPA ensures that the amplitude phases are treated properly, cf. Section 3.5.4. As an intermediate next step let us now consider including in addition the full QCD-NLO cross section, σ NLO , as well as the NLL form factors, while still keeping the top and anti-top decays at QCD-LO in the factorized calculation. As σ NLO contains terms of O(α s ), for σ expanded NRQCD we now have to use the expansion of the form factors to O(α s ), given in Eq. (2.23) and (2.24). According to this, the matched cross section has the form where stillF NLL ≡ F NLL − 1, and we remind the reader that now F NLL − F exp NLL =F NLL due to the O(α s ) terms contained in F exp NLL . Although the expression in Eq. (5.5) appears straightforward in principle, it has a problem concerning making a consistent choice for the top quark width. From the point of view of the factorized computation, we are using a QCD-LO description of the top decays, so a consistent choice for the top width parameter appearing in the matrix element expressions for the F NLL 2 term as well as in the interference term is the QCD-LO width.
On the other hand, the Coulomb singular contribution to σ NLO , that is supposed to be subtracted by the F exp NLL term, requires a QCD-NLO width. Thus, also the subtraction term should be evaluated with the QCD-NLO width. The apparent conflict is an artifact of trying to match two computations that treat the top decay at different orders. This problem can only be resolved by incorporating the QCD-NLO decay also in the factorized parts of the cross section.
To proceed, we first observe that it is quite difficult to add any real and virtual corrections to the interference term in Eq. (5.5). This is because the IR divergences that arise when soft gluon corrections are added to the factorized amplitude and the full W + bW −b matrix element differ from each other. At the NLL precision we are aiming for, however, we do not have to consider these corrections because the interference term itself represents a NLL correction. Therefore, within the approximations adopted in this work, it only remains to add the QCD-NLO top decay corrections to the completely factorized F NLL 2 term in Eq. (5.5). We thus arrive at the following form of our final matching formula, which combines QCD-NLO fixed-order and NLL threshold cross section predictions, cf.

Switch-off function
As noted before, we need a definite way to switch off the resummations encoded in the form factors at center-of-mass energies away from the threshold region where the threshold resummation is not meaningful, and where we only want to use the relativistic fixed-order predictions. For this we define the switch-off function f s , which is a monotonic function of a √ s-dependent velocity parameter v s . It satisfies the basic requirements which means that f s (v s ) is unity in the threshold region and vanishes in the region where the relativistic fixed-order predictions are sufficient. Above threshold f s (v s ) should be monotonically decreasing and below threshold monotonically increasing. To this end we define the complex velocity, cf. Eq.
The dependence on the 1S mass is motivated to ensure that f s is unity in the peak region. Thus, we center the switch-off around the 1S mass and not around the pole mass. f s is a function of a real parameter, for which we may use the imaginary part, real part or modulus of v 1S as the argument. All three options are shown in Figure 12. We see that the real and imaginary parts of v 1S are roughly mutual reflections with respect to 2M 1S t with a slight asymmetry due to the +iΓ * t term in Eq. (5.8). To switch off the resummation for large values of √ s, it would be sufficient to only use the real part of v 1S . However, since our matched calculations cover also the kinematic region far below threshold where the fixed-order predictions are valid, we have to ensure that the threshold resummations are also switched off reliably in that region. It is therefore natural to define v s ≡ v 1S as the velocity parameters for the switch-off function f s . We note that due to the width -46 -  Figure 12. Switch-off function f s according to Eq. (5.9) together with the different choices for its argument: the absolute value, the real, and the imaginary part of the (quasi) velocity v 1S . In addition to the vertical line at 2M 1S t , we display the two matching parameters v 1 and v 2 used for the plot as horizontal lines.
term that enters the definition of the velocity parameter, v 1S adopts its minimal value Γ t /M 1S t ∼ 0.1 at √ s = 2M 1S t , so v s never vanishes. The concrete form of f s is in principle not very important as it has to be varied anyway to estimate the matching uncertainty, and as long as the essential parameters to control the switch-off behavior are implemented. We define these parameters as v 1 and v 2 , which satisfy 0 < v 1 < v 2 < 1, and impose the conditions f s (v s < v 1 ) = 1 and f s (v s > v 2 ) = 0. There are two more practical issues relevant for devising f s . Firstly, f s should be not only continuous but also continuously differentiable like any physical cross section. This property disqualifies simply using a linear interpolation for v 1 ≤ v s ≤ v 2 .
On the other hand, when using polynomials of high order for f s that at first switch off slowly but transition too quickly, one risks to introduce unphysical wiggles in the cross section when v 1 and v 2 are varied. After experimenting with different types of switch-off functions, cf. Appendix C, we decided to use the following polynomial interpolation, which satisfies continuity and continuous differentiability at both v s = v 1 and v s = v 2 : It represents a cubic Hermite interpolation that in the context of interpolations is known to -47 -belong to the family of smoothstep functions. For illustration, we plot f s in Figure 12 as a function of v s = |v 1S | (which is our default) as well as a function of the real and imaginary parts of v 1S . The matching parameters are set to v 1 = 0.1 and v 2 = 0.3, and displayed as horizontal lines. Their intersections with v s define start and end points of the transition region. From Figure 12 we can see that v 1 = 0.1 should be considered as the lower limit for reasonable v 1 values. Going lower would cut into the threshold region around 2M 1S t and thus artificially reduce the threshold peak. On the other hand, it is not obvious how to devise a meaningful upper limit of v 2 . Here, we consider values up to v 2 = 0.4, which leads to a fairly quick switch-off and to a reasonable behavior of the matched calculation. In any case, more concrete conclusions on this matter require the analysis of the matching procedure including corrections beyond NLL and QCD-NLO [45] which is beyond the scope of this paper.
Let us finally remark that it would in principle also be possible to implement the switch-off function by multiplying it with the NRQCD cross sections in Eq. (5.1) instead of the strong couplings. Both approaches differ by how the resummations contained in the form factors are treated. At the orders considered in our studies we found that multiplying the couplings leads to a somewhat smoother transition.

Theoretical uncertainties
For our examinations of the theory uncertainties we will perform variations of the different matching and renormalization scales µ H , µ F , µ S and µ US , cf. Eq. (5.2). We will vary coherently over the scale multipliers h and f , where h parametrizes the relative variation of the hard scale and f the relative variation of the subtraction velocity scale ν as discussed in Section 2.4. Since an entire scan over the h-f area defined in Section 2.4 is in practice not feasible for fully matched predictions, we sample the four corners of the h-f area, defining the set The envelope of the four associated curves is expected to give a good approximation of the result of a full scan, see also Section 4.2. Besides the scale variations, we also vary the matching (velocity) parameters (v 1 , v 2 ) in the interval [0.1, 0.4], while keeping v 2 ≈ v 1 + 0.2. In practice we will study four different (v 1 , v 2 ) matching parameter choices with v 1 ∈ {0.1, 0.15} and v 2 ∈ {0.3, 0.4}. Concerning the reliability of the obtained variation bands, we remind the reader that it has been shown in Ref. [11] that for the total cross section the naive NLL scale variation band does not envelope the NNLL prediction. 15 While for our matched predictions we also include additional electroweak and relativistic corrections, we have to assume that this also applies to our computation. Because the NLL scale variations for the threshold-resummed predictions are highly asymmetric with respect to the (h, f ) = (1, 1) central values in the region around the peak, cf. Figure 11, as a more conservative estimate of the theoretical uncertainties in this work we furthermore symmetrize the scale variations by computing the upper and lower error band envelopes. Hence, for each √ s value we are using the prescription where σ 0 ≡ σ(h = 1, f = 1) is the default cross section. The obtained bands are then by construction symmetric around the default cross section and are always enveloping and broadening the original error band. Note that we carry out this procedure for each (v 1 , v 2 ) matching parameter choice. To obtain our final uncertainty band for the fully matched QCD-NLO+NLL predictions, we take the envelope of all of these bands.
6 Inclusive results

Fixed-order results
In the left panel of Figure 13, we show QCD-NLO predictions for the total cross sections for W + bW −b (blue, orange, green) and on-shell tt (red) production as a function of √ s and the renormalization scale µ R . For the off-shell production, we show the effect of using the pole mass scheme (green, setting m t = M 1S t ) versus the 1S mass scheme (blue, orange, t ] using Eq. (2.34)) and of using different choices for the default scale µ 0 . The error bands shown arise from µ R variations with 0.5 ≤ µ R /µ 0 ≤ 2 using µ 0 = M 1S t (orange, green, red) and µ 0 = µ F = M 1S t √ ν * (blue). As expected, using the 1S mass scheme, which involves computing the pole mass as a function of M 1S t , leads to a visible energy shift of the cross section by around 2 GeV, indicating the size of the ground state toponium binding energy. The size of the scale variations is quite similar for both mass schemes. This is noteworthy, as we have varied h as well as f according to Eq. (5.10)both of which affect ∆M as shown in Eq. (2.34) -to obtain the results in the 1S mass scheme, where we have identified h with µ R /µ 0 . On the other hand, for the pole mass calculation, there is no f dependence and only the h variation can be performed. So we see that the f variation represents only a minor effect and does not lead to a larger scale variation for the full QCD-NLO cross section.
As already discussed in Section 5 the default renormalization scale choice for the QCD-NLO fixed-order cross section within our matched prediction is the firm scale µ 0 = µ F ≡ M 1S t √ ν * (blue), which is more sensitive to the threshold dynamics. The default scale yields slightly larger cross sections and scale variations in the threshold region than the hard renormalization scale µ 0 = M 1S t . To assess the impact of the off-shell effects at QCD-NLO, we also show the cross section for on-shell tt production. We see that on-shell tt production yields results that are 3 % to 4 % below the W + bW −b cross section for energies above ∼ 355 GeV. Concerning the highenergy behavior, where the deviations are dramatically larger because additional (not tt-   related) resonant channels for W + bW −b production open up, we refer to the examinations in Ref. [33]. Below 350 GeV, i.e. in the threshold region, the tt cross section rises rapidly over the W + bW −b cross section and even doubles the latter at the threshold point ( √ s = 2M 1S t , vertical dashed line). The behavior for the displayed energy range is slightly different from the one at QCD-LO discussed in the context of the purely factorized signal diagram in Section 3.5.2 where the cross section for tt was larger than the one for W + bW −b production in a much wider range.
In the right panel of Figure 13, we show the relative scale variations for fixed √ s = 350 GeV in the range 0.125 ≤ µ R /µ 0 ≤ 8. This is much wider than the default variation which is indicated by the vertical dashed lines. In this case, we set f = 1 and just varied µ R via h as defined above. The qualitative scale variation behavior is the same for all shown curves. Overall, the variations are slightly asymmetric with stronger dependence for lower µ R while within our default variation range the scale variation is linear to a good approximation. We emphasize that it is important that the QCD-NLO width is computed with the same renormalization scale that it is used in the QCD-NLO cross section computation to achieve a consistent normalization of total cross section results, see also Ref. [33].
-50 - We observe, that the fully matched cross section has the desired properties: In the threshold region (the ∼ 5 GeV window around √ s = 2M 1S t ) and in the continuum ( √ s 370 GeV) as well as the regions below threshold ( √ s 330 GeV) we recover the thresholdresummed and the relativistic fixed-order result, respectively. In the intermediate/matching regions, where the details of the matching procedure are relevant, we see a quite stable transition behavior. We note that our best prediction in the threshold region is not the pure QCD NLL threshold-resummed cross section (dash-dotted red), but the matched QCD-NLO+NLL version of Eq. (5.6) as it in addition accounts for off-shell-top, single-top as well as background contributions including their interference with the threshold-resummed matrix elements associated with the form factor. Phenomenologically, we observe that in the important peak region the matched cross section is 5-10% larger than the pure QCD NLL prediction, which shows the importance of the non-tt-resonant effects. Interestingly, these corrections agree in sign and also roughly in size with the known NNLL/NNLO corrections at threshold [4,5,10,11,36,51]. It is also conspicuous that the difference between the different choices of matching parameters is non-negligible. In the cross-over region √ s ≈ 348 GeV around the local minium above the peak (examined in more detail below) and also the shoulder region below the peak it exceeds by far the scale variations. We see that at the level of our approximation, renormalization scale variation alone is not sufficient to estimate the remaining theoretical uncertainties in these matching regions and that one has to also account for different viable choices of the matching parameters when estimating the remaining theoretical uncertainties. In addition to the lines that can be identified using the legend, we show the envelope (dark gray) as well as the symmetrized envelope (light gray) according to Section 5.3.
-52 -An interesting aspect of the threshold-resummed NLL calculations is that the renormalization scale variations are quite asymmetric with respect to the default scales, see Figure 11 and also the discussion in Ref. [11]. This behavior is therefore also present in our matched results and examined in more detail in Figure 15. Let us start the discussion with the left panel, where we show the individual behavior of the different (h, f ) scale choices with the switch-off function f s = 1, i.e. using no switch-off. We see that the maximal variation envelope is obtained by a nontrivial interplay of the different (h, f ) scale choices. For example, away from the threshold (either below or above) (h, f ) = (1/2, 2) provides the largest deviation from the default, while in the threshold region it is (h, f ) = (1/2, 1). We also see that in the threshold region the default result is close to the maximal result obtained for (h, f ) = (2, 1) visualizing once more the asymmetry mentioned above. We have also shown in light gray the symmetrized envelope that has been computed with the procedure outlined in Section 5.3 and which represents our (more conservative) theory error estimate.
In the right panel of Figure 15 we show the results with the same (h, f ) scale choices, but when the switch-off function f s is different from the identity, as described in Section 5.2. We see an interesting cross-over behavior which is a result of the opposite scale behavior of the NLL threshold-resummed calculation (which has a quite complicated shape), and the fixed-order QCD-NLO calculation in the continuum (which simply follows the scale behavior of α F ). The QCD-NLO scale variations are as one would naively expect and as we have already shown in Figure 13: For smaller (larger) renormalization scales (controlled by h), α s increases (decreases) and thus the cross section increases (decreases). Here the f variation represents only a very minor additional modification as it only results in changing the pole mass m t value for our approach to implement the 1S mass scheme, see Section 2.3. The NLL scale variations in the threshold region, however, are more involved and have a quite strong f dependence, which leads to an opposite behavior concerning the (h, f ) variations. This artificially creates a very small scale variation in a cross-over region, where the cross sections for all (h, f ) settings happen to agree at an energy that depends on the actual values of the matching parameters v 1 and v 2 , as is clearly visible in the right panel of Figure 15 at √ s = 350 GeV. Note that the symmetrization of the envelope barely affects the continuum region and also does not remove the cross-over behavior. It is therefore crucial to also account for variations of the matching parameters v 1 and v 2 to obtain a reasonable estimate of the theoretical uncertainties. This is finally shown in Figure 16, which displays the fully matched total W + bW −b production cross section at QCD-NLO+NLL including the full combination of renormalization scale and matching parameter variations (red). The result represents our best prediction for the W + bW −b production total cross section at QCD-NLO+NLL order and is valid in all kinematic regions. By performing the symmetrization procedure, we believe to have a reliable estimate of the theory uncertainty in the sense that the next order result with respect to QCD corrections is expected to have at least a substantial overlap with the current uncertainty band. Of course, this does not include known classes of large electroweak corrections, such as initial state radiation, which have to be considered on top of this. Parts of these effects will be discussed in the next section.

QED initial state radiation
Within the Whizard framework it is straightforward to combine our fully matched predictions for W + bW −b production with initial-state radiation (ISR), beamstrahlung and beam energy spread or to account for the polarization of the colliding electron-positron pair. Since ISR, beamstrahlung and beam energy spread involve a convolution of hard cross sections at collision energies √ŝ ≤ √ s, where using our fully matched predictions provide a substantially improved description, we will exemplarily only discuss the effects of ISR in the following. We leave the examination of beamstrahlung, beam energy spread and (the at leading electroweak order much simpler) polarization effects to future work and simulation studies of the experimental collaborations. While ISR is the QED all-order photon radiation in the initial state that substantially -54 -

Scale vars
NLO predictions for on-and off-shell tt production Figure 17. Inclusive cross section for the process, e + e − → W + bW −b at QCD-NLO. The 1S mass, M 1S t , has been used as central renormalization scale and the bands correspond to the usual (h) variations by a factor between 1/2 and 2. The blue and red band show the process with and without QED ISR, respectively. alters the energy of the hard interaction, beamstrahlung ist the classical coherent radiation from the highly collimated charge density clouds of the lepton collider beam bunches. These are steered in Whizard with the usual Sindarin syntax known from LO studies. In this section we demonstrate that our matching approach can be combined with a varying hard collision energy √ŝ ≤ √ s and we show how this affects the threshold and peak behavior. For the full QCD-NLO part of the matched cross section with polarization, we use the BLHA extension of the Whizard-OpenLoops interface, that allows to pass squared amplitudes from OpenLoops to Whizard exclusive in the helicity of the initial states and that has been validated in Ref. [33].
Though the more drastic effect on the threshold shape arises from the combination of ISR, beamstrahlung and beam energy spread, which together completely wash out the 1S peak (cf. e.g. Ref. [41]), a coherent analysis of these effects or of using polarization to achieve the reduction of background contributions is left to the experimental collaborations. We note that for example beamstrahlung effects that closely model the environment of the corresponding linear collider setup can be easily simulated with the Whizard subpackage Circe2. Linear collider beam spectra are available within the Whizard framework for the legacy TESLA project, for ILC, for CLIC, as well as for CEPC.
For the inclusion of ISR effects, we first compare in Figure 17 the full relativistic QCD-NLO fixed-order result for the off-shell (with respect to the top quarks) process e + e − → W + bW −b with (blue) and without ISR (red). The ISR effects have been included as usual in Whizard in the structure function approach in a completely collinear setup, resumming soft photons to all orders and hard-collinear photons up to third order in α em . Although the QCD corrections for the final state completely factorize with the initial state structure function convolution, the reduction of the effective energy √ŝ affects the different components differently and leads to a shift of phase space points from different energies, which can cause additional nontrivial effects. We see in Figure 17 that this is indeed so because ISR leads to the well-known overall relative reduction of the cross section, but at the same time also to a visible enhancement of the relative scale variation band.   In the left panel of Figure 18, we show the matched QCD-NLO+NLL cross section as a function of √ s with ISR and without switch-off (i.e. f s = 1 and referred to as "no switchoff"). Comparing to the analogous result without ISR in Figure 15 we see, as expected, that the overall cross section is reduced and that the peak is less pronounced, however still clearly visible. (Note that the peak is washed out only after beamstrahlung is included as well.) Again the ISR, which shifts events from higher energies to the threshold via the radiative return, has a nontrivial effect on the relative scale variation, which can be seen to be slightly smaller than for the case without ISR. This is just opposite to the way how ISR affects the uncertainties for the QCD-NLO cross section prediction as shown in Figure 17. In the right panel of Figure 18 we show as a final result the fully matched QCD-NLO+NLL (red) and QCD-NLO fixed-order (blue) total cross sections including QED ISR and symmetrized scale variations, and where also the switch-off function and the corre--56 -sponding variations of the matching parameters are applied. As a reference we also display the matched QCD-NLO+NLL cross section without switch-off function for the default scale setting (dotted black). Similar to the observations we made above, we again see that the convolution with the ISR structure function and the resulting radiative return effects lead to nontrivial relative modifications of the corresponding cross section predictions obtained without ISR effects and shown in Figure 16. We in particular observe that with ISR effects the fully matched QCD-NLO+NLL cross section fully merges into the QCD-NLO fixed-order prediction only for c.m. energies beyond 360 GeV, while this happens already above 353 GeV without ISR. This arises because the toponium peak enhancement of the fully matched calculation also has an effect for c.m. energies above the peak position. The results also show the importance of using the fully matched predictions in the intermediate region between the toponium peak and the continuum region above 360 GeV.
We note that a coherent study of ISR and beamstrahlung effects based on fully matched predictions will contribute to a more refined classification of which energy regions have to be considered pure continuum or pure threshold region. Such a detailed study is especially important for the planned 380 GeV stage of CLIC. This is, however, beyond the scope of this paper.

Differential results
After having validated and discussed the fully matched inclusive cross section results for e + e − → W + bW −b as a function of √ s, we can now start exploiting the full power of our MC implementation, namely by analyzing differential distributions. We note that the discussion of the differential distributions carried out here based on our fully matched approach is not intended to be exhaustive and mainly serves as a proof of principle. It is clear that a number of distributions provide alternative means of measuring the top mass, but we postpone a more systematic exploration of such possibilities to future work. We would also like to remind the reader that at the current level of implementation, final state interactions due to (ultra)soft gluon exchange involving top and anti-top decay products are not included beyond the level of the QCD-NLO corrections, as discussed in the text following Eq. (3.25). This means that in the threshold region the distributions do strictly have LL precision only and the uncertainties shown should be interpreted with a grain of salt, particularly for kinematic thresholds visible in distributions where soft gluon exchange involving top and anti-top and their decay products can play an important role. But we also note, that in many cases the missing NLL corrections may not represent significant contributions. For the analysis of the generated events, we use a custom Rivet [115] analysis. Partons are clustered with the generalized k T algorithm (ee genkt in FastJet) [116,117] with R = 0.4 and p = −1. A minimal jet energy of 1 GeV is required. We assume a perfect b-tagging efficiency including the charge. Thus a b-jet (b-jet), j b (jb), is a jet containing a b(b) quark. We always require at least two jets during the analysis. For distributions of observables that are identical for t ↔t, b ↔b and W + ↔ W − , we only show results for t, b and W + , respectively. If not stated otherwise, the results are obtained at √ s = 2M 1S t = 344 GeV.
-57 -Keep in mind that this is slightly below the kinematical threshold √ s = 2m t , thus the preferred kinematical Born level situation, as far as tt production is concerned, is one with one on-shell and one slighly off-shell top propagator. We stress that the uncertainty bands shown in this analysis correspond to the scale variations as described in Eq. (5.10) only, i.e. they have not been symmetrized as described in Section 5.3. The indicated darker solid lines in the plots correspond to our default scale choice h = f = 1, so the effect of the symmetrization can be easily seen from the results we show. We start the discussion of differential distributions with a few classic top observables, which can already be defined for the on-shell e + e − → tt process. In Figs. 19 to 21, we show the top and anti-top polar angle as well as the top energy, invariant mass, three-momentum and transverse momentum distributions using the fully matched QCD-NLO+NLL (red) and the QCD-NLO fixed order calculations (blue). In the following discussion we also refer to them simply as σ matched and σ NLO , respectively.

Top quark observables
The polar angle distribution shown in Figure 19 is fairly flat already at NLO. This is expected, as the forward-backward asymmetry for top-pair production at lepton colliders is quite small at threshold [107] due to the v 2 -suppression of the P-wave contribution. Note that σ matched /σ NLO shows a slight slope opposite to the polar angle distribution and thus flattens the distribution even further. The energy of the reconstructed top quark E W + j b displayed in the left panel of Figure 20 peaks strongly in the region close to m t , corresponding to resonant (close to mass shell) top quarks with no velocity. The fully matched description enhances this peak by a factor of ∼ 14, while adding very little to the off-shell -58 -     (concerning the top quarks) configurations already present in the fixed-order calculation. The invariant mass of the W + -b-jet system m W + j b shows a very similar behavior. As we are including all irreducible backgrounds to W + bW −b to QCD-NLO, there are still contributions for ∆M > 30 GeV at the per cent level from σ NLO in σ matched . At this point, we remind the reader that σ matched − σ NLO contains, apart from the interference terms, only double-top propagator contributions according to Eq. (5.6). Thus, it corresponds approximately to a Breit-Wigner distribution, which falls off quicker than σ NLO , especially for larger m W + j b as seen in Figure 20. Finally, in Figure 21, left panel, we show the distribution of the reconstructed top three-momentum p W + j b , which is known to be a key observable in understanding the dynamics at threshold [13,15]. As expected, low three- σ/σ NLO Figure 22. Invariant mass distribution of reconstructed top quarks close to the mass peak with fine binning. Lines, bands and panels as in Figure 19.
momenta are preferred both in σ NLO and in σ matched , but we observe a strong enhancement of low momenta due to the threshold resummation, leading to an enhancement of a factor of over ∼ 17 below 20 GeV that flattens to a factor below 2 above 70 GeV. The projection to the transverse plane results in a very similar distribution in p W + j b T , which we show in the right panel of Figure 21. As noted earlier, we omit the histograms for E, m, p and p T of the W − jb system, as they are nearly identical to their W + j b counterparts.
In Figure 22, we show a more finely binned distribution of the reconstructed top invariant mass m W + j b . It peaks in the bin 172 GeV to 173 GeV, which is in the vicinity to, but below the pole mass m t = 173.124 GeV, indicating a shift of the visible physical mass peak due to QCD effects compatible with observations made in Ref. [118][119][120] for boosted top quarks. It is interesting to see, though, that σ matched /σ NLO is maximal slightly below the peak in the 170 GeV to 171 GeV bin. This is related to the threshold resummation which entails that the dominant kinematic configuration is associated with top quark decays emerging from the would-be toponium resonance, which implies two slightly off-shell tops. At the peak of the m W + j b distribution, in contrast, one top quark propagator is predominantly on-shell and the other one is slightly below the top-mass shell.

Top decay product observables
In Figure 23, we show the rapidity and azimuthal angle differences between b-jets and W + bosons. These tell us a lot about the kinematics of the top decay and the underlying background. In the rapidity difference, we observe already in the QCD-NLO fixed-order results a peak around ∆R W + j b = 3. This is quite different from the situation at high energies, where for √ s ∼ 800 GeV [33] a rather low rapidity separation of about 1 is favored. Obviously, this is related to the boost of the top decay products. At threshold, the tops have preferably low three momenta p W + j b , cf. the left plot in Figure 21. Thus, the back-to-back decay in the top rest frame remains essentially unboosted in the lab frame and W + and j b move in opposite directions. On the other hand, at high energies W + and j b will be boosted in the same direction and thus move preferably in a cone around the original top momentum, leading to a smaller average ∆R. Going to the matched results, we see that the threshold resummations lead to an almost constant enhancement factor of the ∆R ≥ 3 regime, while barely enhancing the QCD-NLO results for ∆R ≤ 2. Thus, we can conclude that the events populating ∆R ≤ 2 represent dominantly (background) W + bW −b production not associated to top production. Finally, we note that in the azimuthal angle difference the same physics is reflected. Here, we can see a preferred angle separation of ∆Φ = π, as expected at threshold. For comparison, for √ s ∼ 800 GeV [33] a value of ∼ π/4 is favored. Also in this case, the matched (and resummed) results enhance the pure top-decay topology. Compared to the R separation, there is no jump in σ matched /σ NLO , though, but a continuous increase with larger angles. While it is fairly obvious that rapidity and azimuthal angle difference between W + and b-jet will carry information about the top decay, it is interesting to see whether even single final state distributions of b-jet or W + carry similar information. In Figure 24    been proposed as a possibility to measure the top quark mass [121][122][123], which has been realized by CMS using 8 TeV data [124]. We note that the peak position is consistent with the (Born-level) rest frame energy (cf. Eq. (3.9) for m b m W ), as it has been shown for unpolarized top decays, massless b-quarks and generic boost directions in Ref. [121]. In our case, this is of course especially expected as nearly no boost of the top decay is present. The intriguing aspect of this analysis is that no correct reconstruction of b-jets with W + has to be performed and even the charge of the b-jets is irrelevant. Considering pairs of b andb-jets as shown in the right panel of Figure 24, we observe that the peak in E j b around 70 GeV is translated to a peak in E j b j b around 140 GeV. We stress, however, that these results have to be interpreted with some caution as we have neglected NLL order final state interactions involving b andb, which could affect particularly this observable.
In Figure 25, we show the transverse momentum of b-jets (left panel) and W + bosons (right panel). As we know that the b-jet energy peaks around 70 GeV, we can expect p j b T to have its maximum slightly below this value due to the small bottom mass m b = 4.2 GeV. As a result of momentum conservation, p W + T has to follow a similar distribution. This is exactly what we observe in Figure 25. Compared to the b-jet energy, the peak is not as pronounced and more smeared to smaller values, which can still correspond to the peak jet energy due to the projection to the transverse plane. Accordingly, σ matched /σ NLO is large, a factor of 6-13, for momenta below 70 GeV and quickly drops to 1 above 90 GeV, where the contributions largely stem from the W + bW −b (background) production not associated to top quark decays.
Finally, we show in Figure 26 the energy of W + bosons. Also in this distribution, we can identify the footprint of the top decay in the peak and large threshold enhancement -62 - Lines, bands and panels as in Figure 19. σ/σ NLO Figure 26. Energy distribution of W + bosons. Lines, bands and panels as in Figure 19.
around the 100 GeV to 106 GeV bin. Compared to Figure 24 though, we observe even for large W + boson energies still sizable threshold enhancements of a factor of ∼ 2. Thus, the top quark contributions are not as localized as in the E j b case.

Summary & conclusions
The center-of-mass energy scan over the lineshape of the onset of top-pair production at a future lepton collider represents the most precise known experimental method to determine the top quark mass. Due to the high level of theoretical understanding this method allows to directly access a short-distance top-quark mass that is theoretically much better defined than the so-called pole mass and yields theoretical as well as experimental uncertainties well below 0.1 GeV. Thus, from the point of view of theoretical control as well as precision, the top-pair threshold scan is superior to top mass determinations from kinematic reconstruction. In order to precisely assess the experimental systematics of the threshold scan as well as to study kinematical distributions in the threshold region e.g. as an alternative means to measure the top mass, an exclusive calculation fully differential in the final state is indispensable.
In this paper, we have for the first time set up a fully exclusive framework which allows to access all aspects of W + bW −b production, is valid in the threshold region as well as away from threshold, and provides smooth and continuous predictions over the whole energy range. It is thus suitable without restrictions for any conceivable staging plan for a future lepton collider. Our approach combines at the amplitude level NLO fixed-order relativistic continuum QCD calculations with the NLL resummation of the threshold corrections determined from a renormalization group improved extension of NRQCD. The approach is constructed such that in the threshold region the threshold-resummed predictions dominate, while far away from the threshold the predictions are given only by the fixed-order calculations. In order to make this feasible and particularly to avoid double counting in the intermediate cross-over regions, the tt signal diagrams containing the resummed nonrelativistic S and P wave form factors have been evaluated in a factorized and manifestly gauge-invariant double-pole approximation. The resummed form factors for both have been expanded to first order in α s and subtracted at the amplitude level from the relativistic fixed-order matrix elements to remove double-counting contributions. Our approach accounts in particular for the interference contributions that arise from thresholdresummed and fully relativistic tt resonant as well as non-resonant matrix elements. For the factorized parts of our approach the top quark decays have been evaluated at NLO in QCD.
On the technical side, the NLL threshold resummations for the form factors were realized by incorporating the numerical Toppik code into the MC framework of Whizard which already provides automated NLO QCD fixed-order predictions for multi-leg processes. To achieve a gauge-invariant description in the context of the matrix element evaluations in the factorized contributions, an on-shell projection regarding intermediate top-and anti-top quark states for the matrix element evalutation has been performed. This allows to keep the resummed contributions in the MC framework in the kinematic regions above threshold, as well as at and below threshold. The implementation has been thoroughly tested against existing analytical threshold calculations. Furthermore consistency checks concerning gauge invariance, the high-energy behavior, and spin correlations have been performed.
-64 -As the factorized contributions containing the form factor with the threshold-resummed Coulomb-singular and logarithmic terms (and their expansions) do not provide a valid description in the relativistic regime 5-10 GeV away from the toponium peak region, those terms have to be switched off smoothly in order to transition to the pure NLO QCD relativistic description in the continuum. This has been achieved by supplementing these contributions with a smooth switch-off function which is unity in the threshold region and which vanishes away from threshold. As the form of the switch-off function is not unique, variations of it have to be performed to determine the theoretical uncertainties in the intermediate region between threshold and relativistic continuum. So, to provide a meaningful description of the overall remaining theoretical uncertainties, common scale variations in the relativistic calculation, a scan over soft and hard scaling parameters of the thresholdresummed form factors, as well as a variation over the parameters of the switch-off function have to be carried out. As the scale variations from the nonrelativistic scaling parameters are asymmetric with respect to the default values at the current level of our approximations, we in addition suggest to symmetrize the corresponding error bands adopting the largest deviation from the default as the uncertainty to arrive at a conservative error estimate.
In our current implementation there are a number of NLL effects in the threshold region which are not yet implemented and which shall be addressed in future releases. These include contributions such as the Coulomb potential due to photon exchange, the resummation of phase space logarithms, a more systematic implementation of top quark short-distance mass schemes that allows to simultaneously use the 1S mass scheme in the threshold region and the MS scheme at high energies, and final state interactions due to dynamical soft gluon exchange which are known to cancel in the total inclusive cross section and cannot be described by form factors. The latter effects are essential for differential observables affected by gluon exchange involving top and anti-top and their decay products, and such differential results therefore only have LL precision in the threshold regime in our current implementation. We also mention that our current implementation lacks a coherent full description of the top spin correlations due to current limitations of OpenLoops. An elaborate (but possible) next step is also the inclusion and proper matching of purely electroweak NLO corrections and a coherent treatment of initial-state photon radiation which at this level does no longer simply factorize with the top production process. It is also straightforward to incorporate the available higher-order total inclusive cross section results in the normalization via a K-factor approach.
The obtained differential results at the tt threshold have demonstrated that it is possible to clearly separate and identify areas of the W + bW −b phase space which are tt-resonant, non-tt-resonant or intermediate. These results may provide additional means to measure top quark parameters such as its mass, width, and couplings.
More systematic studies related to these observables are beyond the scope of this work. We finally remark that it is straightforward to apply our implementation of the combined threshold-resummed and fixed-order relativistic description of the top pair threshold to other processes where the nonrelativistic tt dynamics is essential. Well-known examples of such processes are ttH or ttZ production in the region close to threshold. As the Higgs and Z bosons to a good approximation act as colorless recoilers, these processes can be -65 -also described using the approach detailed in this paper. We also remark that the same techniques can also be applied to pair thresholds arising from heavy colored resonances beyond the Standard Model at hadron colliders such as the LHC.
Given the power of the method proposed in this work to coherently incorporate and combine available results at threshold and in the relativistic region and to access and study arbitrary observables at the fully differential level, we believe that our MC approach represents a highly suitable framework to make further progress in top threshold physics.
and c i (h, 1) is the one-loop matching condition given in Eqs. (2.15) and (2.16). For the 3 S 1 current coefficient c NLL 1 we have [10,54] A (1) We note that in Ref. [58] the NLL anomalous dimensions of color singlet heavy quark-antiquark production currents for all possible 2S+1 L J quantum numbers were determined.

B Additional validation results
Completing the validation for the inclusive cross section with cuts on the reconstructed invariant mass of the tops in Section 4.2, we show in Figure 27 the corresponding validation plots for a (tight) cut, ∆ mt = 15 GeV.

C Alternative switch-off functions
Here we briefly discuss other possibilities for the switch-off function f s and compare them to the smoothstep function in Eq. (5.9), which we use for the matched predictions in this work. Specifically, we depict in Figure 28  observed that high curvature functions typically lead to artificial bumps or wiggles in the matched cross section. Those are absent for the linear switch-off, which on the other hand is not smoothly differentiable and produces unphysical edges at v 1 and v 2 . The quadratic f s we have displayed here actually consists of two quadratic functions: This is quite close to the smoothstep, but further away from the linear function. The Fermi-Dirac distribution in Figure 28 has been generated with a mean of (v 1 + v 2 )/2 and a width of (v 2 − v 1 )/20. Note that while one can get a behavior closer to the linear function around the mean with a larger width, this leads to f s not being approximately 1 and 0 at v 1 and v 2 , respectively. Overall, our smoothstep function appears to be a good compromise between smoothness and little curvature, while most other (reasonable) parametrizations give cross section results within the matching uncertainty from varying v 1 and v 2 as described in Section 5.3.