The Strong Force meets the Dark Sector: a robust estimate of QCD uncertainties for anti-matter dark matter searches

In dark-matter annihilation channels to hadronic final states, stable particles -- such as positrons, photons, antiprotons, and antineutrinos -- are produced via complex sequences of phenomena including QED/QCD radiation, hadronisation, and hadron decays. These processes are normally modelled by Monte Carlo event generators whose limited accuracy imply intrinsic QCD uncertainties on the predictions for indirect-detection experiments like Fermi-LAT, Pamela, IceCube or AMS-02. In this article, we perform a complete analysis of QCD uncertainties in antimatter spectra from dark-matter annihilation, based on parametric variations of the Pythia 8 event generator. After performing several retunings of light-quark fragmentation functions, we define a set of variations that span a conservative estimate of the QCD uncertainties. We estimate the effects on antimatter spectra for various annihilation channels and final-state particle species, and discuss their impact on fitted values for the dark-matter mass and thermally-averaged annihilation cross section. We find dramatic impacts which can go up to $\mathcal{O}(40)$ GeV for uncertainties on the dark-matter mass and up to $\mathcal{O}(10\%)$ for the annihilation cross section. We provide the spectra in tabulated form including QCD uncertainties and code snippets to perform fast dark-matter fits, in this https://github.com/ajueid/qcd-dm.github.io.git repository.

While the statistical uncertainties on the antiproton flux are now very small, a proper treatment of systematic uncertainties and their correlations can be very important in DM analyses [35][36][37][38][39]. It was pointed out that proper treatment of systematic errors may not only reduce the antiproton excess but can even completely exclude it [40]. On the other hand, the AMS-02 collaboration has released measurements of both the e + /e − ratio and the positron flux that both pointed toward an excess with respect to the astrophysical backgrounds in the region 10-300 GeV [41] which is consistent with the previous measurements but with smaller error bars. The fact that the observed spectra are not expected in astrophysics has prompted several explanations including DM (see e.g. [42][43][44] for details about the different DM possibilities).
For DM masses above a few GeV, QCD jet fragmentation can be the leading source of antimatter production in DM annihilation. A large number of antineutrinos, positrons and 1 In fact, an excess over the astrophysical predictions was reported right after the first evidence for the existence of antiprotons in 1979 [5] (further confirmation was done in 1981 [6]). An attempt to explain this excess was performed shortly after this discovery wherein a massive photino DM with mass of mγ ∼ 3 GeV can reproduce both the correct relic density and the total antiproton flux [7,8].
antiprotons can be produced from complex sequences of physical processes that can include resonance decays (if the DM annihilate to intermediate resonances like W/Z/H bosons or the top quark), QED and QCD bremsstrahlung, hadronisation, and hadron decays. The modeling of hadronisation in particle production from DM annihilation is usually done using multi-purpose Monte Carlo (MC) event generators [45] which are either based on the string [46,47] or the cluster [48,49] models. This is due to the fact that hadronisation cannot be solved from first principles in QCD but only using phenomenological models typically involving several free parameters. Proper estimates of QCD uncertainties that stem from hadronisation are seldom rigorously addressed in DM literature. We note that comparisons between different MC event generators such as Herwig and Pythia have been done in [50,51]. Using gamma rays as an example, it was found that the differences in particle spectra predicted in different MC event generators can be observed in the tails of the spectra while there is a high level of agreement between them in the bulk of the spectra [51]. This finding has been confirmed in a previous study where we have used the most recent and widely used MC event generators [52]. This level of agreement is due to the fact that the default parameter sets for the MC models are being optimised to essentially provide "central" fits to roughly the same set of constraining data, comprised mostly of LEP measurements [53][54][55][56][57][58][59][60][61][62]. Therefore, the envelope of different MC event generators does not reliably span the theory uncertainty allowed by data in the bulk of the spectra while it can overestimate the uncertainty in both the high-and low-energy tails of the spectra.
To address this issue, we have provided for the first time a conservative estimate of QCD uncertainties within Pythia 8 on gamma-ray [52] and antiproton [63] production from DM annihilation (see also [64][65][66] for short summaries of these studies). In the present work, we aim of to complete these studies by performing a comprehensive analysis of the QCD uncertainties on antimatter production from DM annihilation.
We first revisit the constraints from LEP measurements on the parameters of the Lund fragmentation function while this time thoroughly discussing the differences between the various measurements of baryon spectra at the Z-pole. We then perform several (re)tunings based on the baseline Monash tune [58] of Pythia 8.244 event generator [67]. In this paper, we perform for the first time a Bayesian analysis of the fragmentation-function parameters, finding very good agreement with the results of the frequentist fit. We then estimate the various QCD uncertainties both connected to parton-shower modeling and hadronisation. This paper provides also a first unified model for the production of gamma rays, positrons, antineutrinos and antiprotons in DM annihilation within Pythia 8. The uncertainties are found to be important and range from a few percent to about 50% depending on particle species, DM mass, annihilation channel, and particle energy. Therefore, we recommend DM groups to use the results of this work in their analyses 2 .
The remainder of this paper is organised as follows. In section 2, we discuss the physics modeling of antimatter production in a generic DM annihilation process. The last part of section 2 is essential to determine the relevant constraining data at LEP. In section 3, we discuss the relevant experimental measurements of baryon spectra at LEP and the consistency between the theory predictions and experimental measurements using three state-of-art MC event generators. The fitting procedure of this work is discussed in section 4. In section 5 we discuss the results of the various tunings. A comprehensive discussion of the different types of uncertainties and their estimates is done in section 6 where we also study quantitatively the impact on the energy spectra for a few selected DM masses, annihilation channels and particle species. Section 7 is devoted to the impact of QCD uncertainties on DM indirect detection experiments for the spectra of antiprotons, electron antineutrinos and photons. We draw our conclusions in section 8.
2 Antimatter from dark-matter annihilation

General features of stable antiparticle production in Pythia 8
To properly assess the QCD uncertainties on antimatter spectra in DM indirect searches, we first study their production in a generic annihilation or decay process of a dark-matter candidate with mass in the GeV-TeV range. For this, let us consider the following generic annihilation process 3 (2.1) We have assumed the narrow-width approximation which enables us to factorise the process into a production part χχ → N i=1 X i and a decay part X i → a i k=1 Y ik . In equation 2.1, X i refers to any parton-level SM particle which could be a resonance such as the Higgs boson, W/Z-bosons, or the top quark or a non-resonant state like gluons or light quarks.
The X i states are assumed to produce a i states, through X i → a i k=1 Y ik , after a complex sequence of processes such as QED bremsstrahlung, QCD parton showering, hadronisation and hadron decays. The produced antiparticles are, therefore, part of this final state and can be detected in indirect detection experiments. Unlike the case of gamma rays, both the total rate as well as the shape of the antiparticle spectra are slightly affected by QED bremsstrahlung. This process occurs if X i or Y i contains electrically charged particles or photons and will lead to production of additional photons and/or charged fermions. Besides QED bremsstrahlung, coloured fermions produced in DM annihilation, either promptly or through the decay of heavy resonances, will undergo QCD bremsstrahlung wherein additional coloured particles are produced. This phenomenon is characterised by an enhancement of probabilities for emissions of soft and/or collinear gluons, with the latter being suppressed if the produced fermions are heavy. Furthermore, the rates of g → qq are enhanced at low gluon virtualities: Q 2 = (p q + pq) 2 → 0. The rate of QCD radiation processes is mainly controlled by the effective value of the strong coupling constant α S which is evaluated at a scale proportional to the shower evolution variable 4 . We note that the value of α S (M Z ) in Pythia 8 is larger than α S (M Z ) MS for two reasons: (i) the so-called Catani-Webber-Marchesini (CMW) scheme involves a set of universal corrections in the soft limit [71] which is equivalent to a net 10% increase in the value of α S (M Z ) and (ii) an agreement between data and theory in experimental measurements of 3-jets observables in e + e − collisions at LEP energies is reached if α S (M Z ) is increased by a further ∼ 10% [55,58].
Finally, at a scale Q IR O(1) GeV, any coloured particle must hadronise to produce a set of colourless hadrons. This process, called fragmentation is modeled within Pythia 8 with the Lund string model [47,72,73]. The longitudinal part of the description of the hadronisation process is given by the left-right symmetric fragmentation function, f (z), which gives the probability for a hadron to take a fraction z ∈ [0, 1] of the remaining energy at each step of the (iterative) string fragmentation process. The general form can be written as where N is a normalisation constant that guarantees the distribution to be normalised to unit integral, and m ⊥h = m 2 h + p 2 ⊥h is called the "transverse mass", with m h the mass of the produced hadron and p ⊥h its momentum transverse to the string direction, and a and b are tunable parameters which are denoted in Pythia 8 by StringZ:aLund and StringZ:bLund respectively. As was pointed in a previous work [52], the a and b parameters are highly correlated (in the context of fits to measurements) since the former controls the high-z tail, while the latter mainly controls the low-z one, while most of the data is sensitive to the average z which is given by a combination of the two which does not have a simple analytical expression. A new reparametrisation of the fragmentation function exists for which the b parameter is replaced by z ρ which represents the average longitudinal momentum fraction taken by mainly the ρ mesons, i.e.
This equation is solved numerically for b at the initialisation stage (which requires that 4 In Pythia 8, the shower evolution scale is the transverse momentum of the branching parton. the option StringZ:deriveBLund = on) where the following parameters: are used.
In the string-fragmentation picture, baryons are produced similarly to mesons, by allowing the breaking of strings by the production of diquarks-antidiquark pairs; these can be thought as bound states of two quarks (in an antiduqark) or two antiquarks (in a diquark). This basic picture entails a very strong (anti)correlation of the produced baryons in both flavour and phase space, due to the fact that a baryon originating from a diquark produced in a string breaking is associated with an anti-baryon as the new end-point of the residual string. Experimental measurements of Λ 0Λ0 correlations by the Opal collaboration [74] do not find such strong correlations. To address this, the socalled pop-corn mechanism was suggested [75,76], in which baryons are produced such that the string breaking occurs with the production of one or more qq pairs "in between" the diquark-antidiquark pair. This picture enables the production of one or more mesons between two baryons and therefore decrease the correlations between them. Note that in the context of DM indirect searches, the correlation between baryons is not relevant as the produced protons travel for long distances before they reach the detector. In Pythia8, baryon production is controlled by an additional parameter denoted by a Diquark such that the a parameter in f (z) is modified as a → a + a Diquark . The extra parameter relevant for baryon production is denoted in Pythia 8 by StringZ:aExtraDiquark.

The origin of positrons, antineutrinos and antiprotons in a generic darkmatter annihilation process
After discussing the general features of antiparticle production in a generic dark-matter annihilation process in Pythia 8, we turn into a detailed investigation of the origin of these particle species. For this task, we consider a generic dark matter with mass of 1000 GeV and focus on four annihilation channels: qq : q = u, d, s, b, tt, V V : V = W, Z and HH. The reason is that at this mass value, the universality of the fragmentation function implies that all these annihilation channels have approximately the same features. In the following we assume that σ(χχ → W W ) = σ(χχ → ZZ) without any loss of generality.

Antineutrinos
We start with the spectra of antineutrinos and we split them into electron antineutrinos (ν e ), muon antineutrinos (ν µ ) and tau antineutrinos (ν τ ) and they are shown in figures 1-3.
First, most of theν e are coming from the decay of µ ± in µ − → e −ν e ν µ . The contribution of the muons to the rate ofν e is 90% irrespective of the annihilation channel. From the left to the right we show the qq, tt, V V and HH channels. One showsν e produced from the decay of muons (red), K L (turquoise), n,n (olive), D 0 (dark blue), D + (violet), B 0 (hot pink), Z 0 (light blue) and W ± (cyan). Here,ν µ are produced from the decay of muons (red), π ± (turquoise), D 0 (olive), B + (dark blue), Z 0 (purple) and W ± (hot pink). This is followed by the contribution of (anti)-neutrons through n → pe −ν e which is about 4%-5% depending on the annihilation channel. The other contributions are small (below 2%); one note among others K L , D 0 , D + , B 0 , W − → e −ν e and Z 0 → ν eνe . The last two contributions (Z → ν e and W + → ν e ) are possible in the tt, V V and HH channels. We note that most of the muons that leads to theν e are coming from the decays of charged Tau antineutrinos pions, i.e., π − → µ −ν µ . Charged pions are produced in abundance through fragmentation of quarks/gluons and/or decay of heavier hadrons [52]. Therefore, one can find a direct connection between the modeling of gamma rays and electron antineutrinos.
The situation is slightly different forν µ where can see that both µ + and π + contribute to about 47-50% of the total rate while the other sources give negligible contributions (see figure 2). Similar toν e , most of the muons that give rise toν µ are coming from the decay of π + . Finally, one note thatν τ have extremely different type of sources -the total rate ofν τ is negligibly small as compared toν e andν µ -.ν τ are produced from τ + , D 0 s , B 0 s , B + , B 0 , W ± and Z 0 with average contributions of about 1%-50% depending on the annihilation channel (see figure 3).

Positrons
In figure 4, we show the mean contribution of different particles to the rate of e + in generic DM annihilation for DM mass of 1000 GeV. We can see that, similarly toν e (see figure 1), most of e + are produced from the decay of µ + independently of the annihilation channel.
The other contributions are similar to the case ofν e production.

Antiprotons
In general (anti-)protons can be split into two categories: (i) primary (anti-)protons produced directly from the string fragmentation of quarks and gluons and (ii) secondary (anti-)protons produced from the decay of heavier baryons. In figure 5, we display the mean contributions to the (anti-)proton yields in DM annihilation for M χ = 1000 GeV. We can see that about 22% of the produced (anti)protons are emerging from q/g-fragmentation. Here, e + are produced from the decay of muons (red), K L (turquoise), n (olive), D 0 (dark blue), D + (purple), B 0 (hot pink), Z 0 (light blue) and W ± (cyan). On the other hand, the dominant large fraction ( 50%) of (anti)-protons is originated from the decay of (anti-)neutrons in decaysn →pe + ν e as can be clearly seen in figure 5. This is followed by the contribution of five baryons which mainly contribute to secondary protons: Λ 0 , Σ + and ∆(1232) -summing contributions from ∆ + , ∆ 0 and ∆ ++ -contribute to about 2%-7% of the produced (anti)-protons respectively.
We conclude that the spectra of antiparticles are correlated to each other with the dominant hadrons to be modeled properly are charged pions, protons and Λ 0 baryons. We note that in ref. [52], we have discussed in detail the modeling of pion spectra at LEP and therefore we will not discuss it anymore here (although we do include these measurements in our fits). In the next section, we discuss in detail the measurements of baryon spectra at LEP and assess the agreement between the corresponding theory predictions and experimental measurements for three multi-purpose MC event generators.

Introduction
From the discussion in section 2, it is clear that the modeling of the spectra of antiprotons will be improved if one includes all the relevant measurements of proton spectrum performed at LEP. Besides the measurements of the proton spectrum itself, one may expect some improvements from measurements of the spectra of the following baryons: • Λ 0 : Λ 0 is the dominant source of secondary protons at LEP (about 22% of the total protons are coming from Λ 0 baryons). The mean multiplicity of Λ 0 was measured by several collaborations at LEP: n Λ 0 = 0.357 ± 0.017 [77], n Λ 0 = 0.348 ± 0.013 [78] performed in 1993 and 1996 respectively. We note that Λ 0 baryons decay with 63.8% branching ratios into pπ − [79] and therefore we expect a strong correlation between the scaled momentum of Λ 0 and of p/p since most of Λ 0 baryons are reconstructed using tracks identified with charged pions and protons.
• ∆ ++ : About 11% of protons at the Z-pole are produced from the decay of the ∆ ++ which decays with 100% branching ratio into pπ. Given that ∆ ++ and p are members of the same multiplet, we may expect important constraints on p/p due to isospin.
However, there are only two measurements of ∆ ++ performed by Delphi [80] and Opal [81]. These measurements suffer from large uncertainties due to the difficulty in isolating the ∆ ++ signal from the overwhelming backgrounds. We, therefore, do not include these measurements in the fits.
• Σ ± : About 5% of protons at LEP are coming from the decays of Σ + . The corresponding branching ratio is BR(Σ + → pπ 0 ) = 51.57% [79]. There are two measurements of Σ + scaled momentum at LEP by Delphi [82] and Opal [83] collaborations. These measurements will not be used in the fit for the same arguments we used to exclude ∆ ++ measurements. Therefore, the main constraining observables in this study will consist of a set of measurements of Λ and p/p energy-momentum distributions. To guarantee a good agreement with the results of the previous study [52], we also include measurements of meson spectra, event shapes and particle multiplicities. Before going into a discussion of the setup used in this study, we discuss briefly the various measurements performed by LEP at the Z-pole using data collected between 1992 and 1999 5 .

Note about the relevant measurements
ALEPH (1996)(1997)(1998)(1999)(2000). The Aleph collaboration has reported on three measurements of Λ scaled momentum -x p in [84] and ξ in [88] -and one measurement of p/p scaled momentum [84]. The first measurements rely on the initial data which contain 520,000 inclusive hadronic events [84]. In a second paper published in 2000, the Aleph collaboration used a more complete dataset consisting of 3.7 million hadronic events [88]. The measurement of identified particle spectra by Aleph can be made by a simultaneous measurement of the hadronic momentum and ionisation energy loss dE/dx in the Time Projection Chamber (TPC). There are holes in the reported measurement of p/p spectrum in the regions 8,7] × 10 −2 due to the strong overlap between the bands of p/p and of the other hadrons (π ± , K ± ). Most of the systematic uncertainties can be roughly categorised into three components: track reconstruction efficiencies, efficiencies in the Λ 0 reconstruction and the background calibration. These errors have been corrected for by using the predictions of JetSet [90]. We note that these correction factors can be large in some kinematics regions.
DELPHI (1993)(1994)(1995)(1996)(1997)(1998). The Delphi collaboration has performed a detailed analysis of the scaled momentum of Λ 0 and p/p using data collected in the period of 1991-1998 [77,85,86]. In a first measurement of Λ 0 spectrum and Λ 0 -Λ 0 correlations, 993,000 hadronic events were used [77]. In [85], a measurement of proton momentum for x p ∈ [0.03, 0.1] has been carried using 17000 hadronic events. This measurement has been superseded by a more recent one which relied on a larger statistical sample consisting of 1,400,000 hadronic events and covering a wider range of proton momenta, i.e. x p ∈ [1.53 × 10 −2 , 1] [86]. Contrarily to Aleph, the reconstruction of the particle momenta in Delphi is based on the measurement of the ionization angle in the Ring Imaging CHerenkov (RICH) detector 6 . A number of selection cuts have been applied to improve the quality of particle 5 The measurements reported on by the experimental collaborations at LEP of the Λ or p/p spectra correspond the measured variables being either xp = |p hadron |/|p beam | (Aleph [84] and Delphi [77,85,86] [85,86] and Opal [89]). 6 Note that the Cherenkov angle has a dependence on the particle mass and the refractive index of the radiator -two radiators have been used in this analysis -. The probability density of observing the measured Cherenkov angle θ i C for a track "i" depends on various parameters and was fitted taking into account three particle species π (which also includes electrons and muons since they cannot be distinguished from pions with this method), K ± and p/p. Finally, the Likelihood function includes an additional constant term which depends on the noise.  Figure 6. Upper panels: Comparison between the different measurements of the Λ 0 scaled momentum x p (left) and log(1/x p ) (right). Here, we show the results from Aleph (red and violet), Delphi (blue) and Opal (green). Bottom panels: Same as in the upper panels but for the p/p scaled momentum (left) and its logarithm (right). For the proton case, we show the results from Aleph (red), Delphi (blue and green) and Opal (violet). In both panels, the errors correspond to statistical and systematic uncertainties summed up in quadrature. Data is taken from [77, 84-86, 88, 89, 91, 92]. identification in RICH (see section 3 of Ref. [85] for more details). The fraction of p/p particles were determined from a fit to the Cherenkov angle distribution in specific momentum ranges. The systematic errors mainly arise from the parametrization of the backgrounds.
To account for these errors, correction factors ranging from 1% to 10% have been applied using the MC simulations of JetSet event generator.
OPAL (1994)(1995)(1996)(1997). Opal has measured the spectra of charged hadrons using L = 24.9 pb −1 of data collected in 1992 [89] and of strange baryons using approximately 3.65 million events collected between 1990-1994 [87]. The determination of the hadron yields has been done from the simultaneous measurement of the track momentum and differential energy loss. Since the identification of charged hadrons cannot be done unambiguously, the Opal collaboration has used a statistical method to fit the number of particles measured in the data. Correction factors of order 20-30% have been applied to account for effects of geometrical and kinematical acceptance, nuclear corrections and decay in flight [89]. The Λ 0 baryons have been reconstructed from the tracks associated to their decay products (pπ) using two methods optimised to either have a good mass and momentum resolution or optimised to give a higher efficiency over a broader Λ 0 momentum range. The total systematic uncertainty is about 2.7% (3.3%) for the first (second) method while the statistical uncertainties are subleading.

Conclusions about the included measurements
In Fig. 6, we show the comparisons between the different experimental measurements of Λ 0 spectrum and p/p spectrum. As was pointed out previously, the measurements of baryon spectra were presented for different variables. In order to be able to easily compare between the different measurements, we scale all the normalized distributions to be either in x p or in log(1/x p ) in case they depend on a different variable, i.e. under the change of variable x E → x p , the differential normalized cross section changes as 1/σdσ/dx E → |J| −1 1/σdσ/dx p with J = ∂x E /∂x p being the Jacobian of the transformation. The normalized cross sections in x p are shown in the left panels of Fig. 6 to identify differences in the tails of the scaled momentum. On the other hand, the normalized cross sections in log(1/x p ) are very useful to display the differences in the bulk and the peak regions (right panels of Fig. 6). We can see the following differences between the different measurements: • There are some tensions between the measurement of the scaled momentum of p/p performed by Opal and the other experiments for x p > 0.1 (the Opal result is below all the others).
• The old Delphi measurement (blue) of p/p momentum is inconsistent with the new one (green) for few bins of ξ 3-3.2. Note that both these Delphi measurements cover the hole left by Aleph-1996. Furthermore, the trend of the data seems to be more consistent with Delphi-1998 rather than Delphi-1995 (very large corrections have been applied to the proton momentum in the Delphi-1995 measurement).
• The Delphi-1993 measurement of Λ 0 scaled momentum seems to be inconsistent with the others for for ξ < 1.1 (the discrepancy is mild as compared to the proton case).
Based on these observations, the tunes of the fragmentation function are not expected to find a robust best-fit point for StringZ:aExtraDiquark due to the mutual tensions between some of the measurements unless some of them are discarded in our fits. We, first, tune individually to various measurements and display the results as they are. In the more general tune that includes also the measurements of the meson scaled momenta, event shapes and mean multiplicities, we do not include the measurements of p/p and Λ 0 scaled momenta performed by Aleph-1996 [84] since the former has a hole in the peak region while the latter is superseded by Aleph-2000 measurement [88]. On the other hand, we do not include Opal-1994 measurement as it is inconsistent with all the other Log Scaled momentum fraction for Λ 0 (all events)   [84], Λ 0 scaled energy (right upper panel) [87], Log of scaled momentum for Λ 0 in all events (left lower panel) and in 2-jet event (right lower panel) [88]. The Pythia 8 prediction is shown in red, the Sherpa 2 predictions are shown in blue for the cluster (solid) and Lund (dashed) models while the Herwig 7 is shown in green for dipole (solid) and angular-based (dashed) shower algorithms.
measurements for x p > 0.1 (Pythia 8 cannot find a good agreement for this region for any choice of fragmentation function parameters). Finally, Delphi-1995 measurement of proton spectrum is not included in the four-dimensional parameter space tune as it was superseded by the Delphi-1998 measurement.

How good are the current theory predictions?
We discuss in this section the level of agreement between the theory predictions of the three commonly used multi-purpose Monte Carlo event generators and the experimental measurements of the baryon spectra at LEP (proton and Λ 0 ). For this task, we use the Pythia 8 version 307 with the baseline Monash tune [58], Sherpa version 2.2.12 [93] with a shower model based on the Catani-Seymour subtraction (CSS) method [94] and two hadronisation models: the cluster model which is provided by Ahadic++ [49] (the default in Sherpa) and the Lund string model based on Pythia 6 [95] and Herwig version 7.2.3 [96] with two radiation models: the angular-based parton-shower algorithms [97] and the dipole-based algorithm [56,98]. It is found that the different parton-shower algorithms in the three MC event generators yield similar predictions in several collider observables within the theory uncertainties (see chapter V.I. of ref. [99]). In ref. [52], we have found that the three MC event generators agree pretty well in various experimental measurements of event shapes, pion and photon spectra.
In figures 7 and 8, we show the comparison between the aforementioned generators for a set of selected measurements of proton and Λ 0 spectra at LEP. We can see that the generators based on the cluster model i.e. Herwig 7 and Sherpa 2 do not agree with data and have discrepancy of more than 40% in some regions with respect to the experimental measurement. On the other hand, Sherpa 2 with the Lund model agrees quite well with Pythia 8 as well as with data.

Fitting procedure
In this study, we use Pythia8 version 8.244 [67]  The parameters of interest are four which are a, and z ρ which controls the longitudinal momentum taking by the hadrons inside the QCD jets, a Diquark which is mainly connected to production of baryons in QCD jets. Finally, the σ parameter is connected to the mean transverse momentum taken away by a hadron in the fragmentation process (see Table 1 for their default values and their allowed range in Pythia8). Two baseline retunings are performed throughout this study: • In the first tune, we fix the parameters a, z ρ and σ to the values derived in a previous study [52], i.e. StringPT:sigma = 0.3174 +0.0420 −0.0370 , and tune a Diquark to a set of constraining measurements compromising of proton, and Λ 0 spectra. This tuning is called a one-dimensional tuning. The set of the measurements used in this optimisation part is summarised in Table 5.
• In a second tune, we fit the four-dimensional parameter space using the measurements listed in Tables 5-9. The additional measurements used in the tunes include meson scaled momenta, event shapes, jet rates, and charged multiplicities, and identified particle multiplicities. For comparison and as an extra check of the results of the frequensit fit, we further perform a Bayesian fit using MultiNest [102]. To increase the precision of the posteriors we generate one thousand live points with a tolerance of 10 −3 . Since the probability distribution functions (PDFs) associated to data are unimodal Gaussian PDFs, we expect that the results of frequentist and Bayesian fits to have a perfect agreement.
The goodness-of-fit is defined as is the analytical expression of the response function which is a polynomial of the parameters, and ∆ b is the total error. The number of degrees-of-freedom (N df ) is the number of degrees-offreedom which is defined as the number of measurements minus the number of independent parameters, i.e.
We turn now to a brief discussion of the treatment of the errors (∆ b ) used in the goodnessof-fit definition. Experimental errors are the quadratic sum of statistical and systematic uncertainties. Besides, we do not assume any correlations between the different measurements as this information is not provided by the experimental collaborations. The Monte Carlo uncertainties connected to the size of the samples used in our interpolations are summed in quadrature with the experimental errors. Finally, we add a flat 5% uncertainty on each bin and for each observable which is used as a protection against overfitting effects and as sanity limit for the accuracy in both the perturbative (high order corrections,...) and non-perturbative (high twist terms, ...) regimes. We note that the introduction of this flat uncertainty will also reduce the value of χ 2 /N df to be consistent with unity. The resulting variations around the best-fit points can reasonably defines conservative estimates of the QCD uncertainties on the predictions. The total error per bin b is defined by The polynomial dependence of the true Monte Carlo response is parametrised as follows ijk mn p i p j p k p p m p n (4.5) ijk mnr p i p j p k p p m p n p r +  figure 9 where we display the residuals for 1 st order (dashed red), 3 rd order (navy), 4 th order (cyan) and 8 th order (magenta) polynomial functions. We can see that the 8 th order polynomial performs better than the others as most of the density of the residuals is within 5% of the true MC response. Nevertheless, the 4 th order polynomial interpolation has a good performance as well (we also checked that the differences between the predictions for the polynomials of order 4 and 8 interpolations at the minimum and found good agreement). We must stress out that using polynomials with orders higher than 4 for the interpolation will cause additional overfitting effects. Therefore, we will use the 4 th order interpolation polynomial throughout this study and set ζ = θ = ω = ρ = 0 in equation 4.6. Finally, the distributions of the residuals for the errors are showing similar behavior which may be explained by the fact that we have used a large number of events for our MC sampling (2 million events per parameter point). p/p-spectrum Λ/Λ-spectrum Figure 10. Best fit point for StringZ:aExtraDiquark, the corresponding 68% errors and the associated χ 2 /N df for the different observables and their combinations. Here, we show Λ scaled momentum/energy (light green), combination of all Λ measurements (olive), p/p momenta (purple), combination of all p/p measurements (red) and the combined measurements of Λ and p/p momenta (turquoise). More details are can be found in the main text.

Results
The results of the one-parameter fits are displayed in figure 10 which are shown in the form of horizontal bar plots. For the one-dimensional tunes, we show the best-fit point for tunes to (i) Λ 0 scaled momentum/energy (green bar) including data from Aleph-1996 [84], Aleph-2000 [88], Delphi-1993 [77], and Opal-1997 [87] and to (ii) proton (scaled)momentum (blue bar) which includes data from Aleph-1996 [84], Delphi-1995 [85], Delphi-1998 [86] and Opal-1994 [89]. The tune of StringZ:aExtraDiquark resulting from a combination of Λ 0 (p/p) measurements is shown in olive (red) bar while the one resulting from a combination of all the measurements is shown in turquoise. Finally, a combination procedure described in section 6 is also shown with the label Tune-A. We first discuss the impact of the individual measurements on the best-fit point of StringZ:aExtraDiquark  StringZ:aExtraDiquark to take the largest possible. Removing these two bins will reduce the best-fit point from 1.95 to 0.91. 8 We have checked that the value of StringZ:aExtraDiquark at the minimum is almost unaffected if the OPAL 1994 S2927284 measurement of the proton (p/p) momentum is added to the fit. One must note that the net effect of including that measurement is worsening the quality of the fit since χ 2 /N df increases by almost a factor of one. This is unsurprising due to the fact that the OPAL measurement itself is inconsistent with the other experimental measurements of proton (scaled)-momentum.

Uncertainty estimates
In this section, we discuss the different sources of QCD uncertainties that may affect the particle spectra from dark-matter annihilation. We start with a discussion of the partonshower uncertainties and how they are estimated within Pythia 8. The formalism of the shower uncertainty estimates used here is based on the method developed in reference [103]. We then move to the discussion of the hadronisation uncertainties and we estimate their size. The estimate of these uncertainties is done with two different methods: Hessian variations that are widely used in the estimate of parton distribution functions (PDFs) uncertainties (see e.g. [104] for more details about the Hessian method) and a manual method which we used in a previous analysis [52].

Perturbative uncertainties
The perturbative uncertainties are split into catergories: scale-variation uncertainties and non-singular uncertainties. Before digging into the details of the estimate of these uncertainties, we first remind the reader that the parton showers in Pythia 8 are based on a dipole type p ⊥ -evolution which has been available since Pythia 6.3 [105] and is used for absorbs the leading second-order corrections in the soft-limit [71,106]. This usually tends to increase the value of the strong coupling α S (M 2 Z ) by about 10%. Therefore, it makes complete sense to estimate uncertainties that are originated from the shower scale variations which define the most important source of perturbative uncertainties. This first class of the shower uncertainties are estimated by varying the renormalisation scale by a factor of two in each direction with respect to the nominal scale choice. Let us consider a variation defined by µ R ≡ p ⊥ → kp ⊥ with k = 1/2, 2. Under this variation, the gluon-emission probability where t = p 2 ⊥ , z is the longitudinal momentum fraction, P (z) is the Dokshitzer-Gribov-Lipatov-Altarelli-Parisi (DGLAP) splitting kernel for the a → bc branching, β 0 = (11N c − 2n f )/3 is the one-loop beta function with N c = 3 being the number of colour degrees of freedom and n f is the number of active quarks with mass m q below µ R = p ⊥ . To guarantee that scale variations are as conservative as possible, a number of modifications to equation 6.2 are in order. First, the scale µ at which α S is evaluated (the second term inside the parenthesis) is chosen to be the maximum of the dipole mass m dip and kp ⊥ ; i.e. µ ≡ µ max = max(m dip , kp ⊥ ). Second, another factor ζ that depends on the singularity of the splitting is introduced. In this case, the second term inside the parenthesis of equation 6.2 is multiplied by (1 − ζ) so that the correction factor vanishes linearly outside the soft limit. We note that in Pythia 8, a limit on the allowed value of α S that changes under the scale variation is imposed i.e. |∆α S | ≤ 0.2 in order to prevent branchings near the cut-off scale from generating important changes to the event weights. The variations of the non-universal hard components of the DGLAP kernels are also possible with the new formalism. Under these variations, the shower splitting kernels transform as where Q 2 is the virtuality of the parent branching parton, and c NS is a factor that corresponds to the variations -by default c NS = ±2 is used but the user is free to change it. We close this discussion by noting that Matrix-Element Corrections (MECs), switched on by default in Pythia 8, lead to very small variations of the non-singular terms of the DGLAP splittings. It was found that switching off these corrections would lead to comprehensively larger envelopes [103]. We have checked that this the case but the error band from non-singular variations without MECs strongly overlaps with those from scale variations.
Therefore, if the total perturbative uncertainty is taken as the envelope of the variations and not their sum in quadrature then there is no major difference between switching MECs on or off. x i ). If one assumes a Gaussian probability distribution function for x i and no correlation between the different measurements, we define a global χ 2 of the variable x as The combined best-fit point value of StringZ:aExtraDiquark is obtained by minimizing the χ 2 measure defined in equation 6.4: Solving this equation will givex where the numerical value is obtained by using the results shown in figure 10. The error onx is obtained by differentiating χ 2 two times with respect tox. In this case, we obtain This basic approach leads to a small value ofσ that will be mainly controlled by the measurement with the smallest σ i . In principle, this approach does not define a conservative estimate ofσ as we can see clearly that the best-fit points x i are not stable and strongly depends on the measurement being used. Therefore, in order to improve this basic approach we inflate the errorσ by a reasonable factor which we choose to be five. The result of this approach is depicted in figure 13 where we can see the green shaded area which corresponds to the combined errors passes through most of the best-fit points. Therefore, the new

Hessian errors (eigentunes)
The Professor toolkit provides an estimate of the uncertainties on the fitted parameters through the Hessian method (also known as eigentunes). This method consists of a diagonalisation of the χ 2 covariance matrix near the minimum with H ij being the Hessian matrix which consists of second-order derivatives of the covariance matrix with respect to the parameters near the minimum , i.e H ij = ∂ 2 χ 2 /∂x i ∂x j .
The problem then consists of diagonalising the H ij matrix in the space of the optimised parameters, i.e. finding the principal directions or eigenvectors and the corresponding eigenvalues. This results in building a set of 2 · N params variations. These variations are then obtained as corresponding to a fixed change in the goodness-of-fit measure which is found by imposing a constraint on the maximum variation, defined as a hypersphere with maximum radius of T (defined as the tolerance), i.e. ∆χ 2 ≤ T . Therefore one can define the ∆χ 2 to match a corresponding confidence level interval; i.e. one-sigma variations are obtained by requiring that ∆χ 2 N df where N df is the number of degrees-of-freedom defined in equation 4.3. This approach defines a conservative estimate of the uncertainty if the event generator being used has a good agreement with data (which is usually quantified by χ 2 min /N df ≤ 1) and the resulting uncertainties provide a good coverage of the errors in the experimental data. To enable for this situation, we have added an extra 5% uncertainty to the MC predictions for all the observables and bins (see section 4) which already implied a χ 2 /N df ≤ 1 in our fits as depicted in Table 2. On the other hand, we enable for large uncertainties by considering not only the one-sigma eigentunes but also the two-sigma eigentunes (correspond to ∆χ 2 /N df = 4) and the three-sigma eigentunes (correspond to ∆χ 2 /N df = 9). The three-sigma eigentunes provide a very good coverage of all the experimental uncertianties in the data for meson and baryon spectra but results in unreasonably large uncertainties that overshot the experimental errors for e.g. event shapes or jet rates.
The variations corresponding to the one-sigma, two-sigma, and three-sigma eigentunes are shown in Table 3.
In figure 14, we show the comparison between the theory predictions at the best-fit points (including the envelope from theory uncertainties) and the experimental data for the p spectrum (left) and Λ 0 scaled energy (right). We can see that all the theory predictions agree pretty well with data within the uncertainty envelopes where a large uncertainty coverage clearly visible in the case of two-sigma and three-sigma eigentunes. On the other hand, the envelope spanned from the variations around the best-fit point of the weighted fit (shown in cyan) is somehow located as in between the one-sigma and the two-sigma eigentunes. There are, however, some regions in the p spectrum where all the variations seem to cancel each other (specifically in the few bins between x p = 0.1 and x p = 0.3).
The resulting uncertainty can range from about 5% in the low x E region to about 40% in the high x E region. In order to be as conservative as possible while in the same do not allow for very large variations we can define either the envelopes from the weighted fit or from the two-sigma eigentunes as our estimate of the uncertainty on the baryon spectra.
As the uncertainty estimate from the weighted fits is computationally more expensive, we will use the two-sigma eigentunes in the rest of the paper and also for the new data tables that can be found in this github repository 9 .
9 To obtain the anti-matter spectra along with the hadronisation uncertainties from the weighted fit one needs to generate twenty seven MC samples for each dark matter mass. Therefore, we recommend the user who is interested in producing the spectra himself to use the variations provided in Table 3 which will require nine runs instead of twenty seven and thus significantly reduce the running time.  Table 2 (red, green, and blue) and equations 4.2, 6.8 (cyan). The uncertainty envelopes corresponding to the one-sigma, two-sigma, and three-sigma eigentunes are shown in red, green and blue while the uncertainty envelope from the 26 variations around the best fit points resulting from the weighted fit is shown as cyan shaded area. Data is taken from [84,87] .

Assessing QCD uncertainties on anti-matter spectra
In this section, we quantify the impact of QCD uncertainties on particle spectra from darkmatter annihilation for few dark matter masses and annihilation channels. The results will be shown in the x variable defined by with E kin is the kinetic energy of the particle specie (antiproton, positron, neutrino and photon), m is its mass and M χ is the dark matter mass. We study the following annihilation channels: For the qq annihilation channel, we assume that the dark matter is annihilated to all the quarks except the top quark with BR(χχ → qq) = 0.2, q = u, d, s, c, b. For the V V channel, we include both the ZZ and W + W − channels with equal probabilities: i.e. BR(χχ →

Application to dark-matter indirect detection experiments
In this section we quantify the effects of QCD uncertainties on two DM observables: the velocity-weighted effective cross section, σv , and the DM mass M χ . We first discuss the general methodology for determining the DM uncertainties arising from QCD uncertainties.
We then discuss the results for an antiproton final state. This is followed by a discussion of electron antineutrinos and photons final states as a proof of principle.

Dark Matter uncertainties
In DM indirect searches, there are generally two important parameters: the velocityweighted effective cross section, σv 10 , and the DM mass, M χ . The inclusion of QCD uncertainties will translate into uncertainties in determination of these two DM observables. The effect on σv is simply an overall shift in the height of the spectrum; varying σv is identical to changing the normalization of the spectrum. The uncertainty concerning the DM mass is quite straightforward, since a change in the DM mass changes the spectrum including the peak position. To quantify the effects of the QCD uncertainties on the DM mass, we consider two different approaches. First, for a spectrum coming from a DM particle with mass M χ , which different masses have spectra including QCD uncertainties that can match the spectrum of M χ , i.e., which masses can look the same as M χ . Or, the spectra of which masses can match the spectra of M χ including QCD uncertainties, i.e., with which masses can M χ look similar. We will call these two different approaches to finding the DM mass uncertainty ∆M and ∆S, respectively. To quantify which spectra agree when QCD uncertainties are included, we require that for the upper and lower bounds of the DM uncertainties χ 2 /N d.f. ≈ 1 holds. Although not proven, we consider it safe to assume that χ 2 /N d.f. strictly increases for larger deviating values of DM mass or spectral height.
We have written a public code to both interpolate spectra for a given DM mass, channel, and final state, and to compute the upper and lower bounds for the three aforementioned DM uncertainties: ∆ σv , ∆M , and ∆S. Additionally, the considered fit region can be user supplied. The code can be found in this github repository.
For both electron antineutrino and photon final states the spectrum does not need to be propagated, as these particles can propagate freely. Thus the DM uncertainties can be determined directly from the annihilation spectrum. However, for the antiproton final state, the spectrum must first be propagated before the upper and lower bounds of the uncertainties can be determined. Our code emulates cosmic-ray propagation by diffusing each bin using tabulated diffusion values, if the bin is not tabulated the diffusion values are linearly interpolated. By performing the diffusion for every bin, the complete propagated spectrum is obtained. We use Dragon 2 [107,108] to compute the tabulated diffusion values by inserting an antiproton spectrum for the DM annihilation spectrum that is peaked at a specific bin. The propagation parameters are determined by fitting the AMS-02 [109] proton p, antiproton over protonp/p and Boron over Carbon ratio B/C spectra using an artificial bee colony as the optimization algorithm [110,111]. The resulting propagation parameters can be found in the README of the code. This fit resulted in a DM mass of O(100GeV), which may affect results for other DM masses.
To assess the effects of the different propagation parameters, we cross-compared a set 10 In principle, the DM density can also be used instead of the velocity-weighted effective cross section.
of parameters from [112] and the default settings from Dragon 2. The variation in the DM uncertainties between these sets is about 1-5%. We consider these discrepancies to be small enough that it is safe to use our fitted parameters for the following results.
Our default tabulated diffusion values do not account for solar modulation; only a lowenergy correction factor for the diffusion coefficient is implemented. To account for this, we have fitted antiproton final-state spectra down to 1 GeV. While solar modulation is relevant at these energies, QCD uncertainties are typically small in this range, so we do not expect the inclusion of solar modulation to significantly change our results. Regardless, the tabulated diffusion values can be input by the user to meet situation-specific requirements.
In particular, positron propagation is not implemented, so all results for positron final states were obtained using the annihilation spectrum and should therefore be treated with caution. In addition, the results of our code may be unreliable in certain cases for very small values of differential flux or QCD uncertainties due to numerical problems. We recommend that all results be checked for accuracy.

Results for antiprotons
In order to showcase the impact of the QCD uncertainties on the antiproton spectrum and consequently the DM observables, we show ∆S for all channels in figure 17  limits. This is because if one consistently considers the fitting range down to 1 GeV, the fraction of low x regions increases. We have checked the following statements for both M χ = 500 GeV and M χ = 1000 GeV. For uū, dd, cc, and gg, no significant change occurs in the relative size of ∆S, which is evident when considering figure 17. The W + W − , ZZ, and hh obtain relatively smaller DM uncertainties, as is also expected since the fitted spectra for these channels diverge in the low-x regions. A relative increase in ∆S is seen for bb, the upper limit of ss, and the upper limit of tt, as is also expected.
While these results refer to ∆S, the conclusions can easily be applied to ∆M , since in both cases it is a matter of shifting the mass. The main difference, of course, is that the QCD uncertainties are different for different DM masses, so the numerical values for ∆S and ∆M may be different. This will be made explicit in the following section dealing with electron antineutrino and photon final states.

Results for electron antineutrinos and photons
In the following, we consider only electron antineutrinos and photons to show the effects of QCD uncertainties. The results for muon and tau antineutrinos may be significantly  different from those for electron antineutrinos because their spectra differ considerably, as can be seen in Figure 19. The DM uncertainties for antineutrinos and photons can be determined directly from their annihilation spectra without the need for CR propagation.
We show the three different DM uncertainties, ∆ σv , ∆M , ∆S, for a DM mass of 200 GeV for an electron antineutrino and photon final state in Table 4. We consider the energy range down to x = 10 −5 /M χ for antineutrinos and x = 10 −3 /M χ for photons, since for lower values of x the QCD uncertainties become erratic. For both final states, we additionally perform a fit that includes and excludes the high-x region of x ∈ (0.5, 1] to quantify the effects of these regions From the table 4 it can be seen that for the W + W − , ZZ, and HH channels the inclusion of the high-x region is very important, especially for aν e final state. Figure 18 clearly shows the small QCD uncertainties in the high-x regions for both W + W − and ZZ, and with or without the high-x region. There are some exceptions though, e.g., the upper limit of ∆M for χ → gg →ν e . For a tt-mediatedν e or γ final state, the QCD uncertainties are more comparable to the V V or HH channels than to the qq or gg channels, as can be seen from figure 21. This is reflected in the DM uncertainties, which are indeed more comparable to V V and HH.
In general, the magnitude of the DM uncertainties for bothν e and γ final states depends strongly on the channel and the considered fitting region, even more so than for an antiproton final state. The impact of the QCD uncertainties on the DM observables can be as large as 20% in certain scenarios and therefore must be considered in future DM indirect detection studies.

Conclusions
In this work, we have studied the QCD uncertainties on antimatter spectra from darkmatter annihilation and therefore completing the series of analyses related to QCD uncertainties in dark-matter indirect searches where we studied gamma-ray spectra in ref.
[52] and antiproton spectra in ref. [63]. After studying the general features of antimatter production in dark-matter annihilation within Pythia 8, we studied in detail the origin of antimatter for various annihilation channels taking M χ = 1000 GeV as an example. We found that the spectra of antimatter can be modeled correctly if the spectrum of charged pions, antiprotons and hyperons measured at LEP is modeled properly. We have performed a detailed analysis of baryon production at LEP (especially antiprotons and hyperons). We have found some tensions between the different measurements at LEP and selected the most reliable measurements for our tunes. Then, we have compared between the predictions of the state-of-art MC event generators, i.e., Herwig 7, Pythia 8 and Sherpa 2. We found that MC event generators based on the string hadronisation model have a good agreement with data while the MC event generators based on cluster hadronisation model have very poor agreement with the experimental measurements with disagreement reaching up to 50% in some kinematic regions. This comparison also suggests that the envelope spanned between the different MC event generators cannot define a faithful estimate of the QCD uncertainties since for photons and charged pions they agree pretty well in the peak region (see [52]) while for protons and hyperons they have very large differences. Therefore, we studied the alternative scenario where we estimate the QCD uncertainties within Pythia 8.

First we performed several returnings of the fragmentation-function parameters in
Pythia 8 with the Monash 2013 tune as our baseline and using a set of constraining measurements at LEP totalling 48 measurements and 856 bins. The resulting tune yields very good agreement with the experimental measurements with χ 2 /N df 1 for most of the observables. We then estimated the QCD uncertainties that arise from parton-shower scale evolution variation and from hadronisation modeling using some parametric variations around the best-fit points of the hadronisation model. We studied the impact of these uncertainties on antimatter and photon spectra. We found that the QCD uncertainties are highly dependent on the annihilation channel, the DM mass, the particle specie and the energy region. A notable example is the antiproton spectrum where the hadronisation uncertainties dominate the particle-physics error budgets and can reach 10%-20% in the bulk and the peak of spectra and up to 50% in the high-x region. The QCD uncertainties on the other antimatter species are highly dependent on the annihilation final state but are around 10%-15% depending on the annihilation channel and DM mass.
We finally analysed the impact of these QCD uncertainties on the DM indirect detection fits using realistic CR propagation models for antiprotons, electron antineutrinos and photons. We have considered various annihilation channels that lead to hadronic final states -uū, dd, ss, cc, bb, tt, W + W − , ZZ, hh, and gg -. were also studied where we found important consequences. The size of QCD uncertainties are negligibly dependent on the choice of the diffusion parameters. Further measurements of the antimatter spectra at the Large Hadron Collider can be very important as they will deliver additional information that are necessary to improve the theory predictions and reduce the associated uncertainties. Therefore, we recommend the DM groups to start using these results for their future analyses. For this purpose, we provide the spectra in tabulated form including QCD uncertainties and some code snippets to perform fast DM fits (can be found in this github repository). The tables can also be found in the latest releases of DarkSusy 6 [113], MicrOmegas 5 [114] and MadDM [115].   Table 6. Same as for Table 5 but for the charged multiplicity distributions. Data is taken from [116,117].

B QCD uncertainties in anti-matter spectra: Additional plots
In this section, we should the spectra of antimatter and photons in dark-matter annihilation for M χ = 100 GeV (figures [18][19] and for M χ = 1000 GeV (figures 20-21). in dark matter annihilation into qq (red), gg (green) and V V (blue). Here, the dark matter mass is chosen to be 100 GeV. For each pane, the dark shaded band corresponds to the parton-shower uncertainties while the light shaded band corresponds to hadronisation uncertainties.  Figure 19. Same as for figure 18 but for muon antineutrinos (left) and tau antineutrinos (right).