1 Introduction

In order to simulate a hadron–hadron collision event, standard Monte Carlo (MC) event generators, such as Pythia 8 [1], are based on a factorized ansatz, in which any hadron–hadron cross section can be written as the product of two non-perturbative, process-independent parton density functions (PDF), one for each of the colliding hadrons, and a perturbative parton–parton cross section. The so-called underlying event (UE) comprises all the additional activity that accompanies the hard scattering at lower scales, and consists of several components, such as initial- and final-state radiation (ISR and FSR, respectively), multiple parton interactions (MPI), and beam–beam remnants (BBR). All the coloured partons produced by these processes are finally rearranged into colourless hadrons during the hadronization process. Because of all these contributions, the resulting hadronic collision is a complex multiparticle process. Particularly relevant for the characterization of the UE are the MPI, which consist of numerous additional 2-to-2 parton–parton interactions occurring within a single collision event. Because the parton density increases strongly at small longitudinal momentum fractions x, the MPI contribution increases with increasing collision energy.

The perturbative partonic cross section of a generic process with two incoming and two outgoing partons (the so-called \(2 \rightarrow 2\) processes), as a function of the exchanged transverse momentum \(p_{\text {T}}\), can be expressed as:

$$\begin{aligned} \frac{\text {d}\hat{\sigma }}{\text {d}p_{\text {T}}^2} \propto \frac{\alpha _{\text {s}}^2\left( p_{\text {T}}^2 \right) }{p_{\text {T}}^4} , \end{aligned}$$
(1)

where \(\alpha _{\text {s}}\) is the strong coupling. Integrating the cross section over \(p_{\text {T}}^2\) from a lower bound \(p_{\text {T}}^2\) upwards, and neglecting the running of \(\alpha _{\text {s}}\), one obtains:

$$\begin{aligned} \hat{\sigma } \propto \frac{1}{p_{\text {T}}^2}, \end{aligned}$$
(2)

which shows that the total \(2 \rightarrow 2\) cross section tends to diverge at small values of exchanged transverse momentum. While the simulation of the hard scattering generally involves relatively large \(p_{\text {T}}\) values (\(p_{\text {T}} > 5 \, \hbox {GeV}\)), generated MPI processes might reach very low \(p_{\text {T}}\) scales (\(p_{\text {T}} \sim 1 \, \hbox {GeV}\)) where the rapid increase of the partonic cross section becomes relevant and might lead to unphysical results. In order to tame the behaviour of the partonic cross section as a function of \(p_{\text {T}}\), the Pythia 8 event generator introduces a regularization, by shifting the value of \(p_{\text {T}}\) by a quantity \(p_{\text {T0}}\), leading to a formulation of the partonic cross section as follows:

$$\begin{aligned} \frac{\text {d}\hat{\sigma }}{\text {d}p_{\text {T}}^2} \propto \frac{\alpha _{\text {s}}^2\left( p_{\text {T}}^{2}+p_{\text {T0}}^{2}\right) }{\left( p_{\text {T}}^2+p_{\text {T0}}^{2}\right) ^2}, \end{aligned}$$
(3)

which after integration becomes:

$$\begin{aligned} \hat{\sigma } \propto \frac{1}{p_{\text {T}}^2+p_{\text {T0}}^2}. \end{aligned}$$
(4)
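As a minimal numerical sketch of Eqs. 2 and 4, the two integrated cross sections can be compared as a function of the lower integration bound. The function names are hypothetical, the coupling \(\alpha _{\text {s}}\) is assumed to be constant, and the overall normalization is dropped, so only the shape of the result is meaningful.

```python
import math


def sigma_unregularized(pt_min):
    """Integral of 1/pT^4 over pT^2 from pt_min^2 to infinity (Eq. 2).

    Assumes a fixed coupling and arbitrary normalization.
    """
    if pt_min <= 0.0:
        return math.inf  # the integral diverges when the cut is removed
    return 1.0 / pt_min**2


def sigma_regularized(pt_min, pt0):
    """Integral of 1/(pT^2 + pT0^2)^2 over pT^2 from pt_min^2 to infinity (Eq. 4)."""
    return 1.0 / (pt_min**2 + pt0**2)


if __name__ == "__main__":
    pt0 = 2.0  # GeV, illustrative value only
    for pt_min in (0.0, 0.5, 1.0, 5.0, 20.0):
        print(pt_min, sigma_unregularized(pt_min), sigma_regularized(pt_min, pt0))
```

For lower bounds well above \(p_{\text {T0}}\) the two forms approach each other, so the shift mainly affects the soft region that is relevant for MPI.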

The regularized cross section of Eq. 4 no longer diverges for \(p_{\text {T}} \rightarrow 0\). In the simulation, \(p_{\text {T0}}\) serves as a phenomenological parameter, which cannot be derived from first principles but must be determined from data. Many studies [2, 3] have been performed to determine the value of \(p_{\text {T0}}\), which depends on the center-of-mass energy \(\sqrt{s}\) and generally lies between 1.2 and 2.5 GeV for \(\sqrt{s}\) in the range 300–13,000 GeV. The Pythia 8 MC event generator implements an energy dependence of the \(p_{\text {T0}}\) parameter, according to a power law of the form:

$$\begin{aligned} p_\mathrm{T0}(\sqrt{s})= p_\mathrm{T0}^\mathrm{ref} \left( \frac{\sqrt{s}}{\sqrt{s_0}}\right) ^\epsilon , \end{aligned}$$
(5)

where \(p_\mathrm{T0}\) is the regulator of the partonic cross section, which removes its divergent behaviour for \(p_{\text {T}} \rightarrow 0\), \(p_\mathrm{T0}^\mathrm{ref}\) is the value of \(p_\mathrm{T0}\) at a reference energy \(\sqrt{s_0}\), and the parameter \(\epsilon \) determines the energy dependence. This formulation of the \(p_{\text {T0}}\) energy extrapolation follows the same energy dependence as the total hadronic cross section. No other options for the \(p_{\text {T0}}\) energy dependence are available in Pythia 8. Previous studies [2, 3] have shown the difficulty of simultaneously describing UE-sensitive measurements performed at different colliders, at center-of-mass energies in the range of 300–7000 GeV.
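The power law of Eq. 5 can be sketched in a few lines. The function names and the numerical values below are illustrative assumptions, not the fitted parameters of this study; the second helper shows that moving the reference energy only rescales \(p_\mathrm{T0}^\mathrm{ref}\) and leaves the predicted curve unchanged.

```python
def pt0_power_law(sqrt_s, pt0_ref, sqrt_s0, epsilon):
    """Eq. 5: pT0 at energy sqrt_s (same units as sqrt_s0), in the units of pt0_ref."""
    return pt0_ref * (sqrt_s / sqrt_s0) ** epsilon


def change_reference(pt0_ref, sqrt_s0, epsilon, new_sqrt_s0):
    """Return the pT0_ref that describes the same curve at a new reference energy."""
    return pt0_power_law(new_sqrt_s0, pt0_ref, sqrt_s0, epsilon)


if __name__ == "__main__":
    # Illustrative numbers only (energies in GeV, pT0 in GeV).
    pt0_ref, sqrt_s0, epsilon = 2.0, 7000.0, 0.2
    new_ref = change_reference(pt0_ref, sqrt_s0, epsilon, 13000.0)
    # Both parametrizations give the same pT0 at any energy, e.g. 900 GeV.
    print(pt0_power_law(900.0, pt0_ref, sqrt_s0, epsilon))
    print(pt0_power_law(900.0, new_ref, 13000.0, epsilon))
```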

At a given center-of-mass energy, the amount of simulated MPI in Pythia 8 depends on the turn-over \(p_{\text {T0}}\), the PDF, and the overlap of the matter distributions of the two colliding hadrons. The MPI processes produce many coloured partons in the final state, creating a dense net of colour lines which spatially overlap with each other and with the fields produced by the partons of the hard scattering. The generated colour lines may be connected to each other through the so-called colour reconnection (CR) mechanism, which allows different colour strings to be reconnected and to exchange colour information.

In MC event generators, the PDF are a crucial ingredient for the simulation of both the hard scattering and the UE, as they parametrize the distributions of the partons inside each hadron, which cannot be calculated from first principles. However, they can be extracted from fits to data. These fits use analytical calculations of the hard scattering performed at a certain order in the strong coupling \(\alpha _{\text {s}}\). A fit with a leading-order (LO) calculation in \(\alpha _{\text {s}}\) yields a LO PDF set, while fits with next-to-leading-order (NLO) or next-to-next-to-leading-order (NNLO) calculations yield a NLO or NNLO PDF set, respectively. One of the striking differences between LO PDF sets and NLO or NNLO ones is the gluon distribution at small longitudinal momentum fractions x and small scales \(Q^2\), which is rather flat for NLO and NNLO PDF sets but tends to increase quickly towards small x for LO PDF sets. Note that all PDF sets have a significant uncertainty in this region, and the “correct” behaviour of the gluon distribution at small x is not yet established. Furthermore, LO PDF can be directly interpreted as parton densities inside the proton and their behaviour can be easily related to measurable quantities, while this is not obvious for NLO or NNLO PDF sets. For these reasons, LO PDF sets are generally preferred for the UE simulation, but nothing prevents a good description of the measurements from being achieved with NLO or NNLO PDF sets.

The quality of the UE simulation provided by MC event generators and the corresponding tunes can be tested by comparing predictions with available data. These data are generally measurements of the number of charged particles and of their transverse momentum sum in different regions of phase space, defined relative to the direction of the hardest object in the event. In particular, the hard object, which might be a jet, a charged particle or a Z boson, identifies a direction in the transverse plane. The transverse plane is then divided into four regions in azimuthal angle: a “toward” and an “away” region, sensitive to the hard scattering and its recoiling object, and two “transverse” regions, more sensitive to UE contributions. In recent measurements, the two transverse regions are treated separately: the one with the higher activity is called “transMAX”, while the one with the lower activity is labelled “transMIN”. The charged-particle multiplicity and the transverse momentum sum of the charged particles, measured as a function of the transverse momentum of the leading charged particle, are referred to as “UE observables” in the following.
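The region definition can be illustrated with a short sketch. The text above does not state the azimuthal boundaries; the code assumes the conventional choice of \(|\Delta \phi | < 60^\circ \) for the toward region and \(|\Delta \phi | > 120^\circ \) for the away region, and it assumes that transMIN and transMAX are chosen separately for the multiplicity and for the \(p_{\text {T}}\) sum. All function names are hypothetical, and the normalization to densities per unit area in pseudorapidity and azimuth is omitted.

```python
import math


def delta_phi(phi, phi_lead):
    """Signed azimuthal difference wrapped into [-pi, pi)."""
    return (phi - phi_lead + math.pi) % (2.0 * math.pi) - math.pi


def region(phi, phi_lead):
    """Classify a particle by its azimuthal angle relative to the leading object."""
    dphi = delta_phi(phi, phi_lead)
    if abs(dphi) < math.pi / 3.0:        # assumed 60 degree boundary
        return "toward"
    if abs(dphi) > 2.0 * math.pi / 3.0:  # assumed 120 degree boundary
        return "away"
    return "transverse_plus" if dphi > 0.0 else "transverse_minus"


def trans_min_max(particles, phi_lead):
    """Return multiplicity and pT sum in transMIN and transMAX.

    particles is an iterable of (pt, phi) pairs for charged particles,
    excluding the leading object itself.
    """
    count = {"transverse_plus": 0, "transverse_minus": 0}
    ptsum = {"transverse_plus": 0.0, "transverse_minus": 0.0}
    for pt, phi in particles:
        name = region(phi, phi_lead)
        if name in count:
            count[name] += 1
            ptsum[name] += pt
    n_lo, n_hi = sorted(count.values())
    s_lo, s_hi = sorted(ptsum.values())
    return {
        "transMIN": {"n": n_lo, "ptsum": s_lo},
        "transMAX": {"n": n_hi, "ptsum": s_hi},
    }
```

Applied event by event and averaged in bins of the leading-particle transverse momentum, such counts give the raw ingredients of the UE observables.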

In this document, we use UE observables measured at various collision energies to determine, separately at each energy, the \(p_{\text {T0}}\) value that best fits the measurements. We use predictions of the Pythia 8.226 event generator produced with various PDF sets evaluated at different orders in \(\alpha _{\text {s}}\). After observing that the obtained \(p_{\text {T0}}\) values are not properly described by Eq. 5, we modify the energy extrapolation by introducing an additional term, according to the formula:

$$\begin{aligned} p_{\text {T0}}(\sqrt{s})= p_{\text {T0}}^{\text {ref}} \, \left( \frac{\sqrt{s}}{\sqrt{s_0}}\right) ^\epsilon + c \end{aligned}$$
(6)

where the quantity c is a free, energy-independent parameter. The newly proposed term in the energy dependence is an attempt to achieve better predictions of the \(p_{\text {T0}}\) values. It does not significantly modify the structure of the power law already implemented in Pythia 8, and the additional term is expected to introduce only a small correction with respect to Eq. 5.
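A short sketch may help to see what the constant term changes. The function names and parameter values are hypothetical and purely illustrative. Two consequences follow directly from Eq. 6: at the reference energy the result is \(p_{\text {T0}}^{\text {ref}} + c\) rather than \(p_{\text {T0}}^{\text {ref}}\), and towards low energies the curve flattens to c instead of falling to zero.

```python
def pt0_default(sqrt_s, pt0_ref, sqrt_s0, epsilon):
    """Eq. 5: pure power law."""
    return pt0_ref * (sqrt_s / sqrt_s0) ** epsilon


def pt0_modified(sqrt_s, pt0_ref, sqrt_s0, epsilon, c):
    """Eq. 6: power law plus an energy-independent constant c."""
    return pt0_default(sqrt_s, pt0_ref, sqrt_s0, epsilon) + c


if __name__ == "__main__":
    # Illustrative parameters only (energies and pT0 in GeV).
    for sqrt_s in (300.0, 900.0, 1960.0, 7000.0, 13000.0):
        old = pt0_default(sqrt_s, 2.0, 7000.0, 0.2)
        new = pt0_modified(sqrt_s, 1.5, 7000.0, 0.3, 0.5)
        print(sqrt_s, round(old, 3), round(new, 3))
```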

2 Determination of \(p_{\text {T0}}\) values at various energies

In order to determine the best values of \(p_{\text {T0}}\), fits to observables sensitive to contributions of MPI at soft and semi-hard scales are performed independently at various collision energies. The considered observables are the charged-particle multiplicity and average \(p_{\text {T}}\) sum densities as a function of the leading charged-particle transverse momentum, \(p_{\text {T}}^{\text {max}}\), in the transMIN and transMAX regions. Five different sets of measurements of these observables are considered: at 300, 900 and 1960 GeV measured by the CDF experiment [4], at 7000 GeV measured by the CMS experiment [5] and at 13 TeV measured by the ATLAS experiment [6]. The regions \(0.5<p_{\text {T}}^{\text {max}} <1\,\hbox {GeV}\) and \(0.5<p_{\text {T}}^{\text {max}}<3\,\hbox {GeV}\) are excluded from the fits performed at \(\sqrt{s}=7\,\hbox {TeV}\) and \(\sqrt{s}=13\,\hbox {TeV}\), respectively, since they are found to be affected by contributions of diffractive processes, whose free parameters are not considered in the tuning procedure.

Fits are performed for three different PDF sets of the NNPDF3.1 family released by the NNPDF collaboration [7]: the LO set NNPDF31_lo_as_0130, the NLO set NNPDF31_nlo_as_0118, and the NNLO set NNPDF31_nnlo_as_0118. All fits use as baseline the hadronization parameters of the Monash tune [2]. Additionally, they use a range parameter for the colour reconnection probability equal to 2.17 and a matter overlap distribution modelled by a double Gaussian function, with the radius of the core and the fraction of matter in the core equal to 0.43 and 0.46, respectively. These parameter values were obtained from preliminary tuning attempts using the LO NNPDF3.1 set and, for consistency, were also used for the fits with the other PDF sets.

The reference energy used for the extrapolation is set to the collision energy of the data points considered in each fit. This translates into \(p_{\text {T0}} = p_{\text {T0}}^{\text {ref}}\) in Eq. 5. This choice reduces the number of fitted parameters at each energy and eliminates any energy dependence in each single fit. The parameters used in the Pythia 8 configuration are listed in Table 1.
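The configuration described above could be written as Pythia 8 settings strings along the following lines. This is a sketch, not the authors' run card: the mapping of the quoted values to the settings `MultipartonInteractions:coreRadius`, `MultipartonInteractions:coreFraction` and `ColourReconnection:range` is an assumption, the helper name is hypothetical, and the beam particle types (for example antiprotons for the CDF energies) are left out.

```python
def pythia_settings(sqrt_s_gev, pt0, pdf_set="NNPDF31_lo_as_0130"):
    """Build Pythia 8 settings strings for one fit point.

    The reference energy is set equal to the collision energy, so that
    pT0Ref is directly the pT0 used in the event generation.
    """
    return [
        "Beams:eCM = {:g}".format(sqrt_s_gev),
        "PDF:pSet = LHAPDF6:{}".format(pdf_set),
        "MultipartonInteractions:pT0Ref = {:g}".format(pt0),
        "MultipartonInteractions:ecmRef = {:g}".format(sqrt_s_gev),
        "MultipartonInteractions:bProfile = 2",  # double Gaussian overlap
        "MultipartonInteractions:coreRadius = 0.43",    # assumed mapping
        "MultipartonInteractions:coreFraction = 0.46",  # assumed mapping
        "ColourReconnection:range = 2.17",              # assumed mapping
    ]


if __name__ == "__main__":
    for line in pythia_settings(7000.0, 2.0):
        print(line)
```

Each string could then be passed to the generator through its `readString` method or collected in a command file; with the reference energy equal to the collision energy, the exponent of Eq. 5 has no effect on the generated events.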

Table 1 Values of the parameters used in the PYTHIA 8.226 MC event generator related to the overlap matter distribution function and colour reconnection probability. For the \(p_{\text {T0}}\) parameter, which is fitted in the tuning procedure, the considered range for the fits is indicated
Table 2 The values of the \(p_{\text {T0}}\) parameter obtained from the fits to underlying-event observables at the various energies. The uncertainty quoted for each \(p_{\text {T0}}\) represents the value obtained in the fit, when allowing an up/down variation of the \(\chi ^2\), equal to the absolute obtained \(\chi ^2\). Also indicated for each energy is the goodness of fit divided by the number of degrees of freedom
Fig. 1 Fit results of the \(p_{\text {T0}}\) values according to the energy extrapolation used by default in Pythia 8 (Eq. 5) for the three different PDF sets. The values of the obtained parameters as well as the goodness of fit are shown in the plot legend. The energy E is expressed in TeV and an energy reference of 7 TeV is used

Fits to the UE observables are performed by using both the Professor 1.4.0 [8] and RIVET 2.4.0 [9] software. About 30 different choices of the \(p_{\text {T0}}\) parameter are considered for building the set of anchor points in the one-dimensional parameter space. For each choice, two million events are generated, so that in each considered bin the statistical uncertainty of the MC predictions is smaller than the uncertainty of the experimental data. It has been checked that the bin-by-bin envelopes of the different MC predictions encompass the data points well. After running the different predictions, Professor interpolates the bin values of the considered observables as a function of \(p_{\text {T0}}\) with a third-order polynomial. We checked that the degree of the polynomial used for the interpolation does not influence the tune results. The obtained function \(f^{\text {b}}(p)\) describes the MC response of each bin b as a function of the parameter vector p, which here reduces to the single parameter \(p_{\text {T0}}\). The final step is the minimization of the \(\chi ^2\) function given by the formula:

$$\begin{aligned} \chi ^2(p_{\text {T0}})=\sum _{O}w_O\sum _{\text {b}\in O}\frac{\left( f^{\text {b}}(p_{\text {T0}})-R_{\text {b}}\right) ^2}{\Delta _{\text {b}}^2} \end{aligned}$$
(7)

where the outer sum runs over the observables O, each entering with a weight \(w_O\), \(R_{\text {b}}\) is the data value for each bin b and \(\Delta _{\text {b}}\) expresses the total bin uncertainty of the data. The experimental uncertainties are assumed to be uncorrelated between data points. The minimization procedure returns the parameter values that best describe the considered data.
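The structure of Eq. 7 can be sketched for the one-parameter case as follows. This is not the Professor implementation: the per-bin cubic responses are assumed to be already known, a simple golden-section search stands in for the actual minimizer, and all names and numbers are hypothetical.

```python
import math


def cubic(c0, c1, c2, c3):
    """Per-bin response f_b(p) as a third-order polynomial in the parameter p."""
    return lambda p: c0 + p * (c1 + p * (c2 + p * c3))


def chi2(pt0, observables):
    """Eq. 7 for uncorrelated bins.

    observables is a list of (weight, bins) pairs; bins is a list of
    (response, data_value, data_uncertainty) tuples.
    """
    total = 0.0
    for weight, bins in observables:
        for response, data, error in bins:
            total += weight * ((response(pt0) - data) / error) ** 2
    return total


def golden_section_minimum(func, lo, hi, tol=1e-8):
    """Locate the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


if __name__ == "__main__":
    # Synthetic example: two bins of one observable with linear responses.
    bins = [(cubic(0.0, 1.0, 0.0, 0.0), 2.0, 1.0),
            (cubic(0.0, 2.0, 0.0, 0.0), 4.0, 1.0)]
    observables = [(1.0, bins)]
    best = golden_section_minimum(lambda p: chi2(p, observables), 0.5, 4.0)
    print(best, chi2(best, observables))
```

The golden-section search assumes a single minimum within the scanned range, which is a reasonable expectation for a one-dimensional fit but should be verified by inspecting the \(\chi ^2\) curve.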

3 Results

The \(p_{\text {T0}}\) values obtained from the fits to the data at the various considered energies, as well as the corresponding goodness of fit, are shown in Table 2.

The different PDF sets show quite different values of \(p_{\text {T0}}\). In particular, the LO PDF set requires more rapidly changing values as a function of energy, than NLO and NNLO PDF sets. This is the impact of the different behaviour of the gluon distribution at small x values, which are more relevant for higher collision energies. In order to reproduce the UE observables, a larger gluon density prefers a smaller amount of MPI contributions (which translates into a larger \(p_{\text {T0}}\) value), while larger MPI contributions, i.e. smaller \(p_{\text {T0}}\) values, are needed for smaller gluon densities. The \(p_{\text {T0}}\) values obtained for the NLO and NNLO PDF sets are very similar to each other.

The phase space in terms of parton longitudinal momentum fraction relevant for MPI lies in the region \(10^{-5}< x < 10^{-3}\) at \(Q^2\) scales within a range of 10–100 \({\hbox {GeV}}^2\). In this phase space, gluons are the most relevant component of the proton content and the considered LO and higher-order gluon distributions are very different. In particular, the gluon densities of the LO NNPDF3.1 set are larger than those of the NLO and NNLO NNPDF3.1 sets by up to a factor of 2. This is the main origin of the different \(p_{\text {T0}}\) values obtained for the various tunes. Because the gluon distribution at small x is quite flat for the NLO and NNLO PDF sets, the corresponding \(p_{\text {T0}}\) values depend only very weakly on the energy. They range between 1.4 and 1.95 GeV in the energy range of 0.3–13 TeV.

The parameters obtained for the NLO- and NNLO-based tunes confirm what was stated by the Pythia 8 authors [10], that tunes based on higher-order PDF sets would require lower \(p_{\text {T0}}\) values in order to describe underlying-event observables than tunes based on LO PDF sets. Furthermore, very similar \(p_{\text {T0}}\) values were obtained by the tunes performed by the ATLAS collaboration [11], which used the NNPDF2.1 NLO and MSTW2008 NLO PDF sets.

The results show that, for all measurements and considered PDF sets, we are able to obtain a \(\chi ^2\)/Ndf value close to 1. The relative uncertainties obtained for the \(p_{\text {T0}}\) values are all of the order of 1–2%. The \(p_{\text {T0}}\) values constitute the input for the fits according to the functions in Eqs. 5 and 6. The results of the fitting procedure are shown in Figs. 1 and 2 for the two functions, respectively. Table 3 summarizes the parameters of the two functions, as well as the goodness of fit divided by the number of degrees of freedom.

Fig. 2 Fit results of the \(p_{\text {T0}}\) values according to the modified energy extrapolation (Eq. 6) for the three different PDF sets. The values of the obtained parameters as well as the goodness of fit are shown in the plot legend. The energy E is expressed in TeV and an energy reference of 7 TeV is used

Table 3 Summary of the obtained \(p_{\text {T0}}\) parameters for the two fitted functions based on LO, NLO and NNLO PDF sets. Also shown is the goodness of fit divided by the number of degrees of freedom. The energy E is expressed in TeV and an energy reference of 7 TeV is used
Table 4 Values of \(p_{\text {T0}}\) for \(\sqrt{s} = 100 \, \hbox {TeV}\), as predicted by the old fit (Eq. 5) and the new fit (Eq. 6)

Both functions are able to follow the trend of the \(p_{\text {T0}}\) values in the considered energy range. With Eq. 5, the low-energy points (300 and 900 GeV) are not well described, which results in relatively high \(\chi ^2\)/Ndf values, up to 2.97 for the NNLO PDF set. Including the additional term in the energy dependence significantly improves the behaviour at low energy, and \(\chi ^2\)/Ndf decreases to values close to 1.
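For orientation, the pure power law of Eq. 5 becomes a straight line in log-log space, so its two parameters can be estimated with an ordinary linear regression. The sketch below uses this simplification: it is unweighted, unlike the \(\chi ^2\) fits reported here, the function name is hypothetical, and the input points are synthetic values generated from a known power law rather than the measured \(p_{\text {T0}}\) values. Eq. 6 cannot be linearized in the same way because of the constant term and requires a non-linear fit.

```python
import math


def fit_power_law(energies, pt0_values, sqrt_s0):
    """Unweighted least-squares fit of log(pT0) versus log(sqrt_s / sqrt_s0).

    Returns (pt0_ref, epsilon) of Eq. 5.
    """
    xs = [math.log(e / sqrt_s0) for e in energies]
    ys = [math.log(p) for p in pt0_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    epsilon = sxy / sxx                              # slope in log-log space
    pt0_ref = math.exp(mean_y - epsilon * mean_x)    # intercept at sqrt_s0
    return pt0_ref, epsilon


if __name__ == "__main__":
    energies = [300.0, 900.0, 1960.0, 7000.0, 13000.0]  # GeV
    # Synthetic input generated from pT0_ref = 2.0 GeV and epsilon = 0.2.
    synthetic = [2.0 * (e / 7000.0) ** 0.2 for e in energies]
    print(fit_power_law(energies, synthetic, 7000.0))
```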

It has been checked that predictions obtained with the new tunes based on LO, NLO and NNLO PDF sets described in this paper are able to reproduce well inclusive minimum-bias and underlying-event observables, such as charged particle multiplicities and energy flow measured at central and forward rapidities [12,13,14,15] at various collision energies, and variables sensitive to colour reconnection effects [6, 16]. Additionally, observables sensitive to final-state contributions and to the modelling of hadronization effects measured at LEP [17] are also well reproduced. A level of agreement similar to the one achieved by the Monash tune is obtained.

Fig. 3 Predictions of the Pythia 8 tunes at \(\sqrt{s} = 100 \, \hbox {TeV}\), obtained with the various PDF sets and the \(p_{\text {T0}}\) obtained from Eqs. 5 or 6, are shown for the average charged-particle multiplicities (top plots) and the average transverse momentum sum (bottom plots) in the transMIN and transMAX regions, as a function of the transverse momentum of the leading charged particle (\(p_{\text {T}}^{\text {max}}\)). Curves labelled as “Old fit” refer to predictions using \(p_{\text {T0}}\) as predicted by Eq. 5, while curves labelled as “New fit” use the \(p_{\text {T0}}\) value as predicted by Eq. 6. Below each panel, the ratios of all predictions to the ones obtained with the LO PDF set and \(p_{\text {T0}}\) from Eq. 5 are displayed

Furthermore, it has been checked that the modified energy dependence interpolates well, i.e. it gives reliable predictions of \(p_{\text {T0}}\) values within the fitted range, for instance at \(\sqrt{s} = 2.76 \, \hbox {TeV}\). Pythia 8 predictions obtained with the \(p_{\text {T0}}\) value at \(\sqrt{s} = 2.76 \, \hbox {TeV}\) from Eq. 6 and the parameters of Table 1 show a very good level of agreement with the UE data measured by CMS at that energy [18]. The modified energy dependence function can also be reliably used to extrapolate \(p_{\text {T0}}\) values to energies lower than 300 GeV and higher than 13 TeV, and constitutes an alternative to the default Pythia 8 energy extrapolation.

4 Predictions at \(\sqrt{s} = 100 \, \hbox {TeV}\)

The newly extracted energy dependence can be used to extrapolate \(p_{\text {T0}}\) values to high energies. In this section, we show results for \(\sqrt{s} = 100 \, \hbox {TeV}\). Note that in the phase space relevant for such an energy, i.e. the gluon distribution at small x values, the current PDF sets are not constrained by any data but only extrapolated from measurements at lower energies. In Table 4, the \(p_{\text {T0}}\) values for the different PDF sets are listed as predicted by Eq. 5, referred to as “old fit”, and by Eq. 6, referred to as “new fit”. While only a very small difference is observed between the old and new fits for the NLO and NNLO PDF sets, the two \(p_{\text {T0}}\) values for the LO PDF set differ from each other. This is because tunes using a NLO or a NNLO PDF set prefer a very weak \(p_{\text {T0}}\) energy dependence in order to describe measurements at various collision energies. Tunes using a LO PDF set instead need a \(p_{\text {T0}}\) that increases more rapidly with energy, and this increase is predicted differently by the old and the new fit.

Figure 3 shows predictions obtained with the various PDF sets for the average charged-particle multiplicity and the average charged-particle transverse momentum sum in the transMIN and transMAX regions, as a function of the transverse momentum of the leading charged particle (\(p_{\text {T}}^{\text {max}}\)), at \(\sqrt{s} = 100 \, \hbox {TeV}\). While predictions obtained with the tunes based on NLO and NNLO PDF sets are very similar to each other, independently of the considered energy extrapolation, predictions from the LO tunes differ from each other by up to 10%. In particular, the new fit predicts a higher \(p_{\text {T0}}\) value, and consequently a lower activity in terms of number of charged particles and of transverse momentum. Predictions obtained with NLO and NNLO PDF sets are significantly lower than predictions obtained with LO PDF sets, by less than 10% if the new fit is used and by up to 20% if the energy extrapolation is carried out with the old fit. Performing such measurements at a high collision energy, e.g. 100 TeV, may make it possible to validate the performance of the energy extrapolation functions considered in this document.

5 Summary and conclusions

The energy dependence of the \(p_{\text {T0}}\) parameter in the Pythia 8 Monte Carlo event generator has been investigated. From observables sensitive to multiparton interactions at low and semi-hard scales at various collision energies, we find that the inclusion of an additional term in the simple power-law function implemented in Pythia 8 significantly improves the description of the \(p_{\text {T0}}\) values obtained at collision energies of 300 and 900 GeV, inferred from measurements performed by the CDF experiment. This conclusion holds for Pythia 8 predictions using parton distribution functions determined at leading, next-to-leading, or next-to-next-to-leading order in the strong coupling for the underlying-event simulation. The additional term is found to be very similar for all parton densities. The modified energy dependence function can be reliably used to extrapolate \(p_{\text {T0}}\) values to energies lower than 300 GeV and higher than 13 TeV, and constitutes a valuable alternative to the default Pythia 8 energy extrapolation.