1 Introduction

Monte Carlo (MC) event generators of hadron–hadron collisions based on perturbative quantum chromodynamics (QCD) contain several components. The “hard-scattering” part of the event consists of particles resulting from the hadronization of the two partons (jets) produced in the hardest scattering, and in their associated hard initial- and final-state radiation (ISR and FSR). The underlying event (UE) consists of particles from the hadronization of beam-beam remnants (BBR), of multiple-parton interactions (MPI), and their associated ISR and FSR. The BBR include hadrons from the fragmentation of spectator partons that do not exchange any appreciable transverse momentum (\(p_{\mathrm {T}}\)) in the collision. The MPI are additional 2-to-2 parton-parton scatterings that occur within the same hadron–hadron collision, and are softer in transverse momentum (\(p_{\mathrm {T}} \lesssim 3\,\text {GeV} \)) than the hard scattering.

The perturbative 2-to-2 parton-parton differential cross section diverges like \(1/\hat{p}_\mathrm{T}^4\), where \(\hat{p}_\mathrm{T}\) is the transverse momentum of the outgoing partons in the parton-parton center-of-mass (c.m.) frame. Usually, QCD MC models such as pythia [1–5] regulate this divergence by including a smooth phenomenological cutoff \(p_\mathrm{T0}\) as follows:

$$\begin{aligned} 1/\hat{p}_\mathrm{T}^4\rightarrow 1/(\hat{p}_\mathrm{T}^2+p_\mathrm{T0}^2)^2. \end{aligned}$$
(1)

This formula approaches the perturbative result for large scales and is finite as \(\hat{p}_\mathrm{T}\rightarrow 0\). The divergence of the strong coupling \(\alpha _\mathrm{s}\) at low \(\hat{p}_\mathrm{T}\) is also regulated through Eq. (1). The primary hard 2-to-2 parton-parton scattering process and the MPI are regulated in the same way through a single \(p_\mathrm{T0}\) parameter. However, this cutoff is expected to have a dependence on the center-of-mass energy of the hadron–hadron collision \(\sqrt{s}\). In the pythia MC event generator this energy dependence is parametrized with a power-law function with exponent \(\epsilon \):

$$\begin{aligned} p_\mathrm{T0}(\sqrt{s})= p_\mathrm{T0}^\mathrm{ref} \, (\sqrt{s}/\sqrt{s_0})^\epsilon , \end{aligned}$$
(2)

where \(\sqrt{s_0}\) is a given reference energy and \(p_\mathrm{T0}^\mathrm{ref}\) is the value of \(p_\mathrm{T0}\) at \(\sqrt{s_0}\). At a given \(\sqrt{s}\), the amount of MPI depends on \(p_\mathrm{T0}\), the parton distribution functions (PDF), and the overlap of the matter distributions (or centrality) of the two colliding hadrons. Smaller values of \(p_\mathrm{T0}\) provide more MPI due to a larger MPI cross section. Table 1 shows the parameters in pythia6 [1] and pythia8 [5] that, together with the selected PDF, determine the energy dependence of MPI. Recently, in herwig++ [6, 7] the same formula has been adopted to provide an energy dependence to their MPI cutoff, which is also shown in Table 1. The QCD MC generators have other parameters that can be adjusted to control the modelling of the properties of the events, and a specified set of such parameters adjusted to fit certain prescribed aspects of the data is referred to as a “tune” [8–10].
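The behavior of Eqs. (1) and (2) can be illustrated numerically. The following is a minimal sketch, not the generator implementation; the function names and the input values (`PT0_REF`, `SQRT_S0`, `EPSILON`) are placeholders chosen for illustration and are not the tuned values of Table 1.

```python
def regulated_weight(pt_hat, pt0):
    """Regulated factor 1/(pT^2 + pT0^2)^2 of Eq. (1), in GeV^-4."""
    return 1.0 / (pt_hat**2 + pt0**2) ** 2


def pt0_at_energy(sqrt_s, pt0_ref, sqrt_s0, epsilon):
    """Power-law energy dependence of the cutoff, Eq. (2)."""
    return pt0_ref * (sqrt_s / sqrt_s0) ** epsilon


# Placeholder inputs for illustration only (NOT values from Table 1).
PT0_REF = 2.0     # GeV, cutoff at the reference energy
SQRT_S0 = 1800.0  # GeV, reference energy
EPSILON = 0.2     # power-law exponent

# The cutoff grows with sqrt(s) for a positive exponent.
pt0_7tev = pt0_at_energy(7000.0, PT0_REF, SQRT_S0, EPSILON)

# The regulated factor stays finite at pT -> 0 and approaches
# the perturbative 1/pT^4 behavior at large pT.
finite_at_zero = regulated_weight(0.0, PT0_REF)
ratio_at_high_pt = regulated_weight(100.0, PT0_REF) * 100.0**4
```

With a positive exponent, a larger \(\sqrt{s}\) gives a larger cutoff, and the ratio to the unregulated \(1/\hat{p}_\mathrm{T}^4\) form tends to unity once \(\hat{p}_\mathrm{T}\) is much larger than \(p_\mathrm{T0}\).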

Table 1 Parameters in pythia6 [1], pythia8 [5], and herwig++ [6, 7] MC event generators that, together with some chosen PDF, determine the energy dependence of MPI

In addition to hard-scattering processes, other processes contribute to the inelastic cross section in hadron–hadron collisions: single-diffraction dissociation (SD), double-diffraction dissociation (DD), and central-diffraction (CD). In SD and DD events, one or both beam particles are excited into high-mass color-singlet states (i.e. into some resonant \({\mathrm {N}}^*\)), which then decay. The SD and DD processes correspond to color-singlet exchanges between the beam hadrons, while CD corresponds to double color-singlet exchange with a diffractive system produced centrally. For non-diffractive processes (ND), color is exchanged, the outgoing remnants are no longer color singlets, and this separation of color generates a multitude of quark–antiquark pairs that are created via vacuum polarization. The sum of all components except SD corresponds to non single-diffraction (NSD) processes.

Minimum bias (MB) is a generic term that refers to events selected by requiring minimal activity within the detector. This selection accepts a large fraction of the overall inelastic cross section. Studies of the UE are often based on MB data, but it should be noted that the dominant particle production mechanisms in MB collisions and in the UE are not exactly the same. On the one hand, the UE is studied in collisions in which a hard 2-to-2 parton-parton scattering has occurred, by analyzing the hadronic activity in different regions of the event relative to the back-to-back azimuthal structure of the hardest particles emitted [11]. On the other hand, MB collisions are often softer and include diffractive interactions that, in the case of pythia, are modelled via a Regge-based approach [12].

The MPI are usually much softer than primary hard scatters; however, occasionally two hard 2-to-2 parton scatters can take place within the same hadron–hadron collision. This is referred to as double-parton scattering (DPS) [13–16], and is typically described in terms of an effective cross section parameter, \(\sigma _\mathrm{eff}\), defined as:

$$\begin{aligned} \sigma _\mathrm{AB} = \frac{\sigma _\mathrm{A} \sigma _\mathrm{B}}{\sigma _\mathrm{eff}}, \end{aligned}$$
(3)

where \(\sigma _\mathrm{A}\) and \(\sigma _\mathrm{B}\) are the inclusive cross sections for individual hard scattering processes of generic type A and B, respectively, and \(\sigma _\mathrm{AB}\) is the cross section for producing both scatters in the same hadron–hadron collision. If A and B are indistinguishable, as in four-jet production, a statistical factor of 1/2 must be inserted on the right-hand side of Eq. (3). Furthermore, \(\sigma _\mathrm{eff}\) is assumed to be independent of A and B. However, \(\sigma _\mathrm{eff}\) is not a directly observed quantity, but can be calculated from the overlap function of the two transverse profile distributions of the colliding hadrons, as implemented in any given MPI model.
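As a small worked illustration of Eq. (3), including the factor of 1/2 for indistinguishable processes, one may write the relation and its inverse as below. This is a sketch under the stated assumptions; the function names are hypothetical, and all cross sections are assumed to be given in the same units.

```python
def dps_cross_section(sigma_a, sigma_b, sigma_eff, indistinguishable=False):
    """Eq. (3): sigma_AB = (m) * sigma_A * sigma_B / sigma_eff.

    m = 1/2 if A and B are indistinguishable (e.g. four-jet production),
    otherwise m = 1. All inputs must share the same units.
    """
    if sigma_eff <= 0.0:
        raise ValueError("sigma_eff must be positive")
    factor = 0.5 if indistinguishable else 1.0
    return factor * sigma_a * sigma_b / sigma_eff


def sigma_eff_from_dps(sigma_a, sigma_b, sigma_ab, indistinguishable=False):
    """Invert Eq. (3) to express sigma_eff through sigma_A, sigma_B, sigma_AB."""
    if sigma_ab <= 0.0:
        raise ValueError("sigma_AB must be positive")
    factor = 0.5 if indistinguishable else 1.0
    return factor * sigma_a * sigma_b / sigma_ab
```

A larger \(\sigma _\mathrm{eff}\) therefore corresponds to a smaller DPS cross section for fixed \(\sigma _\mathrm{A}\) and \(\sigma _\mathrm{B}\).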

The UE tunes have an impact on both soft and hard particle production in a given pp collision. First, about half of the particles produced in a MB collision originate from the hadronization of partons scattered in MPI, and have their differential cross sections in \(p_{\mathrm {T}}\) regulated via Eq. (1), using the same \(p_\mathrm{T0}\) cutoff used to tame the hardest 2-to-2 parton-parton scattering in the event. The tuning of the cross-section regularization therefore affects all (soft and hard) parton-parton scatterings and provides a prediction for the behavior of the ND cross section. Second, the UE tunes parametrize the distribution in the transverse overlap of the colliding protons and thereby the probability of two hard parton-parton scatters that is then used to estimate DPS-sensitive observables.

In this paper, we study the \(\sqrt{s}\) dependence of the UE using recent CDF proton–antiproton data from the Fermilab Tevatron at 0.3, 0.9, and \(1.96\,\text {TeV} \) [11], together with CMS pp data from the CERN LHC at \(\sqrt{s} = 7\,\text {TeV} \) [17]. The 0.3 and \(0.9\,\text {TeV} \) data are from the “Tevatron energy scan” performed just before the Tevatron was shut down. Using the rivet (version 1.9.0) and professor (version 1.3.3) frameworks [18, 19], we construct: (i) new pythia8 (version 8.185) UE tunes using several PDF sets (CTEQ6L1 [20], HERAPDF\(1.5\)LO [21], and NNPDF2.3LO [22, 23]), (ii) new pythia6 (version 6.327) UE tunes (using CTEQ6L1 and HERAPDF\(1.5\)LO), and (iii) a new herwig++ (version 2.7.0) UE tune for CTEQ6L1. The rivet software is a tool for producing predictions of physics quantities obtained from MC event generators. It is used for generating sets of MC predictions with a different choice of parameters related to the UE simulation. The predictions are then included in the professor framework, which parametrizes the generator response and returns the set of tuned parameters that best fits the input measurements.

In addition, we construct several new CMS “DPS tunes” and investigate whether the values of the UE parameters determined from fitting the UE observables in a hard-scattering process are consistent with the values determined from fitting DPS-sensitive observables. The professor software also offers the possibility of extracting “eigentunes”, which provide an estimate of the uncertainties in the fitted parameters. The eigentunes consist of a collection of additional tunes, obtained through the covariance matrix of the data-theory fitting procedure, to determine independent directions in parameter space that provide a specific modification in the goodness of the fit, \(\chi ^2\) (Sect. 2). All of the CMS UE and DPS tunes are provided with eigentunes. In Sect. 4, predictions using the CMS UE tunes are compared to other UE measurements not used in determining the tunes, and we examine how well Drell–Yan, MB, and multijet observables can be predicted using the UE tunes. In Sect. 5, predictions of the new tunes are shown for UE observables at \(13\,\text {TeV} \), together with a comparison to the first MB distribution measured. Section 6 has a brief summary and conclusions. The appendices contain additional comparisons between the pythia6 and herwig++ UE tunes and the data, information about the tune uncertainties, and predictions for some MB and DPS observables at 13\(\,\text {TeV}\).

2 The CMS UE tunes

Previous UE studies have used the charged-particle jet with largest \(p_{\mathrm {T}}\) [24, 25] or a \(\mathrm{Z} \) boson [11, 26] as the leading (i.e. highest \(p_{\mathrm {T}}\)) objects in the event. The CDF and CMS data, used for the tunes, select the charged particle with largest \(p_{\mathrm {T}}\) in the event (\(p_\mathrm{T}^\mathrm{max}\)) as the “leading object”, and use just the charged particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta | < 0.8\) to characterize the UE.

On an event-by-event basis, the leading object is used to define regions of pseudorapidity-azimuth (\(\eta \)-\(\phi \)) space. The “toward” region relative to this direction, as indicated in Fig. 1, is defined by \(|\varDelta \phi |<\pi /3\) and \(|\eta | < 0.8\), and the “away” region by \(|\varDelta \phi |>2\pi /3\) and \(|\eta | < 0.8\). The charged-particle and the scalar-\(p_{\mathrm {T}}\) sum densities in the transverse region are calculated as the sum of the contribution in the two regions: “Transverse-1” (\(\pi /3<\varDelta \phi <2\pi /3\), \(|\eta | < 0.8\)) and “Transverse-2” (\(\pi /3<-\varDelta \phi <2\pi /3\), \(|\eta | < 0.8\)), divided by the area in \(\eta \)-\(\phi \) space, \(\varDelta \eta \varDelta \phi = 1.6\times 2\pi /3\). The transverse region is further separated into the “TransMAX” and “TransMIN” regions, also shown in Fig. 1. This defines on an event-by-event basis the regions with more (TransMAX) and fewer (TransMIN) charged particles (\({\mathrm {N}}_\mathrm{ch}\)), or greater (TransMAX) or smaller (TransMIN) scalar-\(p_{\mathrm {T}}\) sums (\(p_\mathrm{T}^\mathrm{sum}\)). The UE particle and \(p_{\mathrm {T}}\) densities are constructed by dividing by the area in \(\eta \)-\(\phi \) space, where the TransMAX and TransMIN regions each have an area of \(\varDelta \eta \varDelta \phi = 1.6\times 2\pi /6\). The transverse density (also referred to as “TransAVE”) is the average of the TransMAX and the TransMIN densities. For events with hard initial- or final-state radiation, the TransMAX region often contains a third jet, but both the TransMAX and TransMIN regions receive contributions from the MPI and beam-beam remnant components. The TransMIN region is very sensitive to the MPI and beam-beam remnant components of the UE, while “TransDIF” (the difference between TransMAX and TransMIN densities) is very sensitive to ISR and FSR [27].
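The region definitions above can be sketched in a few lines of code. This is an illustrative sketch only, not the rivet analysis used for the tunes; the function names and the `(pt, eta, phi)` tuple layout are assumptions, while the cuts and areas are those quoted in the text.

```python
import math

ETA_MAX = 0.8  # |eta| acceptance quoted in the text
PT_MIN = 0.5   # GeV, charged-particle threshold quoted in the text


def wrap_dphi(phi, phi_lead):
    """Signed azimuthal difference to the leading object, in (-pi, pi]."""
    dphi = (phi - phi_lead) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi


def region(dphi):
    """Classify a wrapped delta-phi into the regions of Fig. 1."""
    if abs(dphi) < math.pi / 3.0:
        return "toward"
    if abs(dphi) > 2.0 * math.pi / 3.0:
        return "away"
    return "transverse-1" if dphi > 0 else "transverse-2"


def transverse_densities(particles, phi_lead):
    """Per-event transverse densities from (pt, eta, phi) charged particles."""
    nch = {"transverse-1": 0, "transverse-2": 0}
    ptsum = {"transverse-1": 0.0, "transverse-2": 0.0}
    for pt, eta, phi in particles:
        if pt <= PT_MIN or abs(eta) >= ETA_MAX:
            continue  # outside the acceptance used to characterize the UE
        reg = region(wrap_dphi(phi, phi_lead))
        if reg in nch:
            nch[reg] += 1
            ptsum[reg] += pt
    # Each transverse half covers 1.6 x 2pi/6 in eta-phi space.
    area = 2.0 * ETA_MAX * math.pi / 3.0
    out = {}
    for name, vals in (("nch", nch), ("ptsum", ptsum)):
        hi = max(vals.values()) / area  # TransMAX density
        lo = min(vals.values()) / area  # TransMIN density
        out[name] = {
            "TransMAX": hi,
            "TransMIN": lo,
            "TransAVE": 0.5 * (hi + lo),
            "TransDIF": hi - lo,
        }
    return out
```

In this sketch TransMAX and TransMIN are assigned separately for the particle and the scalar-\(p_{\mathrm {T}}\) sum densities, following the definition given above; the measured observables are averages of these per-event quantities in bins of \(p_\mathrm{T}^\mathrm{max}\).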

The new UE tunes are determined by fitting UE observables, and using only those parameters that are most sensitive to the UE data. Since it is not possible to tune all parameters of a MC event generator at once, the parameters that affect, for example, the parton shower, the fragmentation, and the intrinsic-parton \(p_{\mathrm {T}}\) are fixed to the values given by an initially established reference tune. The initial reference tunes used for pythia8 are Tune 4C [28] and the Monash Tune [29]. For pythia6, the reference tune is Tune Z2*lep [25], and for herwig++ it is Tune UE-EE-5C [30].

Fig. 1

Left Illustration of the azimuthal regions in an event defined by the \(\varDelta \phi \) angle relative to the direction of the leading object [11]. Right Illustration of the topology of a hadron–hadron collision in which a hard parton–parton collision has occurred, and the leading object is taken to be the charged particle of largest \(p_{\mathrm {T}}\) in the event, \(p_\mathrm{T}^\mathrm{max}\)

2.1 The PYTHIA8 UE tunes

Taking as the reference tune the set of parameters of pythia8 Tune \(4\)C [28], we construct two new UE tunes, one using CTEQ6L1 (CUETP\(8\)S\(1\)-CTEQ\(6\)L1) and one using HERAPDF\(1.5\)LO (CUETP\(8\)S\(1\)-HERAPDF1.5LO). CUET (read as “cute”) stands for “CMS UE tune”, and P8S1 stands for pythia8 “Set 1”.

The tunes are extracted by varying the four parameters in Table 2 in fits to the TransMAX and TransMIN charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities at three energies, for \( \mathrm {p}\overline{\mathrm{p}} \) collisions at \(\sqrt{s} = 0.9\) and 1.96, and \( \mathrm {p}\mathrm {p}\) collisions at \(7\,\text {TeV} \). The measurements of TransAVE and TransDIF densities are not included in the fit, since they can be constructed from TransMAX and TransMIN. The new tunes use an exponentially-falling matter-overlap function between the two colliding protons of the form \(\exp (-b^\mathrm{expPow})\), with b being the impact parameter of the collision. The parameters that are varied are expPow, the MPI energy-dependence parameters (Table 1) and the range, i.e. the probability, of color reconnection (CR). A small (large) value of the final-state CR parameter tends to increase (reduce) the final particle multiplicities. In pythia8, unlike in pythia6, only one parameter determines the amount of CR, which includes a \(p_{\mathrm {T}}\) dependence, as defined in Ref. [5].

The generated inelastic events include ND and diffractive (DD\(+\)SD\(+\)CD) contributions, although the UE observables used to determine the tunes are sensitive to single-diffraction dissociation, central-diffraction, and double-diffraction dissociation only at very small \(p_\mathrm{T}^\mathrm{max}\) values (e.g. \(p_\mathrm{T}^\mathrm{max}<1.5\,\text {GeV} \)). The ND component dominates for \(p_\mathrm{T}^\mathrm{max}\) values greater than \({\approx } 2.0\,\text {GeV} \), since the cross section of the diffractive components rapidly decreases as a function of \(\hat{p}_\mathrm{T}\). The fit is performed by minimizing the \(\chi ^2\) function:

$$\begin{aligned} \chi ^2(p)=\sum _{i}\frac{(f^{i}(p)-R_{i})^2}{\varDelta _{i}^2}, \end{aligned}$$
(4)

where the sum runs over each bin i of every observable. The \(f^{i}(p)\) functions correspond to the interpolated MC response for the simulated observables as a function of the parameter vector p, \(R_i\) is the value of the measured observable in bin i, and \(\varDelta _i\) is the total experimental uncertainty of \(R_i\). We do not use the Tevatron data at \(\sqrt{s}=300\,\text {GeV} \), as we are unable to obtain an acceptable \(\chi ^2\) in a fit of the four parameters in Table 2. The \(\chi ^2\) per degree of freedom (dof) listed in Table 2 refers to the quantity \(\chi ^2(p)\) in Eq. (4), divided by the number of dof in the fit. The eigentunes (Appendix A) correspond to the tunes in which the change in the \(\chi ^2\) (\(\varDelta \chi ^2\)) of the fit relative to the best-fit value equals the \(\chi ^2\) value obtained in the tune, i.e. \(\varDelta \chi ^2 = \chi ^2\). For both tunes in Table 2, the fit quality is very good, with \(\chi ^2\)/dof values very close to 1.
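The goodness-of-fit function of Eq. (4) is straightforward to evaluate once the interpolated MC response is available. The sketch below is not the professor implementation; the function names are hypothetical, the MC response is passed in as precomputed bin values, and the number of dof is assumed to be the number of bins minus the number of fitted parameters.

```python
def chi2(predicted, measured, uncertainties):
    """Eq. (4): sum over bins of (f_i - R_i)^2 / Delta_i^2."""
    if not (len(predicted) == len(measured) == len(uncertainties)):
        raise ValueError("all inputs must have one entry per bin")
    return sum(
        ((f - r) / d) ** 2
        for f, r, d in zip(predicted, measured, uncertainties)
    )


def chi2_per_dof(chi2_value, n_bins, n_params):
    """chi2 divided by the number of dof (assumed: bins minus fitted parameters)."""
    dof = n_bins - n_params
    if dof <= 0:
        raise ValueError("need more bins than fitted parameters")
    return chi2_value / dof
```

In an actual tune, the `predicted` values would be re-evaluated from the parametrized generator response at each trial parameter vector p during the minimization.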

The contribution from CR changes in the two new tunes; it is large for the HERAPDF1.5LO and small for the CTEQ6L1 PDF. This is a result of the shape of the parton densities at small fractional momenta x, which is different for the two PDF sets. While the parameter \(p_\mathrm{T0}^\mathrm{ref}\) in Eq. (2) stays relatively constant between Tune \(4\)C and the new tunes, the energy dependence \(\epsilon \) tends to increase in the new tunes, as do the matter-overlap profile functions.

Table 2 The pythia8 parameters, tuning range, Tune \(4\)C values [28], and best-fit values for CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)S\(1\)-HERAPDF1.5LO, obtained from fits to the TransMAX and TransMIN charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities, as defined by the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\) at \(\sqrt{s} = 0.9\), 1.96, and \(7\,\text {TeV} \). The \(\sqrt{s}=300\,\text {GeV} \) data are excluded from the fit

The pythia8 Monash Tune [29] combines updated fragmentation parameters with the NNPDF2.3LO PDF.

The NNPDF2.3LO PDF has a gluon distribution at small x that differs from those of CTEQ6L1 and HERAPDF\(1.5\)LO, and this affects predictions in the forward region of hadron–hadron collisions. Tunes using the NNPDF2.3LO PDF provide a more consistent description of the UE and MB observables in both the central and forward regions than tunes using other PDF.

A new pythia8 tune CUETP\(8\)M\(1\) (labeled with M for Monash) is constructed using the parameters of the Monash Tune and fitting the two MPI energy-dependence parameters of Table 1 to UE data at \(\sqrt{s} = 0.9\), 1.96, and \(7\,\text {TeV} \). Varying the CR range and the exponential slope of the matter-overlap function freely in the minimization of the \(\chi ^2\) leads to suboptimal best-fit values. The CR range is therefore fixed to the value of the Monash Tune, and the exponential slope of the matter-overlap function expPow is set to 1.6, which is similar to the value determined in CUETP8S1-CTEQ6L1. The best-fit values of the two tuned parameters are shown in Table 3. Again, we exclude the \(300\,\text {GeV} \) data, since we are unable to get a good \(\chi ^2\) in the fit. The parameters obtained for CUETP\(8\)M\(1\) differ slightly from the ones of the Monash Tune. The obtained energy-dependence parameter \(\epsilon \) is larger, while a very similar value is obtained for \(p_\mathrm{T0}^\mathrm{ref}\).

Table 3 The pythia8 parameters, tuning range, Monash values [29], and best-fit values for CUETP\(8\)M\(1\), obtained from fits to the TransMAX and TransMIN charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities, as defined by the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\) at \(\sqrt{s} = 0.9\), 1.96, and \(7\,\text {TeV} \). The \(\sqrt{s}=300\,\text {GeV} \) data are excluded from the fit

Figures 2, 3, 4 and 5 show the CDF data at 0.3, 0.9, and \(1.96\,\text {TeV} \), and the CMS data at \(7\,\text {TeV} \) for charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities in the TransMIN and TransMAX regions as a function of \(p_\mathrm{T}^\mathrm{max}\), compared to predictions obtained with the pythia8 Tune \(4\)C and with the new CMS tunes: CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\). Predictions from the new tunes cannot reproduce the \(\sqrt{s}=300\,\text {GeV} \) data, but describe very well the data at the higher \(\sqrt{s} = 0.9\), 1.96, and \(7\,\text {TeV} \). In particular, the description provided by the new tunes significantly improves relative to the old Tune \(4\)C, which is likely due to the better choice of parameters used in the MPI energy dependence and the extraction of the CR in the retuning.

Fig. 2

CDF data at \(\sqrt{s}=300\,\text {GeV} \) [11] on particle (top) and \(p_\mathrm{T}^\mathrm{sum}\) densities (bottom) for charged particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (left) and TransMAX (right) regions as defined by the leading charged particle, as a function of the transverse momentum of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\). The data are compared to pythia8 Tune \(4\)C, CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\). The ratios of MC events to data are given below each panel. The data at \(\sqrt{s}=300\,\text {GeV} \) are not used in determining these tunes. The green bands in the ratios represent the total experimental uncertainties

Fig. 3

CDF data at \(\sqrt{s}=900\,\text {GeV} \) [11] on particle (top) and \(p_\mathrm{T}^\mathrm{sum}\) densities (bottom) for charged particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (left) and TransMAX (right) regions as defined by the leading charged particle, as a function of the transverse momentum of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\). The data are compared to pythia8 Tune \(4\)C, CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\). The ratios of MC events to data are given below each panel. The green bands in the ratios represent the total experimental uncertainties

Fig. 4

CDF data at \(\sqrt{s}=1.96\,\text {TeV} \) [11] on particle (top) and \(p_\mathrm{T}^\mathrm{sum}\) densities (bottom) for charged particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (left) and TransMAX (right) regions as defined by the leading charged particle, as a function of the transverse momentum of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\). The data are compared to pythia8 Tune \(4\)C, CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\). The ratios of MC events to data are given below each panel. The green bands in the ratios represent the total experimental uncertainties

Fig. 5

CMS data at \(\sqrt{s}=7\,\text {TeV} \) [17] on particle (top) and \(p_\mathrm{T}^\mathrm{sum}\) densities (bottom) for charged particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (left) and TransMAX (right) regions as defined by the leading charged particle, as a function of the transverse momentum of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\). The data are compared to pythia8 Tune \(4\)C, CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\). The ratios of MC events to data are given below each panel. The green bands in the ratios represent the total experimental uncertainties

2.2 The PYTHIA6 UE tunes

The pythia6 Tune Z\(2^*\)lep [25] uses the improved fragmentation parameters from fits to the LEP e\(^+\)e\(^-\) data [31], and a double-Gaussian matter profile for the colliding protons, but corresponds to an outdated CMS UE tune. It was constructed by fitting the CMS charged-particle jet UE data at 0.9 and \(7\,\text {TeV} \) [24] using data on the TransAVE charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities, since data on TransMAX, TransMIN, and TransDIF were not available at that time.

Starting with Tune Z\(2^*\)lep parameters, two new pythia6 UE tunes are constructed, one using CTEQ6L1 (CUETP\(6\)S\(1\)-CTEQ\(6\)L1) and one using HERAPDF\(1.5\)LO (CUETP\(6\)S\(1\)-HERAPDF1.5LO), with P6S1 standing for pythia6 “Set 1”. The tunes are constructed by fitting the five parameters shown in Table 4 to the TransMAX and TransMIN charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities at \(\sqrt{s} = 0.3\), 0.9, 1.96, and \(7\,\text {TeV} \). In addition to varying the MPI energy-dependence parameters (Table 1), we also vary the core-matter fraction PARP(83), which parametrizes the amount of matter contained within the radius of the proton core, the CR strength PARP(78), and the CR suppression PARP(77). The PARP(78) parameter reflects the probability for a given string to retain its color history, and therefore does not change the color and other string pieces, while the PARP(77) parameter introduces a \(p_{\mathrm {T}}\) dependence on the CR probability [1].

Inelastic events (ND\(+\)DD\(+\)SD\(+\)CD) are generated with pythia6. The best-fit values of the five parameters are shown in Table 4. The matter-core fraction is quite different in the two new pythia6 tunes. This is due to the fact that this parameter is very sensitive to the behaviour of the PDF at small x. Predictions obtained with pythia6 Tune Z\(2^*\)lep, CUETP6S1-CTEQ6L1 and CUETP6S1-HERAPDF1.5LO are compared in Appendix B to the UE data. The new pythia6 tunes significantly improve the description of the UE data relative to pythia6 Tune Z\(2^*\)lep at all considered energies, due to the better choice of parameters governing the MPI energy dependence.

Table 4 The pythia6 parameters, tuning range, Tune Z\(2^*\)lep values [31], and best-fit values for CUETP\(6\)S\(1\)-CTEQ\(6\)L1 and CUETP\(6\)S\(1\)-HERAPDF1.5LO, obtained from fits to the TransMAX and TransMIN charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities as defined by the \(p_\mathrm{T}^\mathrm{max}\) of the leading charged particle at \(\sqrt{s} = 0.3\), 0.9, 1.96, and \(7\,\text {TeV} \)

2.3 The HERWIG++ UE tunes

Starting with the parameters of herwig++ Tune UE-EE-5C [30], we construct a new herwig++ UE tune, CUETHppS\(1\), where Hpp stands for herwig++. This tune is obtained by varying the four parameters shown in Table 5 in the fit to TransMAX and TransMIN charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities at the four \(\sqrt{s} = 0.3\), 0.9, 1.96, and \(7\,\text {TeV} \). We set the MPI cutoff \(p_\mathrm{T0}\) and the reference energy \(\sqrt{s_0}\) to the Tune UE-EE-5C values, and vary the MPI c.m. energy extrapolation parameter in Table 1. We also vary the inverse radius that determines the matter overlap and the range of CR. The CR model in herwig++ is defined by two parameters, one (colourDisrupt) ruling the color structure of soft interactions (\(p_{\mathrm {T}}\) \(<\) \(p_\mathrm{T0}\)), and one (ReconnectionProbability) giving the probability of CR without a \(p_{\mathrm {T}}\) dependence for color strings. We include all four center-of-mass energies, although at each energy we exclude the first two \(p_\mathrm{T}^\mathrm{max}\) bins. These first bins, e.g. for \(p_\mathrm{T}^\mathrm{max}<1.5\,\text {GeV} \), are sensitive to single-diffraction dissociation, central-diffraction, and double-diffraction dissociation, but herwig++ contains only the ND component.

In Table 5, the parameters of the new CUETHppS1 are listed and compared to those from Tune UE-EE-5C. The parameters of the two tunes are very similar. The \(\chi ^2\)/dof, also indicated in Table 5, is found to be \({\approx } 0.46\), which is smaller than the value obtained for other CMS UE tunes. This is due to the fact that the first two bins as a function of \(p_\mathrm{T}^\mathrm{max}\), which have much smaller statistical uncertainties than the higher-\(p_\mathrm{T}^\mathrm{max}\) bins, are excluded from the fit because they cannot be described by any reasonable fit-values. In Appendix C, predictions obtained with herwig++ Tune UE-EE-5C and CUETHppS1 are compared to the UE data. The two tunes are both able to reproduce the UE data at all energies. With the new CUETHppS1 tune, uncertainties can be estimated using the eigentunes (Appendix A).

Table 5 The herwig++ parameters, tuning range, Tune UE-EE-5C values [30], and best-fit values for CUETHppS\(1\), obtained from a fit to the TransMAX and TransMIN charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities as a function of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\) at \(\sqrt{s} = 0.3\), 0.9, 1.96, and \(7\,\text {TeV} \)

In conclusion, both herwig++ tunes, as well as the new CMS pythia6 UE tunes, reproduce the UE data at all four \(\sqrt{s}\). The pythia8 UE tunes, however, do not describe well the data at \(\sqrt{s}=300\,\text {GeV} \), which may be related to the modelling of the proton–proton overlap function. The pythia6 Tune Z\(2^*\)lep and the new CMS pythia6 UE tunes use a double-Gaussian matter distribution, while all the pythia8 UE tunes use a single exponential matter overlap. The herwig++ tune, on the other hand, uses a matter-overlap function that is related to the Fourier transform of the electromagnetic form factor with \(\mu ^2\) [7] playing the role of an effective inverse proton radius (i.e. the InvRadius parameter in Table 5). However, a pythia8 tune using a double-Gaussian matter distribution did not improve the quality of the fit, and a fit obtained without interleaved FSR in the simulation of the UE (as implemented in pythia6) did not show any improvement either. Further investigations are needed to resolve this issue.

3 The CMS DPS tunes

Traditionally, \(\sigma _\mathrm{eff}\) is determined by fitting the DPS-sensitive observables with two templates [32–36] that are often based on distributions obtained from QCD MC models. One template is constructed with no DPS, i.e. just single parton scattering (SPS), while the other represents DPS production. This determines \(\sigma _\mathrm{eff}\) from the relative amounts of SPS and DPS contributions needed to fit the data. Here we use an alternative method that does not require construction of templates from MC samples. Instead, we fit the DPS-sensitive observables directly and then calculate the resulting \(\sigma _\mathrm{eff}\) from the model. For example, in pythia8, the value of \(\sigma _\mathrm{eff}\) is calculated by multiplying the ND cross section by an enhancement or a depletion factor, which expresses the dependence of DPS events on the collision impact parameter. As expected, more central collisions have a higher probability of a second hard scattering than peripheral collisions. The enhancement/depletion factors depend on the UE parameters, namely, on the parameters that characterize the matter-overlap function of the two protons, which for bProfile \(=3\) is determined by the exponential parameter expPow, on the MPI regulator \(p_\mathrm{T0}\) in Eq. (2), and the range of the CR. pythia8 Tune \(4\)C gives \(\sigma _\mathrm{eff}\) \(\approx \) 30.3 mb at \(\sqrt{s}=7\,\text {TeV} \).

In Sect. 2, we determined the MPI parameters by fitting UE data. Here we determine the MPI parameters by fitting to observables which involve correlations among produced objects in hadron–hadron collisions that are sensitive to DPS. Two such observables used in the fit, \(\varDelta S\) and \(\varDelta ^\mathrm{rel}p_\mathrm{T}\), are defined as follows:

$$\begin{aligned}&\varDelta \mathrm{S}=\arccos \left( \frac{\vec {p}_{\text {T}}(\text {object}_1)\cdot \vec {p}_{\text {T}} (\text {object}_2)}{|\vec {p}_{\text {T}}(\text {object}_1)| \times |\vec {p}_{\text {T}}(\text {object}_2)|}\right) ,\end{aligned}$$
(5)
$$\begin{aligned}&\varDelta ^\mathrm{rel}p_\mathrm{T} = \frac{|\vec {p}_\mathrm{T}^{\,\mathrm {jet}_1}+\vec {p}_\mathrm{T}^{\,\mathrm {jet}_2}|}{|\vec {p}_\mathrm{T}^{\,\mathrm {jet}_1}|+|\vec {p}_\mathrm{T}^{\,\mathrm {jet}_2}|}, \end{aligned}$$
(6)

where, for \(\mathrm {W}\)+dijet production, object\(_1\) is the \(\mathrm {W}\) boson and object\(_2\) is the dijet system. For four-jet production, object\(_1\) is the hard-jet pair and object\(_2\) is the soft-jet pair. For \(\varDelta ^\mathrm{rel}p_\mathrm{T}\) in \(\mathrm {W}\)+dijet production, jet\(_1\) and jet\(_2\) are the two jets of the dijet system, while in four-jet production, jet\(_1\) and jet\(_2\) refer to the two softer jets.
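The two correlation observables of Eqs. (5) and (6) depend only on transverse-momentum vectors, so they can be evaluated with elementary vector arithmetic. The following is a minimal sketch; the function names and the `(px, py)` tuple representation are assumptions made for illustration.

```python
import math


def delta_s(pt_obj1, pt_obj2):
    """Eq. (5): angle between the transverse momenta of object 1 and object 2.

    Each argument is a (px, py) tuple with nonzero magnitude.
    """
    dot = pt_obj1[0] * pt_obj2[0] + pt_obj1[1] * pt_obj2[1]
    norm = math.hypot(*pt_obj1) * math.hypot(*pt_obj2)
    # Clamp to [-1, 1] to protect acos against rounding errors.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_angle)


def delta_rel_pt(pt_jet1, pt_jet2):
    """Eq. (6): |vector sum of the two jet pT| / (scalar sum of their |pT|)."""
    sum_x = pt_jet1[0] + pt_jet2[0]
    sum_y = pt_jet1[1] + pt_jet2[1]
    return math.hypot(sum_x, sum_y) / (
        math.hypot(*pt_jet1) + math.hypot(*pt_jet2)
    )
```

By construction, \(\varDelta \mathrm{S}\) lies between 0 and \(\pi \), and \(\varDelta ^\mathrm{rel}p_\mathrm{T}\) lies between 0 (two jets exactly balanced in \(p_{\mathrm {T}}\)) and 1 (two collinear jets).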

The pythia8 UE parameters are fitted to the DPS-sensitive observables measured by CMS in \(\mathrm {W}\)+dijet [36] and in four-jet production [37]. After extracting the MPI parameters, the value of \(\sigma _\mathrm{eff}\) in Eq. (3) can be calculated from the underlying MPI model. In pythia8, \(\sigma _\mathrm{eff}\) depends primarily on the matter-overlap function and, to a lesser extent, on the value of \(p_\mathrm{T0}\) in Eq. (2), and the range of the CR. We obtain two separate tunes for each channel: in the first one, we vary just the matter-overlap parameter expPow, to which the \(\sigma _\mathrm{eff}\) value is most sensitive, and in the second one, the whole set of parameters is varied. These two tunes allow us to check whether the value of \(\sigma _\mathrm{eff}\) is stable relative to the choice of parameters.

The \(\mathrm {W}\)+dijet and the four-jet channels are fitted separately. The fit to DPS-sensitive observables in the \(\mathrm {W}\)+dijet channel gives a new determination of \(\sigma _\mathrm{eff}\) which can be compared to the value measured through the template method in the same final state [36]. Fitting the same way to the observables in the four-jet final state provides an estimate of \(\sigma _\mathrm{eff}\) for this channel.

3.1 Double-parton scattering in W+dijet production

To study the dependence of the DPS-sensitive observables on MPI parameters, we construct two \(\mathrm {W}\)+dijet DPS tunes, starting from the parameters of pythia8 Tune \(4\)C. In a partial tune only the parameter of the exponential distribution expPow is varied, and in a full tune all four parameters in Table 6 are varied. In a comparison of models with \(\mathrm {W}\)+dijet events [36], it was shown that higher-order SPS contributions (not present in pythia) fill a similar region of phase-space as the DPS signal. When such higher-order SPS diagrams are neglected, the measured DPS contribution to the \(\mathrm {W}\)+dijet channel can be overestimated (i.e. \(\sigma _\mathrm{eff}\) underestimated). We therefore interface the LO matrix elements (ME) generated by MadGraph 5 (version 1.5.14) [38] with pythia8, and tune to the normalized distributions of the correlation observables in Eqs. (5) and (6). For this study, we produce MadGraph parton-level events with a \(\mathrm {W}\) boson and up to four partons in the final state. The cross section is calculated using the CTEQ6L1 PDF with a matching scale for ME and parton shower (PS) jets set to 20 GeV. (In Sect. 4, we show that the CMS UE tunes can be interfaced to higher-order ME generators without additional tuning of MPI parameters). Figure 6 shows the CMS data [36] for the observables \(\varDelta \)S and \(\varDelta ^\mathrm{rel}p_\mathrm{T}\) measured in \(\mathrm {W}\)+dijet production, compared to predictions from MadGraph interfaced to pythia8 Tune \(4\)C, to Tune \(4\)C with no MPI, to the partial CDPSTP\(8\)S\(1\)-Wj, as well as to the full CDPSTP\(8\)S\(2\)-Wj (CDPST stands for “CMS DPS tune”). Table 6 gives the best-fit parameters and the resulting \(\sigma _\mathrm{eff}\) values at \(\sqrt{s}=7\,\text {TeV} \). The uncertainties quoted for \(\sigma _\mathrm{eff}\) are computed from the uncertainties of the fitted parameters given by the eigentunes. 
For Tune \(4\)C, the uncertainty in \(\sigma _\mathrm{eff}\) is not provided since no eigentunes are available for that tune. The resulting values of \(\sigma _\mathrm{eff}\) are compatible with the value measured by CMS using the template method of \(\sigma _\mathrm{eff} = 20.6\pm 0.8 \,\text {(stat)} \pm 6.6 \,\text {(syst)} \) \(\text {\,mb}\)  [36].
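The compatibility statement above can be made quantitative with a simple pull. The sketch below is an illustration only: it assumes that the statistical and systematic uncertainties of the measurement can be combined in quadrature and that the tune and measurement uncertainties are uncorrelated. The tune value used here is hypothetical and is not taken from Table 6.

```python
import math

def total_uncertainty(stat, syst):
    """Combine statistical and systematic uncertainties in quadrature (assumption)."""
    return math.hypot(stat, syst)

def pull(prediction, prediction_unc, measurement, measurement_unc):
    """Prediction minus measurement, in units of the combined uncertainty."""
    return (prediction - measurement) / math.hypot(prediction_unc, measurement_unc)

# Template-method measurement quoted in the text, in mb.
MEAS, MEAS_STAT, MEAS_SYST = 20.6, 0.8, 6.6
meas_unc = total_uncertainty(MEAS_STAT, MEAS_SYST)

# Hypothetical tune prediction in mb (placeholder, NOT a Table 6 value).
tune_sigma_eff, tune_unc = 25.0, 2.0
p = pull(tune_sigma_eff, tune_unc, MEAS, meas_unc)
print(f"total measurement uncertainty: {meas_unc:.2f} mb, pull: {p:.2f}")
```

A pull with magnitude below about one indicates agreement within the combined uncertainty, which here is dominated by the systematic uncertainty of the measurement.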

Table 6 The pythia8 parameters, tuning ranges, Tune \(4\)C values [28] and best-fit values of CDPSTP\(8\)S\(1\)-Wj and CDPSTP\(8\)S\(2\)-Wj, obtained from fits to DPS observables in \(\mathrm {W}\)+dijet production with the MadGraph event generator interfaced to pythia8. Also shown are the predicted values of \(\sigma _\mathrm{eff}\) at \(\sqrt{s}=7\,\text {TeV} \), and the uncertainties obtained from the eigentunes
Fig. 6

CMS data at \(\sqrt{s}=7\,\text {TeV} \) [36] for the normalized distributions of the correlation observables \(\varDelta \)S (left), and \(\varDelta ^\mathrm{rel}p_\mathrm{T}\) (right) in the \(\mathrm {W}\)+dijet channel, compared to MadGraph (MG) interfaced to: pythia8 Tune \(4\)C, Tune \(4\)C with no MPI, and the CMS pythia8 DPS partial CDPSTP\(8\)S\(1\)-Wj (top); and CDPSTP\(8\)S\(1\)-Wj, and CDPSTP\(8\)S\(2\)-Wj (bottom). The bottom panels of each plot show the ratios of these tunes to the data, and the green bands around unity represent the total experimental uncertainty

3.2 Double-parton scattering in four-jet production

Starting from the parameters of pythia8 Tune \(4\)C, we construct two different four-jet DPS tunes. As in the \(\mathrm {W}\)+dijet channel, in the partial tune just the exponential-dependence parameter, expPow, is varied, while in the full tune all four parameters of Table 7 are varied. We obtain a good fit to the four-jet data without including higher-order ME contributions. However, we also obtain a good fit when higher-order (real) ME terms are generated with MadGraph. In Figs. 7 and 8 the correlation observables \(\varDelta \)S and \(\varDelta ^\mathrm{rel}p_\mathrm{T}\) in four-jet production [37] are compared to predictions obtained with pythia8 Tune \(4\)C, Tune \(4\)C without MPI, CDPSTP\(8\)S\(1\)-\(4\)j, CDPSTP\(8\)S\(2\)-\(4\)j, and MadGraph interfaced to CDPSTP\(8\)S\(2\)-\(4\)j. Table 7 gives the best-fit parameters and the resulting \(\sigma _\mathrm{eff}\) values. The values of \(\sigma _\mathrm{eff}\) extracted from the CMS pythia8 DPS tunes give the first determination of \(\sigma _\mathrm{eff}\) in four-jet production at \(\sqrt{s}=7\,\text {TeV} \). The uncertainties quoted for \(\sigma _\mathrm{eff}\) are obtained from the eigentunes.

Table 7 The pythia8 parameters, tuning ranges, Tune \(4\)C values [28] and best-fit values of CDPSTP\(8\)S\(1\)-\(4\)j and CDPSTP\(8\)S\(2\)-\(4\)j, obtained from fits to DPS observables in four-jet production. Also shown are the predicted values of \(\sigma _\mathrm{eff}\) at \(\sqrt{s}=7\,\text {TeV} \), and the uncertainties obtained from the eigentunes
Fig. 7

Distributions of the correlation observables \(\varDelta \)S (left) and \(\varDelta ^\mathrm{rel}p_\mathrm{T}\) (right) measured in four-jet production at \(\sqrt{s}=7\,\text {TeV} \) [37] compared to pythia8 Tune 4C, Tune 4C with no MPI, and CDPSTP8S1-4j. The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

Fig. 8

Distributions of the correlation observables \(\varDelta \)S (top) and \(\varDelta ^\mathrm{rel}p_\mathrm{T}\) (bottom) measured in four-jet production at \(\sqrt{s}=7\,\text {TeV} \) [37], compared to predictions of pythia8 using CDPSTP\(8\)S\(2\)-\(4\)j and of MadGraph (MG) interfaced to pythia8 using CDPSTP\(8\)S\(2\)-\(4\)j (left), and of pythia8 using CUETP\(8\)M\(1\) and herwig++ using CUETHppS\(1\) (right). Also shown are the ratios of the predictions to the data. Predictions for CUETP\(8\)M\(1\) (right) are shown with an error band corresponding to the total uncertainty obtained from the eigentunes (Appendix A). The green bands around unity represent the total experimental uncertainty

4 Validation of CMS tunes

Here we discuss the compatibility of the UE and DPS tunes. In addition, we compare the CMS UE tunes with UE data that have not been used in the fits, and we examine how well Drell–Yan and MB observables can be predicted from MC simulations using the UE tunes. We also show that the CMS UE tunes can be interfaced to higher-order ME generators without additional tuning of the MPI parameters.

4.1 Compatibility of UE and DPS tunes

The values of \(\sigma _\mathrm{eff}\) obtained from simulations applying the CMS pythia8 UE and DPS tunes at \(\sqrt{s}=7\,\text {TeV} \) and \(\sqrt{s}=13\,\text {TeV} \) are listed in Table 8, together with the uncertainties obtained from the eigentunes. At \(\sqrt{s}=7\,\text {TeV} \), the CMS DPS tunes give values of \(\sigma _\mathrm{eff}\) \(\approx \) 20\(\text {\,mb}\), while the CMS pythia8 UE tunes give slightly higher values in the range 26–29 mb. The corresponding predictions are shown in Figs. 8 and  9. Figure 8 shows the CMS DPS-sensitive data for four-jet production at \(\sqrt{s}=7\,\text {TeV} \) compared to predictions using CDPSTP\(8\)S\(2\)-\(4\)j, CUETP\(8\)M\(1\), and CUETHppS\(1\). Figure 9 shows ATLAS UE data at \(\sqrt{s}=7\,\text {TeV} \) [39] compared to predictions obtained with various tunes: CDPSTP\(8\)S\(2\)-\(4\)j with uncertainty bands, CUETP\(6\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, CUETP\(8\)M\(1\), and CUETHppS\(1\). Predictions from pythia8 using CUETP\(8\)M\(1\) describe the DPS observables reasonably well, but do not fit them as well as predictions using the DPS tunes. On the other hand, predictions using CDPSTP\(8\)S\(2\)-\(4\)j do not fit the UE data as well as the UE tunes do.

Table 8 Values of \(\sigma _\mathrm{eff}\) at \(\sqrt{s}=7\,\text {TeV} \) and \(13\,\text {TeV} \) for CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\), CUETHppS\(1\), and for CDPSTP\(8\)S\(1\)-\(4\)j and CDPSTP\(8\)S\(2\)-\(4\)j. At \(\sqrt{s}=7\,\text {TeV} \), also shown are the uncertainties in \(\sigma _\mathrm{eff}\) obtained from the eigentunes
Fig. 9

ATLAS data at \(\sqrt{s}=7\,\text {TeV} \) [39] for charged-particle (left) and \(p_\mathrm{T}^\mathrm{sum}\) densities (right) with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!2.0\) in the transverse (TransAVE) region compared to predictions of pythia8 using CDPSTP\(8\)S\(2\)-\(4\)j (left) and CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\), plus herwig++ using CUETHppS\(1\) (right). The predictions of CDPSTP\(8\)S\(2\)-\(4\)j are shown with an error band corresponding to the total uncertainty obtained from the eigentunes (Appendix A). The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

As discussed previously, the pythia8 tunes use a single exponential matter-overlap function, while the herwig++ tune uses a matter-overlap function that is related to the Fourier transform of the electromagnetic form factor. The CUETHppS\(1\) gives a value of \(\sigma _\mathrm{eff}\) \(\approx \) 15\(\text {\,mb}\), while the pythia8 UE and DPS tunes give higher values of \(\sigma _\mathrm{eff}\). It should be noted that \(\sigma _\mathrm{eff}\) is a parton-level observable, and its importance lies not in the modelled value of \(\sigma _\mathrm{eff}\) itself, but in what is learned about the transverse proton profile (and its energy evolution) and in how well the models describe the DPS-sensitive observables. As can be seen in Fig. 8, predictions using CUETP\(8\)M\(1\) describe the DPS-sensitive observables better than CUETHppS\(1\), but not quite as well as the DPS tunes. We performed a simultaneous pythia8 tune that included both the UE data and DPS-sensitive observables; however, the quality of the resulting fit was poor. This confirms the difficulty of describing soft and hard MPI within the current pythia and herwig++ frameworks. Recent studies [40, 41] suggest the need for introducing parton correlation effects in the MPI framework in order to achieve a consistent description of both the UE and DPS observables.

4.2 Comparisons with other UE measurements

Figure 10 shows charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities [24, 42] at \(\sqrt{s} = 0.9\), 2.76, and \(7\,\text {TeV} \) with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!2.0\) in the TransAVE region, as defined by the leading jet reconstructed using only charged particles (also called the “leading track-jet”), compared to predictions using the CMS UE tunes. The CMS UE tunes describe quite well the UE measured with respect to both the leading charged particle and the leading charged-particle jet.

Tunes obtained from fits to UE data and combined with higher-order ME calculations [43] can also be cross-checked against the data. The CMS UE tunes can be interfaced to higher-order ME generators without spoiling their good description of the UE. In Fig. 11, the charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities in the TransMIN and TransMAX regions, as a function of \(p_\mathrm{T}^\mathrm{max}\), are compared to predictions obtained with MadGraph and powheg  [44, 45] interfaced to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)M\(1\). In MadGraph, up to four partons are simulated in the final state, the cross section is calculated with the CTEQ6L1 PDF, and the ME/PS matching scale is taken to be 10\(\,\text {GeV}\). The powheg predictions are based on next-to-leading-order (NLO) dijet calculations using the CT10nlo PDF [46] interfaced to pythia8 with CUETP\(8\)M\(1\), and the HERAPDF1.5NLO PDF [21] interfaced to pythia8 with CUETP\(8\)S\(1\)-HERAPDF1.5LO.

Fig. 10

CMS data on charged-particle (left) and \(p_\mathrm{T}^\mathrm{sum}\) (right) densities at \(\sqrt{s}\) = 0.9 [24] (top), 2.76 [42] (middle), and \(7\,\text {TeV} \) [24] (bottom) with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!2.0\) in the transverse (TransAVE) region as defined by the leading charged-particle jet, as a function of the transverse momentum of the leading charged-particle jet. The data are compared to predictions of pythia6 using CUETP\(6\)S\(1\)-CTEQ\(6\)L1, pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\), and herwig++ using CUETHppS\(1\). The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

The poor agreement below \(p_\mathrm{T}^\mathrm{max}\) \(=5\,\text {GeV} \) in Fig. 11 is not relevant, as the minimum \(\hat{p}_\mathrm{T}\) for MadGraph and powheg is \(5\,\text {GeV} \). The agreement with the UE data in the plateau region of \(p_\mathrm{T}^\mathrm{max}\) \(> 5\,\text {GeV} \) is good. These comparisons show that interfacing the CMS UE tunes to higher-order ME generators does not spoil their good description of the UE data.
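The restriction to the region above the generator-level cutoff can be sketched as follows. This is a minimal illustration, assuming binned predictions and data given as plain lists; the function name, the bin centres, and the numerical values are hypothetical and do not come from the measurement.

```python
def ratio_to_data(pt_max_centres, prediction, data, pt_min=5.0):
    """Return (pT^max, prediction/data) pairs for bins at or above pt_min.

    Bins below pt_min (the minimum pT-hat used in the ME generation) and
    bins with zero data content are skipped.
    """
    return [
        (pt, mc / d)
        for pt, mc, d in zip(pt_max_centres, prediction, data)
        if pt >= pt_min and d != 0
    ]

# Hypothetical bin centres (GeV), predictions, and data values.
centres = [2.5, 5.5, 7.5]
mc = [0.5, 1.1, 0.9]
obs = [1.0, 1.0, 1.0]
print(ratio_to_data(centres, mc, obs))
```

Only the bins above the cutoff enter the comparison, so the large deviation in the first bin does not affect the assessment of the plateau region.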

Fig. 11

CMS data at \(\sqrt{s}=7\,\text {TeV} \)  [17] for particle (top) and \(p_\mathrm{T}^\mathrm{sum}\) densities (bottom) for charged particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (left) and TransMAX (right) regions, as defined by the leading charged particle, as a function of the transverse momentum of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\). The data are compared to MadGraph (MG), interfaced to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)M\(1\), and to powheg (PH), interfaced to pythia8 using CUETP\(8\)S\(1\)-HERAPDF1.5LO and CUETP\(8\)M\(1\). The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

4.3 Predicting MB observables

The UE is studied in events containing a hard scatter, whereas most of the MB collisions are softer and can include diffractive scatterings. It is nevertheless interesting to see how well predictions based on the CMS UE tunes can describe the properties of MB distributions. Figure 12 compares predictions using the CMS UE tunes to the ALICE [47] and TOTEM [48] data at \(\sqrt{s}=7\,\text {TeV} \) for the charged-particle pseudorapidity distribution, \(\mathrm{d}{\mathrm {N}}_{\text {ch}}/\mathrm{d}\eta \), and to the CMS data for \(\mathrm{d}E/\mathrm{d}\eta \) [49] at the same energy. These observables are sensitive to single-diffraction dissociation, central-diffraction, and double-diffraction dissociation, which are modelled in pythia. Since herwig++ does not include a model for these diffractive processes, we do not show it here. Figure 13 shows predictions using the CMS UE tunes for the combined CMS\(+\)TOTEM data at \(\sqrt{s}=8\,\text {TeV} \) [50] for the charged-particle pseudorapidity distribution, \(\mathrm{d}{\mathrm {N}}_{\text {ch}}/\mathrm{d}\eta \), for inelastic, non single-diffraction-enhanced, and single-diffraction-enhanced proton–proton collisions.
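For reference, the charged-particle pseudorapidity density can be estimated from a generated event sample as the number of charged particles per event and per unit of \(\eta \). The sketch below is a generic illustration; the function name and the input layout (one list of charged-particle \(\eta \) values per event) are assumptions, and no event selection or \(p_\mathrm{T}\) threshold is applied.

```python
def dnch_deta(events, eta_edges):
    """Estimate dN_ch/d(eta) per event.

    events    : list of events, each a list of charged-particle eta values
    eta_edges : increasing bin edges in eta
    """
    n_bins = len(eta_edges) - 1
    counts = [0] * n_bins
    for etas in events:
        for eta in etas:
            for i in range(n_bins):
                if eta_edges[i] <= eta < eta_edges[i + 1]:
                    counts[i] += 1
                    break
    n_events = len(events)
    # Normalize by the number of events and by the bin width.
    return [
        counts[i] / (n_events * (eta_edges[i + 1] - eta_edges[i]))
        for i in range(n_bins)
    ]

# Two toy events with a total of four charged particles.
toy_events = [[0.1, -0.3, 1.5], [0.2]]
print(dnch_deta(toy_events, [-1.0, 0.0, 1.0, 2.0]))
```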

Fig. 12

ALICE data at \(\sqrt{s}=7\,\text {TeV} \) [47] for the charged-particle pseudorapidity distribution, \(\mathrm{d}{\mathrm {N}}_{\text {ch}}/ \mathrm{d}\eta \), in inclusive inelastic \(\mathrm {p}\mathrm {p}\) collisions (top left). TOTEM data at \(\sqrt{s}=7\,\text {TeV} \) [48] for the charged-particle pseudorapidity distribution, \(\mathrm{d}{\mathrm {N}}_{\text {ch}}/ \mathrm{d}\eta \), in inclusive inelastic \(\mathrm {p}\mathrm {p}\) collisions (\(p_\mathrm{T}>40\,\text {MeV} \), \({\mathrm {N}}_\mathrm{chg}\ge 1\)) (top right). CMS data at \(\sqrt{s}=7\,\text {TeV} \) [49] for the energy flow \(\mathrm{d}E/ \mathrm{d}\eta \), in MB \(\mathrm {p}\mathrm {p}\) collisions (bottom). The data are compared to pythia6 using CUETP\(6\)S\(1\)-CTEQ\(6\)L1, and to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\). The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

Fig. 13

Combined CMS and TOTEM data at \(\sqrt{s}=8\,\text {TeV} \) [50] for the charged-particle distribution \(\mathrm{d}{\mathrm {N}}_{\text {ch}}/ \mathrm{d}\eta \), in inclusive inelastic (top left), NSD-enhanced (top right), and SD-enhanced (bottom) pp collisions. The data are compared to pythia6 using CUETP\(6\)S\(1\)-CTEQ\(6\)L1, and to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\). The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

The pythia8 event generator using the UE tunes describes the MB data better than pythia6 with the UE tune, which is likely due to the improved modelling of single-diffraction dissociation, central-diffraction, and double-diffraction dissociation in pythia8. Predictions with all the UE tunes describe the MB observables fairly well in the central region (\(|\eta |<2\)); however, only predictions obtained with CUETP\(8\)M\(1\) describe the data in the forward region (\(|\eta |>4\)). This is due to the PDF used in CUETP\(8\)M\(1\). As can be seen in Fig. 14, the NNPDF2.3LO PDF at a scale of \(Q^2\) = 10 GeV\(^2\) (corresponding to hard scatterings with \(\hat{p}_\mathrm{T}\) \(\sim \) 3 GeV) and at small x features a larger gluon density than CTEQ6L1 and HERAPDF\(1.5\)LO, thereby contributing to more particles (and more energy) produced in the forward region. We have checked that increasing the gluon distribution in HERAPDF\(1.5\)LO at x values below 10\(^{-5}\) improves the description of the charged-particle multiplicity measurements in the forward region.

Fig. 14

Comparison of gluon distributions in the proton for the CTEQ6L1, HERAPDF1.5LO, and NNPDF2.3LO PDF sets, at \(Q^2\) = 10 GeV\(^2\) (left) and 100 GeV\(^2\) (right)

4.4 Comparisons with inclusive jet production

In Fig. 15 predictions using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, CUETP\(8\)M\(1\), and CUETHppS1 are compared to the inclusive jet cross section at \(\sqrt{s}=7\,\text {TeV} \) [51] in several rapidity ranges. Predictions using CUETP\(8\)M\(1\) describe the data best; however, all the tunes overshoot the jet spectra at small \(p_{\mathrm {T}}\). Predictions using CUETHppS1 underestimate the high-\(p_\mathrm{T}\) region at central rapidity (|y| \(<\) 2.0). In Fig. 16, the inclusive jet cross sections are compared to predictions from powheg interfaced to pythia8 using CUETP\(8\)S\(1\)-HERAPDF1.5LO and CUETP\(8\)M\(1\). A very good description of the measurement is obtained.

Fig. 15

CMS data at \(\sqrt{s}=7\,\text {TeV} \) [51] for the inclusive jet cross section as a function of \(p_{\mathrm {T}}\) in different rapidity ranges compared to predictions of pythia8 using CUETP8S1-CTEQ6L1, CUETP8S1-HERAPDF1.5LO, and CUETP8M1, and of herwig++ using CUETHppS1. The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

Fig. 16

CMS data at \(\sqrt{s}=7\,\text {TeV} \) [51] for the inclusive jet cross section as a function of \(p_{\mathrm {T}}\) in different rapidity ranges compared to predictions of powheg interfaced to pythia8 using CUETP\(8\)S\(1\)-HERAPDF1.5LO and CUETP\(8\)M\(1\). The bottom panels of each plot show the ratios of these predictions to the data, and the green bands around unity represent the total experimental uncertainty

4.5 Comparisons with Z boson production

In Fig. 17 the \(p_{\mathrm {T}}\) and rapidity distributions of the \(\mathrm{Z} \) boson in pp collisions at \(\sqrt{s}=7\,\text {TeV} \) [52] are shown and compared to pythia8 using CUETP\(8\)M\(1\), and to powheg interfaced to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)M\(1\). The prediction using pythia8 with CUETP\(8\)M\(1\) (without powheg) agrees reasonably well with the distribution of the \(\mathrm{Z} \) boson at small \(p_{\mathrm {T}}\) values. When pythia8 is interfaced to powheg, which implements an inclusive \(\mathrm{Z} \) boson NLO calculation, the agreement is good over the whole spectrum.

Fig. 17

Transverse momentum \(p_{\mathrm {T}}\) (left) and rapidity distributions (right) of \(\mathrm{Z} \) boson production in pp collisions at \(\sqrt{s}=7\,\text {TeV} \) [52]. The data are compared to pythia8 using CUETP\(8\)M\(1\), and to powheg interfaced to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)M\(1\). The green bands in the ratios represent the total experimental uncertainty

In Fig. 18 the charged-particle and \(p_\mathrm{T}^\mathrm{sum}\) densities [26] in the toward, away, and transverse (TransAVE) regions as defined by the \(\mathrm{Z} \) boson in proton–proton collisions at \(\sqrt{s}=7\,\text {TeV} \) are compared to predictions of pythia8 using CUETP\(8\)M\(1\). Also shown are MadGraph and powheg results interfaced to pythia8 using CUETP\(8\)S\(1\)-HERAPDF1.5LO and CUETP\(8\)M\(1\). The MadGraph generator simulates Drell–Yan events with up to four partons, using the CTEQ6L1 PDF. The matching of ME partons and PS is performed at a scale of 20 GeV. The powheg events are obtained using NLO inclusive Drell–Yan production, including up to one additional parton. The powheg events are interfaced to pythia8 using CUETP\(8\)M\(1\) and CUETP\(8\)S\(1\)-HERAPDF1.5LO. The predictions based on CUETP\(8\)M\(1\) do not fit the \(\mathrm{Z} \) boson data unless they are interfaced to a higher-order ME generator. In pythia8 only the Born term (\( \mathrm{q} \overline{\mathrm{q}} \rightarrow \mathrm{Z} \)), corrected for single-parton emission, is generated. This ME configuration agrees well with the observables in the away region in data, when the \(\mathrm{Z} \) boson recoils against one or more jets. In the transverse and toward regions, larger discrepancies between data and pythia8 predictions appear at high \(p_{\mathrm {T}}\), where the occurrence of multijet emission has a large impact. To describe \(\mathrm{Z} \) boson production at \(\sqrt{s}=7\,\text {TeV} \) in all regions, higher-order contributions (starting with \(\mathrm{Z} \)+2-jets), as used in interfacing pythia to powheg or MadGraph, must be included.

Fig. 18

Charged-particle (left) and \(p_\mathrm{T}^\mathrm{sum}\) densities (right) in the toward (top), away (middle), and transverse (TransAVE) (bottom) regions, as defined by the Z-boson direction in Drell–Yan production at \(\sqrt{s}=7\,\text {TeV} \) [26]. The data are compared to pythia8 using CUETP\(8\)M\(1\), to MadGraph (MG) interfaced to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)M\(1\), and to powheg (PH) interfaced to pythia8 using CUETP\(8\)S\(1\)-HERAPDF1.5LO and CUETP\(8\)M\(1\). The green bands in the ratios represent the total experimental uncertainty

5 Extrapolation to 13 TeV

In this section, predictions at \(\sqrt{s}=13\,\text {TeV} \), based on the new tunes, for observables sensitive to the UE are presented. Figure 19 shows the predictions at 13 TeV for the charged-particle and the \(p_\mathrm{T}^\mathrm{sum}\) densities in the TransMIN, TransMAX, and TransDIF regions, as defined by the leading charged particle as a function of \(p_\mathrm{T}^\mathrm{max}\) based on the five new CMS UE tunes: CUETP\(6\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, CUETP\(8\)M\(1\), and CUETHppS\(1\). In Fig. 19 the ratio of the predictions using the four CMS tunes to the one using CUETP\(8\)M\(1\) is shown. The predictions at \(13\,\text {TeV} \) of all these tunes are remarkably similar. It does not seem to matter that the new CMS pythia8 UE tunes do not fit very well to the \(\sqrt{s}=300\,\text {GeV} \) UE data. The new pythia8 tunes give results at \(13\,\text {TeV} \) similar to the new CMS pythia6 tune and the new CMS herwig++ tune. The uncertainties on the predictions based on the eigentunes do not exceed 10 % relative to the central value.
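The statement about the size of the eigentune uncertainties can be illustrated with a simple envelope calculation. The sketch below assumes that the uncertainty band is the bin-by-bin envelope of the eigentune predictions around the central tune; the procedure actually used is described in Appendix A and may differ. All numerical values are hypothetical.

```python
def eigentune_envelope(central, variations):
    """Bin-by-bin (low, high) envelope of the eigentune predictions.

    central    : list of central-tune predictions, one per bin
    variations : list of eigentune predictions, each a list with one value per bin
    """
    lows, highs = [], []
    for i, c in enumerate(central):
        values = [v[i] for v in variations] + [c]
        lows.append(min(values))
        highs.append(max(values))
    return lows, highs

def max_relative_uncertainty(central, variations):
    """Largest envelope half-width relative to the central value, over all bins."""
    lows, highs = eigentune_envelope(central, variations)
    return max(
        max(c - lo, hi - c) / c for c, lo, hi in zip(central, lows, highs)
    )

# Hypothetical central prediction and two eigentune variations in two bins.
central = [1.0, 1.2]
variations = [[0.95, 1.25], [1.05, 1.1]]
print(max_relative_uncertainty(central, variations))
```

For these toy inputs the largest relative deviation is below 10 %, which is the kind of check implied by the statement in the text.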

Fig. 19

Predictions at \(\sqrt{s}=13\,\text {TeV} \) for the particle (left) and the \(p_\mathrm{T}^\mathrm{sum}\) densities (right) for charged particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (top), TransMAX (middle), and TransDIF (bottom) regions, as defined by the leading charged particle, as a function of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\) for the five CMS UE tunes: pythia6 CUETP\(6\)S\(1\)-CTEQ\(6\)L1, pythia8 CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\), and herwig++ CUETHppS\(1\). Also shown is the ratio of the tunes to predictions of CUETP\(8\)S\(1\)-CTEQ\(6\)L1. Predictions for CUETP\(8\)M\(1\) are shown along with the envelope (green bands) of the corresponding eigentunes

In Figs. 20 and 21 the predictions at \(\sqrt{s}=13\,\text {TeV} \) obtained using the new tunes from \(7\,\text {TeV} \) are shown for the charged-particle and the \(p_\mathrm{T}^\mathrm{sum}\) densities in the TransMIN, TransMAX, and TransDIF regions, as a function of \(p_\mathrm{T}^\mathrm{max}\). Also shown is the ratio of \(13\,\text {TeV} \) to \(7\,\text {TeV} \) results for the five tunes. The densities in the TransMIN region increase much more rapidly with energy than those in the TransDIF region. For example, when using CUETP\(8\)M\(1\), the charged-particle and the \(p_\mathrm{T}^\mathrm{sum}\) densities in the TransMIN region for \(5.0<\) \(p_\mathrm{T}^\mathrm{max}\) \(<6.0\,\text {GeV} \) are predicted to increase by 28 and \(37~\%\), respectively, while those in the TransDIF region are predicted to increase by about a factor of two less, i.e. by 13 and \(18~\%\), respectively.
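The arithmetic behind the "factor of two" can be checked directly from the percentages quoted above. The short sketch below uses only those four numbers; the dictionary layout and key names are illustrative choices.

```python
# Predicted increases from 7 to 13 TeV for CUETP8M1, in percent,
# for 5.0 < pT^max < 6.0 GeV (values quoted in the text).
increase_percent = {
    "TransMIN": {"nch": 28.0, "ptsum": 37.0},
    "TransDIF": {"nch": 13.0, "ptsum": 18.0},
}

# Ratio of the TransMIN increase to the TransDIF increase, per observable.
ratios = {
    obs: increase_percent["TransMIN"][obs] / increase_percent["TransDIF"][obs]
    for obs in ("nch", "ptsum")
}

# Corresponding 13 TeV / 7 TeV scale factors for the densities.
scale_factors = {
    region: {obs: 1.0 + pct / 100.0 for obs, pct in values.items()}
    for region, values in increase_percent.items()
}

print(ratios)
print(scale_factors)
```

The two ratios are approximately 2.15 for the charged-particle density and 2.06 for the \(p_\mathrm{T}^\mathrm{sum}\) density, consistent with the statement that the TransDIF region grows about a factor of two more slowly.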

Fig. 20

Charged-particle density at \(\sqrt{s}=7\,\text {TeV} \) for particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (top), TransMAX (middle), and TransDIF (bottom) regions, as defined by the leading charged particle, as a function of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\). The data are compared to pythia6 using CUETP\(6\)S\(1\)-CTEQ\(6\)L1, to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\), and to herwig++ using CUETHppS\(1\). Also shown are the predictions (left) based on the CMS UE tunes at \(13\,\text {TeV} \) (dashed lines), and the ratio of the \(13\,\text {TeV} \) to \(7\,\text {TeV} \) results for the five tunes (right)

Fig. 21

Charged \(p_\mathrm{T}^\mathrm{sum}\) density at \(\sqrt{s}=7\,\text {TeV} \) for particles with \(p_\mathrm{T}\!>\!0.5\,\text {GeV} \) and \(|\eta |\!<\!0.8\) in the TransMIN (top), TransMAX (middle), and TransDIF (bottom) regions, as defined by the leading charged particle, as a function of the leading charged-particle \(p_\mathrm{T}^\mathrm{max}\). The data are compared to pythia6 using CUETP\(6\)S\(1\)-CTEQ\(6\)L1, to pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)S\(1\)-HERAPDF1.5LO, and CUETP\(8\)M\(1\), and to herwig++ using CUETHppS\(1\). Also shown are the predictions (left) based on the CMS UE tunes at \(13\,\text {TeV} \) (dashed lines), and the ratio of the \(13\,\text {TeV} \) to \(7\,\text {TeV} \) results for the five tunes (right)

In Fig. 22, predictions obtained with pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)M\(1\), and Tune \(4\)C are compared to the recent CMS data measured at \(\sqrt{s} = 13\,\text {TeV} \) [53] on charged-particle multiplicity as a function of pseudorapidity. Predictions from CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)M\(1\) are shown with the error bands corresponding to the uncertainties obtained from the eigentunes. These two new CMS tunes, although obtained from fits to UE data at 7\(\,\text {TeV}\), agree well with the MB measurements over the whole pseudorapidity range, while predictions from pythia8 Tune \(4\)C overestimate the data by about 10 %. This confirms that the collision-energy dependence of the CMS UE tune parameters can be trusted for predictions of MB observables.

Fig. 22

CMS data at \(\sqrt{s}=13\,\text {TeV} \) [53] for the charged-particle pseudorapidity distribution, \(\mathrm{d}{\mathrm {N}}_{\text {ch}}/\mathrm{d}\eta \), in inelastic proton–proton collisions. The data are compared to predictions of pythia8 using CUETP\(8\)S\(1\)-CTEQ\(6\)L1, CUETP\(8\)M\(1\), and Tune \(4\)C. The predictions based on CUETP\(8\)S\(1\)-CTEQ\(6\)L1 and CUETP\(8\)M\(1\) are shown with an error band corresponding to the total uncertainty obtained from the eigentunes. Also shown are the ratios of these predictions to the data. The green band represents the total experimental uncertainty on the data

6 Summary and conclusions

New tunes of the pythia event generator were constructed for different parton distribution functions using various sets of underlying-event (UE) data. By simultaneously fitting UE data at several center-of-mass energies, models for UE have been tested and their parameters constrained. The improvement in the description of UE data provided by the new CMS tunes at different collision energies gives confidence that they can provide reliable predictions at \(\sqrt{s} = 13\,\text {TeV} \), where all the new UE tunes predict similar results for the UE observables.

The observables sensitive to double-parton scattering (DPS) were fitted directly by tuning the MPI parameters. Two \(\mathrm {W}\)+dijet DPS tunes and two four-jet DPS tunes were constructed to study the dependence of the DPS-sensitive observables on the MPI parameters. The CMS UE tunes perform fairly well in the description of DPS observables, but they do not fit the DPS data as well as the DPS tunes do. On the other hand, the CMS DPS tunes do not fit the UE data as well as the UE tunes. At present, it is not possible to accurately describe both soft and hard MPI within the current pythia and herwig++ frameworks. Fitting DPS-sensitive observables has also provided the DPS effective cross section \(\sigma _\mathrm{eff}\) associated with each model. This method can be applied to determine the \(\sigma _\mathrm{eff}\) values associated with different MPI models implemented in the current MC event generators for the production of any final state with two hard particles.

Predictions of pythia8 using the CMS UE tunes agree fairly well with the MB observables in the central region (\(|\eta |<2\)). The tunes can be interfaced to higher-order and multileg matrix-element generators, such as powheg and MadGraph, while maintaining their good description of the UE, so it is not necessary to produce separate tunes for these generators. In addition, we have verified that the measured particle pseudorapidity density at 13\(\,\text {TeV}\) is well reproduced by the new CMS UE tunes. Furthermore, all of the new CMS tunes come with their eigentunes, which can be used to determine the uncertainties associated with the theoretical predictions. These new CMS tunes will play an important role in predicting and analyzing LHC data at 13 and \(14\,\text {TeV} \).