1 Introduction

The pairwise production of hadronic jets is one of the fundamental processes studied at hadron colliders. Dijet events with large transverse momenta can be described by parton-parton scattering in the context of quantum chromodynamics (QCD). Measurements of dijet cross sections can be used to thoroughly test predictions of perturbative QCD (pQCD) at high energies and to constrain parton distribution functions (PDFs). Previous measurements of dijet cross sections in proton-(anti)proton collisions have been performed as a function of dijet mass at the Sp\(\bar{\text {p}}\)S, ISR, and Tevatron colliders [1,2,3,4,5,6]. At the CERN LHC, dijet measurements as a function of dijet mass are reported in Refs. [7,8,9,10,11]. Also, dijet events have been studied triple-differentially in transverse energy and pseudorapidities \(\eta _1\) and \(\eta _2\) of the two leading jets [12, 13].

In this paper, a measurement of the triple-differential dijet cross section is presented as a function of the average transverse momentum \(p_{\mathrm {T,avg}} = (p_{\mathrm {T},1} + p_{\mathrm {T},2}) / 2\) of the two leading jets, half of their rapidity separation \(y^{*} = |y_1 - y_2| / 2\), and the boost of the dijet system \(y_{\mathrm {b}} = |y_1 + y_2| / 2\). The dijet event topologies are illustrated in Fig. 1.
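The three observables follow directly from the kinematics of the two leading jets. A minimal Python sketch is given below; the function name and the example jet values are illustrative assumptions and not part of the analysis software.

```python
def dijet_observables(pt1, y1, pt2, y2):
    """Return (pT_avg, y_star, y_boost) for the two leading jets."""
    pt_avg = 0.5 * (pt1 + pt2)       # average transverse momentum
    y_star = 0.5 * abs(y1 - y2)      # half of the rapidity separation
    y_boost = 0.5 * abs(y1 + y2)     # boost of the dijet system
    return pt_avg, y_star, y_boost


# Hypothetical same-side event: both jets forward, small separation, large boost
print(dijet_observables(210.0, 1.8, 190.0, 1.4))
# Hypothetical opposite-side event: back-to-back in rapidity, no boost
print(dijet_observables(210.0, 1.6, 190.0, -1.6))
```

Because both \(y^{*}\) and \(y_{\mathrm {b}}\) are defined with absolute values, exchanging the two jets or mirroring the event in rapidity leaves the observables unchanged.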

Fig. 1 Illustration of the dijet event topologies in the \(y^{*}\) and \(y_{\mathrm {b}}\) kinematic plane. The dijet system can be classified as a same-side or opposite-side jet event according to the boost \(y_{\mathrm {b}}\) of the two leading jets, thereby providing insight into the parton kinematics

The relation between the dijet rapidities and the parton momentum fractions \(x_{1,2}\) of the incoming protons at leading order (LO) is given by \(x_{1,2} = \frac{p_{\mathrm {T}}}{\sqrt{s}} ( e^{\pm y_1} + e^{\pm y_2})\), where \(p_{\mathrm {T}} = p_{\mathrm {T},1} = p_{\mathrm {T},2}\). For large values of \(y_{\mathrm {b}}\), the momentum fractions carried by the incoming partons must correspond to one large and one small value, while for small \(y_{\mathrm {b}}\) the momentum fractions must be approximately equal. In addition, for high transverse momenta of the jets, x values are probed above 0.1, where the proton PDFs are less precisely known.
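The LO relation can be made concrete with a short numerical sketch. The function name and the example kinematics are assumptions chosen for illustration; only the formula itself is taken from the text.

```python
import math


def lo_momentum_fractions(pt, y1, y2, sqrt_s=8000.0):
    """LO parton momentum fractions (x1, x2); pt and sqrt_s in GeV."""
    x1 = pt / sqrt_s * (math.exp(y1) + math.exp(y2))
    x2 = pt / sqrt_s * (math.exp(-y1) + math.exp(-y2))
    return x1, x2


# Hypothetical central event (y_b = 0): both fractions are equal
print(lo_momentum_fractions(500.0, 0.0, 0.0))
# Hypothetical boosted event (y_b = 2): one large and one small fraction
print(lo_momentum_fractions(500.0, 2.0, 2.0))
```

For jets with \(p_{\mathrm {T}} = 500\,\text {GeV}\) at \(\sqrt{s} = 8\,\text {TeV}\), the central configuration gives \(x_1 = x_2 = 0.125\), whereas the boosted one pairs a fraction close to unity with one of the order of \(10^{-2}\).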

The decomposition of the dijet cross section into the contributing partonic subprocesses is shown in Fig. 2 at next-to-leading order (NLO) accuracy, obtained using the NLOJet++ program version 4.1.3 [14, 15]. At small \(y_{\mathrm {b}}\) and large \(p_{\mathrm {T,avg}}\) a significant portion of the cross section corresponds to quark-quark (and small amounts of antiquark-antiquark) scattering with varying shares of equal- or unequal-type quarks. In contrast, for large \(y_{\mathrm {b}}\) more than 80% of the cross section corresponds to partonic subprocesses with at least one gluon participating in the interaction. As a consequence, new information about the PDFs can be derived from the measurement of the triple-differential dijet cross section.

The data were collected with the CMS detector at \(\sqrt{s} = 8\,\text {TeV} \) and correspond to an integrated luminosity of 19.7\(\,\text {fb}^{-1}\). The measured cross section is corrected for detector effects and is compared to NLO calculations in pQCD, complemented with electroweak (EW) and nonperturbative (NP) corrections. Furthermore, constraints on the PDFs are studied and the strong coupling constant \(\alpha _S (M_\mathrm {Z})\) is inferred.

Fig. 2 Relative contributions of all subprocesses to the total cross section at NLO as a function of \(p_{\mathrm {T,avg}}\) in the various \(y^{*}\) and \(y_{\mathrm {b}}\) bins. The subprocess contributions are grouped into seven categories according to the type of the incoming partons. The calculations have been performed with NLOJet++. The notation implies the sum over initial-state parton flavors as well as interchanged quarks and antiquarks

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6\(\text {\,m}\) internal diameter, providing a magnetic field of 3.8\(\text {\,T}\). Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. The silicon tracker measures charged particles within the pseudorapidity range \(|\eta | < 2.5\). It consists of 1440 silicon pixel and 15 148 silicon strip detector modules. The ECAL consists of 75 848 lead tungstate crystals, which provide coverage in pseudorapidity \(|\eta | < 1.48\) in a barrel region and \(1.48< |\eta | < 3.0\) in two endcap regions. In the region \(|\eta | < 1.74\), the HCAL cells have widths of 0.087 in pseudorapidity and 0.087 in azimuth (\(\phi \)). In the \(\eta \)-\(\phi \) plane, and for \(|\eta | < 1.48\), the HCAL cells map on to \(5\times {}5\) arrays of ECAL crystals to form calorimeter towers projecting radially outwards from close to the nominal interaction point. For \(|\eta | > 1.74\), the coverage of the towers increases progressively to a maximum of 0.174 in \(\varDelta \eta \) and \(\varDelta \phi \). Within each tower, the energy deposits in ECAL and HCAL cells are summed to define the calorimeter tower energies, subsequently used to provide the energies and directions of hadronic jets. The forward hadron (HF) calorimeter extends the pseudorapidity coverage provided by the barrel and endcap detectors and uses steel as an absorber and quartz fibers as the sensitive material. The two halves of the HF are located 11.2\(\text {\,m}\) from the interaction region, one on each end, and together they provide coverage in the range \(3.0< |\eta | < 5.2\). Muons are measured in gas-ionisation detectors embedded in the steel flux-return yoke outside the solenoid.

A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [16].

3 Event reconstruction and selection

Dijet events are collected using five single-jet high-level triggers [17, 18], which require at least one jet with \(p_{\mathrm {T}}\) larger than 80, 140, 200, 260, and 320\(\,\text {GeV}\), respectively. At trigger level the jets are reconstructed with a simplified version of the particle-flow (PF) event reconstruction described in the following paragraph. All but the highest threshold trigger were prescaled in the 2012 LHC run. The triggers are employed in mutually exclusive regions of the \(p_{\mathrm {T,avg}}\) spectrum, cf. Table 1, in which their efficiency exceeds 99%.

Table 1 List of single-jet trigger thresholds used in the analysis

The PF event algorithm reconstructs and identifies particle candidates with an optimised combination of information from the various elements of the CMS detector [19]. The energy of photons is directly obtained from the ECAL measurement, corrected for zero-suppression effects. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies. The leading primary vertex (PV) is chosen as the one with the highest sum of squares of all associated track transverse momenta. The remaining vertices are classified as pileup vertices, which result from additional proton-proton collisions. To reduce the background caused by such additional collisions, charged hadrons within the coverage of the tracker, \(|\eta | < 2.5\) [20], that unambiguously originate from a pileup vertex are removed.

Hadronic jets are clustered from the reconstructed particles with the infrared- and collinear-safe anti-\(k_{\mathrm {T}}\) algorithm [21] with a jet size parameter R of 0.7, which is the default for CMS jet measurements. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found in the simulation to be within 5–10% of the true momentum over the whole \(p_{\mathrm {T}}\) range. Jet energy corrections (JEC) are derived from the simulation, and are confirmed with in situ measurements of the energy balance of dijet, photon+jet, and Z boson+jet events [22, 23]. After applying the usual jet energy corrections, a small bias in the reconstructed pseudorapidity of the jets is observed at the edge of the tracker. An additional correction removes this effect.

All events are required to have at least one PV that must be reconstructed from four or more tracks. The longitudinal and transverse distances of the PV to the nominal interaction point of CMS must satisfy \(|z_\mathrm {PV}| < 24 \,\text {cm} \) and \(\rho _\mathrm {PV} < 2 \,\text {cm} \), respectively. Nonphysical jets are removed by loose jet identification criteria: each jet must contain at least two PF candidates, one of which is a charged hadron, and the jet energy fraction carried by neutral hadrons and photons must be less than 99%. These criteria remove less than 1% of genuine jets.

Only events with at least two jets up to an absolute rapidity of \(|y|=5.0\) are selected and the two jets leading in \(p_{\mathrm {T}}\) are required to have transverse momenta greater than 50\(\,\text {GeV}\) and \(|y| < 3.0\). The missing transverse momentum is defined as the negative vector sum of the transverse momenta of all PF candidates in the event. Its magnitude is referred to as \(p_{\mathrm {T}} ^\text {miss}\). For consistency with previous jet measurements by CMS, \(p_{\mathrm {T}} ^\text {miss}\) is required to be smaller than 30% of the scalar sum of the transverse momenta of all PF candidates. For dijet events, which exhibit very little \(p_{\mathrm {T}}\) imbalance, the impact is practically negligible.
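The kinematic part of this selection can be summarised in a few lines of Python. This is a sketch under simplifying assumptions: the event representation (a list of `(pt, y)` tuples plus two scalar quantities) and the function name are hypothetical, and the vertex and jet identification criteria described above are not included.

```python
def passes_dijet_selection(jets, pt_miss, sum_pt):
    """Kinematic dijet selection.

    jets    -- list of (pt, y) tuples in GeV and rapidity units
    pt_miss -- magnitude of the missing transverse momentum in GeV
    sum_pt  -- scalar pT sum of all PF candidates in GeV
    """
    # Consider only jets up to |y| = 5.0, ordered by decreasing pT
    accepted = sorted((j for j in jets if abs(j[1]) <= 5.0),
                      key=lambda j: j[0], reverse=True)
    if len(accepted) < 2:
        return False
    # The two leading jets must have pT > 50 GeV and |y| < 3.0
    for pt, y in accepted[:2]:
        if pt <= 50.0 or abs(y) >= 3.0:
            return False
    # Missing transverse momentum below 30% of the scalar pT sum
    return pt_miss < 0.3 * sum_pt


print(passes_dijet_selection([(120.0, 0.5), (90.0, -1.2), (30.0, 4.0)], 10.0, 260.0))
```

Note that the rapidity requirement is applied to the two leading jets after ordering, so an event whose second-leading jet lies at \(|y| \ge 3.0\) is rejected rather than replaced by a softer central jet.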

4 Measurement of the triple-differential dijet cross section

The triple-differential cross section for dijet production is defined as

$$\begin{aligned} \frac{\mathrm {d}^3 \sigma }{\mathrm {d}p_{\mathrm {T,avg}} \mathrm {d}y^{*} \mathrm {d}y_{\mathrm {b}}} = \frac{1}{\epsilon \mathcal {L}_{\mathrm {int}}^\mathrm {eff}} \frac{N}{\varDelta p_{\mathrm {T,avg}} \varDelta y^{*} \varDelta y_{\mathrm {b}}}, \end{aligned}$$

where N denotes the number of dijet events within a given bin, \(\mathcal {L}_{\mathrm {int}}^{\mathrm {eff}}\) the effective integrated luminosity, and \(\epsilon \) the product of trigger and event selection efficiencies, which are greater than 99% in the phase space of the measurement. Contributions from background processes, such as \(\mathrm {t}\overline{\mathrm {t}}\) production, are several orders of magnitude smaller and are neglected. The bin widths are \(\varDelta p_{\mathrm {T,avg}} \), \(\varDelta y^{*} \), and \(\varDelta y_{\mathrm {b}} \).
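As a numerical illustration of this definition, the following sketch converts an event count into a differential cross section. The function name and all input values are hypothetical; in particular, the effective luminosity of a prescaled trigger would be smaller than the full 19.7\(\,\text {fb}^{-1}\).

```python
def triple_differential_xsec(n_events, efficiency, lumi_eff,
                             d_pt_avg, d_y_star, d_y_boost):
    """Cross section per bin volume; units follow lumi_eff (pb^-1 -> pb/GeV)."""
    bin_volume = d_pt_avg * d_y_star * d_y_boost
    return n_events / (efficiency * lumi_eff * bin_volume)


# Hypothetical bin: 19700 events, 99% efficiency, 19700 pb^-1,
# bin widths of 10 GeV in pT_avg and 1.0 in both y* and y_b
print(triple_differential_xsec(19700, 0.99, 19700.0, 10.0, 1.0, 1.0))
```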

The cross section is unfolded to the stable-particle level (lifetime \(c\tau > 1\,\text {cm} \)) to correct for detector resolution effects. The iterative D’Agostini algorithm with early stopping [24,25,26], as implemented in the RooUnfold package [27], is employed for the unfolding. The response matrix, which relates the particle-level distribution to the measured distribution at detector level, is derived using a forward smearing technique. An NLOJet++ prediction, obtained with CT14 PDFs [28] and corrected for NP and EW effects, is approximated by a continuous function to represent the distribution at particle level. Subsequently, pseudoevents are distributed uniformly in \(p_{\mathrm {T,avg}}\) and weighted according to the theoretical prediction. These weighted events are smeared using the jet \(p_{\mathrm {T}}\) resolution to yield a response matrix and a prediction at detector level. By using large numbers of such pseudoevents, statistical fluctuations in the response matrix are strongly suppressed.
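The forward smearing idea can be sketched with the Python standard library alone. This is a simplified illustration, not the procedure used in the analysis: \(p_{\mathrm {T,avg}}\) is smeared directly instead of the individual jets, the power-law spectrum and the constant 8% resolution are placeholder assumptions, and migrations from outside the generated range are ignored.

```python
import bisect
import random


def build_response(edges, spectrum, rel_resolution, n_pseudo=100_000, seed=1):
    """Return R[i][j], the probability to reconstruct bin i given true bin j."""
    rng = random.Random(seed)
    n_bins = len(edges) - 1
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    truth = [0.0] * n_bins
    for _ in range(n_pseudo):
        pt_true = rng.uniform(edges[0], edges[-1])   # pseudoevents uniform in pT
        weight = spectrum(pt_true)                   # weighted by the prediction
        pt_reco = rng.gauss(pt_true, rel_resolution(pt_true) * pt_true)
        j = min(bisect.bisect_right(edges, pt_true) - 1, n_bins - 1)
        truth[j] += weight
        i = bisect.bisect_right(edges, pt_reco) - 1
        if 0 <= i < n_bins:                          # otherwise smeared out of range
            counts[i][j] += weight
    return [[counts[i][j] / truth[j] for j in range(n_bins)]
            for i in range(n_bins)]


edges = [100.0, 150.0, 200.0, 300.0, 400.0]          # hypothetical binning in GeV
response = build_response(edges, lambda pt: pt ** -5, lambda pt: 0.08)
for row in response:
    print(["%.3f" % value for value in row])
```

Generating the pseudoevents uniformly and carrying the steeply falling spectrum as a weight keeps the high-\(p_{\mathrm {T}}\) bins as well populated as the low-\(p_{\mathrm {T}}\) ones, which is what suppresses statistical fluctuations in the matrix.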

The jet energy (or \(p_{\mathrm {T}} \)) resolution (JER) is determined from the CMS detector simulation based on the Geant4 toolkit [29] and the pythia 6.4 Monte Carlo (MC) event generator [30] and is corrected for residual differences between data and simulation following Ref. [23]. The rapidity dependence of both the JER from simulation and the residual differences has been taken into account. The Gaussian \(p_{\mathrm {T}}\) resolution in the interval \(|y|<1\) is about 8% at 100\(\,\text {GeV}\) and improves to 5% at 1\(\,\text {TeV}\). Non-Gaussian tails in the JER, exhibited for jet rapidities close to \(|y|=3\), are included in a corresponding uncertainty.

The regularisation strength of the iterative unfolding procedure is defined through the number of iterations, whose optimal value is determined by performing a \(\chi ^2\) test between the original measured data and the unfolded data after smearing with the response matrix. The values obtained for \(\chi ^2\) per number of degrees of freedom, \(n_{\mathrm {dof}}\), in these comparisons approach unity in four iterations and thereafter decrease slowly for additional iterations. The optimal number of iterations is therefore determined to be four. The procedure is in agreement with the criteria of Ref. [31]. The response matrices derived in this manner for each bin in \(y^{*}\) and \(y_{\mathrm {b}}\) are nearly diagonal. A cross check using the pythia 6 MC event generator as theory and the detector simulation to construct the response matrices revealed no discrepancies compared to the baseline result.
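The interplay between the number of iterations and the refolding test can be illustrated with a standalone sketch of the iterative D'Agostini update. It does not use the RooUnfold interface; the function names, the \(2\times 2\) response matrix, and the input spectra are hypothetical, and the \(\chi ^2\) uses a simple Poisson variance estimate.

```python
def dagostini_unfold(response, measured, prior, n_iter):
    """Iterative D'Agostini unfolding, stopped after n_iter iterations.

    response[i][j] is the probability to observe bin i given true bin j.
    """
    n_reco, n_true = len(response), len(prior)
    eff = [sum(response[i][j] for i in range(n_reco)) for j in range(n_true)]
    truth = list(prior)
    for _ in range(n_iter):
        folded = [sum(response[i][j] * truth[j] for j in range(n_true))
                  for i in range(n_reco)]
        # Bayes update: the current estimate acts as the prior
        truth = [truth[j] / eff[j]
                 * sum(response[i][j] * measured[i] / folded[i]
                       for i in range(n_reco) if folded[i] > 0)
                 for j in range(n_true)]
    return truth


def refolded_chi2(response, unfolded, measured):
    """Chi2 between the data and the unfolded spectrum after re-smearing."""
    chi2 = 0.0
    for i, row in enumerate(response):
        folded = sum(r * t for r, t in zip(row, unfolded))
        chi2 += (measured[i] - folded) ** 2 / max(measured[i], 1.0)
    return chi2


response = [[0.9, 0.1], [0.1, 0.9]]     # hypothetical, nearly diagonal
measured = [950.0, 550.0]               # response applied to [1000, 500]
for n_iter in (1, 4, 20):
    unfolded = dagostini_unfold(response, measured, [750.0, 750.0], n_iter)
    print(n_iter, unfolded, refolded_chi2(response, unfolded, measured))
```

In this noise-free toy the refolded \(\chi ^2\) simply falls towards zero; with real data, statistical fluctuations are amplified by further iterations, which is why the procedure is stopped early.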

Migrations into and out of the accepted phase space in \(y^{*}\) and \(y_{\mathrm {b}}\) or between bins happen only at a level below 5%. The net effect of these migrations has been included in the respective response matrices and has been cross checked successfully using a 3-dimensional unfolding.

As a consequence of these migrations, small statistical correlations between neighbouring bins of the unfolded cross sections are introduced during the unfolding procedure. The statistical uncertainties, after being propagated through the unfolding, are smaller than 1% in the majority of the phase space and amount to up to 20% at the highest \(p_{\mathrm {T,avg}}\).

The dominant systematic uncertainties in the cross section measurement arise from uncertainties in the JEC. Summing all JEC uncertainties in quadrature according to the prescription given in Ref. [23], the total JEC uncertainty amounts to about 2.5% in the central region and increases to 12% in the forward regions. The 2.6% uncertainty in the integrated luminosity [32] is directly propagated to the cross section. The uncertainty in the JER enters the measurement through the unfolding procedure and results in an additional uncertainty of 1–2% of the unfolded cross section. Non-Gaussian tails in the detector response to jets near \(|y| = 3.0\), the maximal absolute rapidity considered in this measurement, are responsible for an additional uncertainty of up to 2%. Residual effects of small inefficiencies in the jet identification and trigger selection are covered by an uncorrelated uncertainty of 1% [11]. The total systematic experimental uncertainty ranges from about 3 to 8% in the central detector region and reaches up to 12% for absolute rapidities near the selection limit of 3.0. Figure 3 depicts all experimental uncertainties as well as the total uncertainty, which is calculated as the quadratic sum of all the contributions from the individual sources.
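The quadratic sum can be checked with the representative numbers quoted above. The sketch below assumes a central-region bin with a 2.5% JEC, 2.6% luminosity, 1% JER, and 1% uncorrelated uncertainty; the function name is hypothetical and the non-Gaussian tail contribution, relevant only near \(|y| = 3.0\), is left out.

```python
import math


def total_uncertainty(*relative_uncertainties):
    """Combine independent relative uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


# JEC, integrated luminosity, JER, uncorrelated residual effects
central = total_uncertainty(0.025, 0.026, 0.01, 0.01)
print(f"{100 * central:.1f}%")  # prints 3.9%
```

The result lies at the lower end of the 3 to 8% range quoted for the central detector region, as expected for the smallest individual contributions.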

Fig. 3 Overview of all experimental uncertainties affecting the cross section measurement in six bins of \(y_{\mathrm {b}}\) and \(y^{*}\). The error bars indicate the statistical uncertainty after unfolding. The different lines show the uncertainties resulting from jet energy corrections, jet energy resolution, integrated luminosity, non-Gaussian tails in the resolution, and from residual effects included in the uncorrelated uncertainty. The total uncertainty is obtained by adding all uncertainties in quadrature

5 Theoretical predictions

The NLO predictions for the triple-differential dijet cross section are calculated using NLOJet++ within the framework of fastNLO version 2.1 [33, 34]. The renormalisation and factorisation scales \(\mu _\text {r}\) and \(\mu _\text {f}\) are both set to \(\mu =\mu _0 =p_{\mathrm {T,max}} \cdot e^{0.3 y^{*}}\), a scale choice first investigated in Ref. [35]. The variation of these scales by constant factors as described below is conventionally used to estimate the effect of missing higher orders. With this choice of \(\mu _0\), the scale uncertainty in regions with large values of \(y_{\mathrm {b}}\) is reduced compared to a prediction with \(\mu _0 =p_{\mathrm {T,avg}}\). The predictions for cross sections obtained with different central scale choices are compatible within the scale uncertainties. The calculation is performed using the PDF sets CT14, ABM11 [36], MMHT2014 [37], and NNPDF 3.0 [38] at next-to-leading evolution order, which are accessed via the LHAPDF 6.1.6 interface [39, 40] using the respective values of \(\alpha _S (M_\mathrm {Z})\) and the supplied \(\alpha _S \) evolution. The size of the NLO correction is shown in Fig. 4 top left and varies between \(+10\)% and \(+30\)% at high \(p_{\mathrm {T,avg}}\) and low \(y_{\mathrm {b}}\).

The fixed-order calculations are accompanied by NP corrections, \(c_k^\mathrm {NP}\), derived from the LO MC event generators pythia 8.185 [41] and herwig++ 2.7.0 [42] with the tunes CUETP8M1 [43] and UE-EE-5C [44], respectively, and the NLO MC generator powheg [45,46,47,48] in combination with pythia 8 and the tunes CUETP8M1 and CUETP8S1 [43].

The correction factor \(c_{k}^{\mathrm {NP}}\) is defined as the ratio between the nominal cross section with and without multiple parton interactions (MPI) and hadronisation (HAD) effects

$$\begin{aligned} c_{k}^{\mathrm {NP}} = \frac{\sigma _{k}^{\mathrm {PS+HAD+MPI}}}{\sigma _{k}^{\mathrm {PS}}}\,, \end{aligned}$$

where the superscript indicates the steps in the simulation: the parton shower (PS), the MPI, and the hadronisation. The corresponding correction factor, as displayed in Fig. 4 bottom, is applied in each bin k to the parton-level NLO cross section. It differs from unity by about \(+10\)% for lowest \(p_{\mathrm {T,avg}}\) and becomes negligible above 1\(\,\text {TeV}\).

To account for differences among the correction factors obtained by using herwig++, pythia 8, and powheg+pythia 8, half of the envelope of all these predictions is taken as the uncertainty and the centre of the envelope is used as the central correction factor.
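The envelope prescription amounts to taking the midpoint and half-width of the range spanned by the individual generator predictions. A minimal sketch follows, with a hypothetical function name and invented correction factors for a single bin \(k\):

```python
def np_correction(factors):
    """Return (central value, uncertainty) from the envelope of NP factors."""
    low, high = min(factors), max(factors)
    centre = 0.5 * (high + low)        # centre of the envelope
    half_width = 0.5 * (high - low)    # half of the envelope
    return centre, half_width


# Hypothetical factors from three generator setups in one low-pT bin
centre, uncertainty = np_correction([1.08, 1.10, 1.12])
print(centre, uncertainty)
```

With this definition the central value need not coincide with any single generator prediction; only the two extreme predictions in each bin determine the result.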

The contribution from EW effects, which arise mainly from virtual exchanges of massive W and Z bosons, is relevant at high jet \(p_{\mathrm {T}}\) and central rapidities [49, 50]. These corrections, shown in Fig. 4 top right, are smaller than 3% below 1\(\,\text {TeV}\) and reach 8% for the highest \(p_{\mathrm {T,avg}}\). Theoretical uncertainties in this correction due to its renormalisation scheme and indirect PDF dependence are considered to be negligible.

Fig. 4 Overview of the theoretical correction factors. For each of the six analysis bins the NLO QCD (top left), the electroweak (top right), and the NP correction factor (bottom) are shown as a function of \(p_{\mathrm {T,avg}}\). The NLO QCD correction has been derived with the same NLO PDF in numerator and denominator and is included in the NLO prediction by NLOJet++

The total theoretical uncertainty is obtained as the quadratic sum of NP, scale, and PDF uncertainties. The scale uncertainties are calculated by varying \(\mu _\text {r}\) and \(\mu _\text {f}\) using multiplicative factors in the following six combinations: \((\mu _\text {r}/\mu _0, \mu _\text {f}/\mu _0) = (1/2, 1/2)\), \((1/2, 1)\), \((1, 1/2)\), \((1, 2)\), \((2, 1)\), and \((2, 2)\). The uncertainty is determined as the maximal upwards and downwards variation with respect to the cross section obtained with the nominal scale setting [51, 52]. The PDF uncertainties are evaluated according to the NNPDF 3.0 prescription as the standard deviation from the average prediction. Figure 5 shows the relative size of the theoretical uncertainties for the phase-space regions studied. The scale uncertainty dominates in the low-\(p_{\mathrm {T,avg}}\) region. At high \(p_{\mathrm {T,avg}}\), and especially in the boosted region, the PDFs become the dominant source of uncertainty. In total, the theoretical uncertainty increases from about 2% at low \(p_{\mathrm {T,avg}}\) to at least 10% and up to more than 30% for the highest accessed transverse momenta and rapidities.
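The six-point scale variation can be sketched as follows. The cross section is represented by a toy callable with an invented scale dependence, standing in for a real NLO calculation; the function names are hypothetical, and `pt_max` is assumed to denote the transverse momentum of the leading jet.

```python
import math

# (mu_r / mu_0, mu_f / mu_0) combinations used for the variation
SCALE_FACTORS = [(0.5, 0.5), (0.5, 1.0), (1.0, 0.5),
                 (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]


def central_scale(pt_max, y_star):
    """Nominal scale mu_0 = pT_max * exp(0.3 * y_star)."""
    return pt_max * math.exp(0.3 * y_star)


def scale_uncertainty(xsec, mu0):
    """Maximal relative up and down shifts of xsec(mu_r, mu_f) around mu0."""
    nominal = xsec(mu0, mu0)
    shifts = [xsec(kr * mu0, kf * mu0) - nominal for kr, kf in SCALE_FACTORS]
    up = max(0.0, max(shifts)) / nominal
    down = min(0.0, min(shifts)) / nominal
    return up, down


def toy_xsec(mu_r, mu_f):
    """Toy stand-in: mild power-law dependence on mu_r only (assumption)."""
    return 100.0 * (500.0 / mu_r) ** 0.1


mu0 = central_scale(500.0, 0.0)
print(scale_uncertainty(toy_xsec, mu0))
```

The combinations \((1/2, 2)\) and \((2, 1/2)\) are absent from the list, so the two scales are never varied in opposite directions.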

Fig. 5 Overview of the theoretical uncertainties. The scale uncertainty dominates in the low-\(p_{\mathrm {T,avg}}\) region. At high \(p_{\mathrm {T,avg}}\), and especially in the boosted region, the PDFs become the dominant source of uncertainty

6 Results

The triple-differential dijet cross section is presented in Fig. 6 as a function of \(p_{\mathrm {T,avg}}\) for six phase-space regions in \(y^{*}\) and \(y_{\mathrm {b}}\). The theoretical predictions are found to be compatible with the unfolded cross section over a wide range of the investigated phase space.

Fig. 6 The triple-differential dijet cross section in six bins of \(y^{*}\) and \(y_{\mathrm {b}}\). The data are indicated by different markers for each bin. The theoretical predictions, obtained with NLOJet++ and NNPDF 3.0, and complemented with EW and NP corrections, are depicted by solid lines. Apart from the boosted region, the data are well described by the predictions at NLO accuracy over many orders of magnitude

The ratios of the measured cross section to the theoretical predictions from various global PDF sets are shown in Fig. 7. The data are well described by the predictions using the CT14, MMHT2014, and NNPDF 3.0 PDF sets in most of the analysed phase space. In the boosted regions (\(y_{\mathrm {b}} \ge 1\)) differences between data and predictions are observed at high \(p_{\mathrm {T,avg}}\), where the less well known high-x region of the PDFs is probed. In this boosted dijet topology, the predictions exhibit large PDF uncertainties, as can be seen in Fig. 5. The significantly smaller uncertainties of the data in that region indicate their potential to constrain the PDFs.

Predictions using the ABM11 PDFs systematically underestimate the data for \(y_{\mathrm {b}} <2.0\). This behaviour has been observed previously [53] and can be traced back to a soft gluon PDF accompanied by a low value of \(\alpha _S (M_\mathrm {Z})\).

Figure 8 presents the ratios of the data to the predictions of the powheg+pythia 8 and herwig 7.0.3 [54] NLO MC event generators. Significant differences between the predictions of the two MC event generators are observed. However, the scale definitions and the PDF sets are different: the CT10 and MMHT2014 PDF sets are used for powheg and herwig 7, respectively. In general, herwig 7 describes the data better in the central region, whereas powheg prevails in the boosted region.

Fig. 7 Ratio of the triple-differential dijet cross section to the NLOJet++ prediction using the NNPDF 3.0 set. The data points including statistical uncertainties are indicated by markers, the systematic experimental uncertainty is represented by the hatched band. The solid band shows the PDF, scale, and NP uncertainties quadratically added; the solid and dashed lines give the ratios calculated with the predictions for different PDF sets

Fig. 8 Ratio of the triple-differential dijet cross section to the NLOJet++ prediction using the NNPDF 3.0 set. The data points including statistical uncertainties are indicated by markers, the systematic experimental uncertainty is represented by the hatched band. The solid band shows the PDF, scale, and NP uncertainties quadratically added. The predictions of the NLO MC event generators powheg+pythia 8 and herwig 7 are depicted by solid and dashed lines, respectively

7 PDF constraints and determination of the strong coupling constant

The constraints of the triple-differential dijet measurement on the proton PDFs are demonstrated by including the cross section in a PDF fit with inclusive measurements of deep-inelastic scattering (DIS) from the H1 and ZEUS experiments at the HERA collider [55]. The fit is performed with the open-source fitting framework xFitter version 1.2.2 [56]. The PDF evolution is based on the Dokshitzer–Gribov–Lipatov–Altarelli–Parisi (DGLAP) evolution equations [57,58,59] as implemented in the QCDNUM 17.01.12 package [60]. To ensure consistency between the HERA DIS and the dijet cross section calculations, the fits are performed at NLO.

The analysis is based on similar studies of inclusive jet data at 7\(\,\text {TeV}\) [53] and 8\(\,\text {TeV}\) [61], and all settings were chosen in accordance with the inclusive jet study at 8\(\,\text {TeV}\) [61]. The parameterisation of the PDFs is defined at the starting scale \(Q_0^2 = 1.9\,\text {GeV} ^2 \). The five independent PDFs \(xu_v(x)\), \(xd_v(x)\), \(xg(x)\), \(x\overline{U}(x)\), and \(x\overline{D}(x)\) represent the u and d valence quarks, the gluon, and the up- and down-type sea quarks and are parameterised as follows:

$$\begin{aligned} xg(x)&= A_g x^{B_g} (1-x)^{C_g} - A_g' x^{B_g'}(1-x)^{C_g'}\,, \end{aligned} \tag{1}$$
$$\begin{aligned} xu_v(x)&= A_{u_{v}} x^{B_{u_{v}}} (1-x)^{C_{u_{v}}}(1 + D_{u_{v}}x+E_{u_{v}}x^2)\,,\end{aligned} \tag{2}$$
$$\begin{aligned} xd_v(x)&= A_{d_v} x^{B_{d_v}} (1-x)^{C_{d_{v}}}(1 + D_{d_{v}}x)\,,\end{aligned} \tag{3}$$
$$\begin{aligned} x\overline{U}(x)&= A_{\overline{U}} x^{B_{\overline{U}}} (1-x)^{C_{\overline{U}}}(1 + D_{\overline{U}}x)\,,\end{aligned} \tag{4}$$
$$\begin{aligned} x\overline{D}(x)&= A_{\overline{D}} x^{B_{\overline{D}}} (1-x)^{C_{\overline{D}}}\,, \end{aligned} \tag{5}$$

where \(x{\overline{U}}(x) = x{\overline{u}}(x)\), and \(x{\overline{D}}(x) = x{\overline{d}}(x) + x{\overline{s}}(x)\).
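The functional forms of Eqs. (1)–(5) are straightforward to evaluate numerically. In the sketch below the function names are hypothetical and every parameter value is an arbitrary placeholder chosen for illustration, not a fit result.

```python
def xg(x, A, B, C, A_prime, B_prime, C_prime):
    """Gluon form of Eq. (1): difference of two power-law terms."""
    return (A * x ** B * (1 - x) ** C
            - A_prime * x ** B_prime * (1 - x) ** C_prime)


def xq(x, A, B, C, D=0.0, E=0.0):
    """Common quark form of Eqs. (2)-(5); D and E default to zero."""
    return A * x ** B * (1 - x) ** C * (1 + D * x + E * x ** 2)


# Placeholder parameters (assumptions), evaluated at a few x values
for x in (1e-3, 0.1, 0.5):
    print(x,
          xg(x, 3.0, -0.1, 8.0, 1.0, -0.2, 25.0),
          xq(x, 4.0, 0.7, 4.5, D=2.0, E=10.0))
```

The \((1-x)^{C}\) factors force every distribution to vanish as \(x \rightarrow 1\), while the \(x^{B}\) factors control the low-x behaviour; the subtracted term in Eq. (1) adds flexibility to the gluon at low x.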

In these equations, the normalisation parameters \(A_g\), \(A_{u_{v}}\), and \(A_{d_{v}}\) are fixed using QCD sum rules. The constraints \(B_{\overline{U}}=B_{\overline{D}}\) and \(A_{\overline{U}} = A_{\overline{D}}(1-f_s)\) are imposed to ensure the same normalisation for the \(\overline{U}\) and \(\overline{D}\) PDFs in the \(x \rightarrow 0\) region. The strange quark PDF is defined to be a fixed fraction \(f_s = 0.31\) of \(x\overline{D}(x)\). The generalised-mass variable-flavour number scheme as described in Refs. [62, 63] is used and the strong coupling constant is set to \(\alpha _S (M_\mathrm {Z}) = 0.1180\). The set of parameters in Eqs. (1)–(5) is chosen by first performing a fit where all D and E parameters are set to zero. Further parameters are then included in this set one at a time. The improvement of \(\chi ^2\) of the fit is monitored and the procedure is stopped when no further improvement is observed. This leads to a 16-parameter fit. Due to differences in the sensitivity of the various PDFs to dijet and inclusive jet data, the parameterisation of the present analysis differs from that in Ref. [61]. In particular, the constraint \(B_{d_v}=B_{u_v}\) at the starting scale has been released. This results in a d valence quark distribution consistent with the results obtained in Ref. [61] and in a similar CMS analysis of muon charge asymmetry in W boson production at 8\(\,\text {TeV}\) [64].

The PDF uncertainties are determined using the HERAPDF method [55, 56] with uncertainties subdivided into the three categories of experimental, model, and parameterisation uncertainty, which are evaluated separately and added in quadrature to obtain the total PDF uncertainty.

Experimental uncertainties originate from statistical and systematic uncertainties in the data and are propagated to the PDFs using the Hessian eigenvector method [65] and a tolerance criterion of \(\varDelta \chi ^2 = + 1\). Alternatively, the Monte Carlo method [66] is used to determine the PDF fit uncertainties and similar results are obtained.

The uncertainties in several input parameters of the PDF fits are combined into one model uncertainty, which is evaluated by varying these parameters. The strangeness fraction is chosen in agreement with Ref. [67] to be \(f_s=0.31\) and is varied between 0.23 and 0.39. Following Ref. [55], the b quark mass, set to \(4.5 \,\text {GeV} \), is varied between 4.25 and \(4.75\,\text {GeV} \). Similarly, the c quark mass, set by default to \(1.47\,\text {GeV} \), is varied between 1.41 and \(1.53\,\text {GeV} \). The minimum \(Q^2\) imposed on the HERA DIS data is set in accordance with the CMS inclusive jet analysis described in Ref. [53] to \(Q^2_\mathrm {min}=7.5\,\text {GeV} ^2 \), and is varied between \(Q^2_\mathrm {min} = 5.0\,\text {GeV} ^2 \) and \(10.0\,\text {GeV} ^2 \).

The parameterisation uncertainty is estimated by including additional parameters in the fit, leading to a more flexible functional form of the PDFs. Each parameter is successively added in the PDF fit, and the envelope of all changes to the central PDF fit result is taken as the parameterisation uncertainty. The increased flexibility of the PDFs while estimating the parameterisation uncertainty may lead to the seemingly paradoxical effect that, although new data are included, the total uncertainty can increase in regions where direct constraints from data are absent. This may happen at very low or at very high x, where the PDF is determined through extrapolation alone. Furthermore, the variation of the starting scale \(Q_0^2\) to 1.6 and 2.2\(\,\text {GeV} ^2\) is considered in this parameterisation uncertainty.

The quality of the resulting PDF fit with and without the dijet measurement is reported in Table 2. The partial \(\chi ^2\) per data point for each data set as well as the \(\chi ^2/n_\text {dof}\) for all data sets demonstrate the compatibility of the CMS dijet measurement and the DIS data from the H1 and ZEUS experiments in a combined fit.

Table 2 The partial \(\chi ^2\) (\(\chi ^2_\text {p}\)) for each data set in the HERA DIS (middle section) or the combined fit including the CMS triple-differential dijet data (right section) are shown. The bottom two lines show the total \(\chi ^2\) and \(\chi ^2/n_\text {dof}\). The difference between the sum of all \(\chi ^2_\text {p}\) and the total \(\chi ^2\) for the combined fit is attributed to the nuisance parameters

The PDFs obtained for the gluon, u valence, d valence, and sea quarks are presented for a fit with and without the CMS dijet data in Fig. 9 for \(Q^2=10^4\,\text {GeV} ^2 \). The uncertainty in the gluon PDF is reduced over a large range in x with the largest impact in the high-x region, where some reduction in uncertainty can also be observed for the valence quark and the sea quark PDFs. For x values beyond \({\approx }\, 0.7\) or below \(10^{-3}\), the extracted PDFs are not directly constrained by data and should be considered as extrapolations that rely on PDF parameterisation assumptions alone.

The improvement in the uncertainty of the gluon PDF is accompanied by a noticeable change in shape, which is most visible when evolved to low scales as shown in Fig. 10. Compared to the fit with HERA DIS data alone, the gluon PDF shrinks at medium x and increases at high x. A similar effect has been observed before, e.g. in Ref. [53].

The PDFs are compared in Fig. 11 to those obtained with inclusive jet data at \(\sqrt{s} = 8\,\text {TeV} \) [61]. The shapes of the PDFs and the uncertainties are similar. Somewhat larger uncertainties in the valence quark distributions are observed in the fit using the dijet data with respect to those obtained from the inclusive jet cross section. This behaviour can be explained by a stronger sensitivity of the dijet data to the light quark distributions, which results in an increased flexibility of the PDF parameterisation, albeit at the cost of an increased uncertainty.

Fig. 9

The gluon (top left), sea quark (top right), d valence quark (bottom left), and u valence quark (bottom right) PDFs as a function of x as derived from HERA inclusive DIS data alone (hatched band) and in combination with CMS dijet data (solid band). The PDFs are shown at the scale \(Q^2 = 10^{4}\,\text {GeV} ^2 \) with their total uncertainties

Fig. 10

The gluon PDF as a function of x as derived from HERA inclusive DIS data alone (hatched band) and in combination with CMS dijet data (solid band). The PDF and its total uncertainty are shown at the starting scale \(Q^2 = 1.9\,\text {GeV} ^2 \) of the PDF evolution

Fig. 11

The gluon (top left), sea quark (top right), d valence quark (bottom left), and u valence quark (bottom right) PDFs as a function of x as derived from a fit of HERA inclusive DIS data in combination with CMS inclusive jet data (solid band) and CMS dijet data (hatched band) at 8 TeV. The PDFs are shown at the scale \(Q^2 = 10^{4}\,\text {GeV} ^2 \) with their total uncertainties

The measurement of the triple-differential dijet cross section provides constraints not only on the PDFs, but also on the strong coupling constant. Therefore, the PDF fit is repeated with an additional free parameter: the strong coupling constant \(\alpha _S (M_\mathrm {Z})\). The value obtained is

$$\begin{aligned} \alpha _S (M_\mathrm {Z}) = 0.1199\,\pm \,0.0015\,(\mathrm {exp})\,{}_{-0.0002}^{+0.0002}\,(\mathrm {mod})\,{}_{-0.0004}^{+0.0002}\,(\mathrm {par}), \end{aligned}$$

where the quoted experimental (exp) uncertainty accounts for all sources of uncertainties in the HERA and CMS data sets, as well as the NP uncertainties. The model (mod) and parameterisation (par) uncertainties are evaluated in the same way as in the PDF determination. The consideration of scale uncertainties in a global PDF fit is an open issue in the PDF community because it is unclear how to deal with the correlations in scale settings among the different measurements and observables. Therefore, they have not been taken into account in any global PDF fit so far, although an elaborate study of the effect of scale settings on dijet cross sections has been performed in Ref. [68], which also reports first combined PDF and \(\alpha _S (M_\mathrm {Z})\) fits using LHC inclusive jet data. Following Ref. [53], where the final uncertainties and correlations of CMS inclusive jet data at 7\(\,\text {TeV}\) are used in such combined fits, two different methods are studied to evaluate the effect of the scale uncertainty of the jet cross section on \(\alpha _S (M_\mathrm {Z})\). First, the renormalisation and factorisation scales are varied in the calculation of the dijet predictions, and the fit is repeated for each variation. The uncertainty is evaluated as detailed in Sect. 5 and yields \(\varDelta \alpha _S (M_\mathrm {Z}) = {}^{+0.0026}_{-0.0016}\,(\text {scale, refit})\).

The second procedure is analogous to the method applied by CMS in previous determinations of \(\alpha _S (M_\mathrm {Z})\) without simultaneous PDF fits, cf. Refs. [53, 61, 69, 70]. The PDFs are derived for a series of fixed values of \(\alpha _S (M_\mathrm {Z})\) and the nominal choice of \(\mu _\text {r}\) and \(\mu _\text {f}\). Using this series, the best-fit \(\alpha _S (M_\mathrm {Z})\) value for the dijet data is determined for each scale variation. Here, the evaluated uncertainty is \(\varDelta \alpha _S (M_\mathrm {Z}) = {}^{+0.0031}_{-0.0019}\,(\text {scale}, \alpha _S (M_\mathrm {Z}) \text { series})\).
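The text does not specify how the best-fit value is extracted from the series of fixed \(\alpha _S (M_\mathrm {Z})\) values. One common approach, assumed here purely for illustration, is to interpolate \(\chi ^2\) as a function of \(\alpha _S (M_\mathrm {Z})\) with a parabola around its minimum. The following Python sketch does this with a three-point parabola; the function name `parabolic_minimum` is hypothetical and the \(\chi ^2\) values are synthetic toy numbers, not results of the analysis.

```python
def parabolic_minimum(xs, ys):
    """Vertex of the parabola through the lowest grid point and its two neighbours."""
    i = min(range(len(ys)), key=ys.__getitem__)
    if i == 0 or i == len(ys) - 1:
        raise ValueError("minimum lies at the edge of the scanned series")
    x0, x1, x2 = xs[i - 1], xs[i], xs[i + 1]
    y0, y1, y2 = ys[i - 1], ys[i], ys[i + 1]
    num = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    den = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    return x1 - 0.5 * num / den

# Hypothetical series of fixed alpha_S(M_Z) values (toy grid)
alpha_s_series = [0.112 + 0.001 * k for k in range(15)]

# Synthetic chi2 values from a toy parabola with its minimum at 0.1187
TOY_MINIMUM = 0.1187
chi2_values = [100.0 + ((a - TOY_MINIMUM) / 0.002) ** 2 for a in alpha_s_series]

best_fit = parabolic_minimum(alpha_s_series, chi2_values)
print(f"best-fit alpha_S(M_Z) = {best_fit:.4f}")
```

In this picture, the procedure would be repeated once per scale variation, and the spread of the resulting best-fit values would give the scale uncertainty.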

Both results, \(\alpha _S (M_\mathrm {Z}) = 0.1199 ^{+0.0015}_{-0.0016}\) (all except scale) with \(^{+0.0026}_{-0.0016}\) (scale, refit) and \(^{+0.0031}_{-0.0019}\) (scale, \(\alpha _S (M_\mathrm {Z})\) series), are in agreement with Ref. [53], which reports \(\alpha _S (M_\mathrm {Z}) = 0.1192 ^{+0.0023}_{-0.0019}\) (all except scale) with \(^{+0.0022}_{-0.0009}\) (scale, refit) and \(^{+0.0024}_{-0.0039}\) (scale, \(\alpha _S (M_\mathrm {Z})\) series). As in Ref. [53], it is observed that the second procedure leads to somewhat larger scale uncertainties, because there is less freedom for compensating effects between different gluon distributions and the \(\alpha _S (M_\mathrm {Z})\) values. Since this latter uncertainty is the one most consistently comparable with previous fixed-PDF determinations of \(\alpha _S (M_\mathrm {Z})\), it is quoted as the main result. The dominant source of uncertainty is of theoretical origin and arises from missing higher-order corrections, whose effect is estimated by scale variations.
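The "all except scale" uncertainty quoted above is numerically consistent with adding the experimental, model, and parameterisation components in quadrature, separately for the upward and downward directions. The following Python sketch checks this arithmetic; the quadrature combination is an assumption made here for illustration, and the function name `combine_in_quadrature` is hypothetical.

```python
import math

def combine_in_quadrature(components):
    """Combine asymmetric (up, down) uncertainties in quadrature, per direction."""
    up = math.sqrt(sum(u ** 2 for u, _ in components))
    down = math.sqrt(sum(d ** 2 for _, d in components))
    return up, down

# (up, down) magnitudes as quoted in the text for alpha_S(M_Z)
exp_unc = (0.0015, 0.0015)  # experimental
mod_unc = (0.0002, 0.0002)  # model
par_unc = (0.0002, 0.0004)  # parameterisation

up, down = combine_in_quadrature([exp_unc, mod_unc, par_unc])
print(f"+{up:.4f} -{down:.4f}")  # prints "+0.0015 -0.0016"
```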

This value of \(\alpha _S (M_\mathrm {Z})\) is in agreement with the results from other measurements by CMS [53, 61, 69,70,71] and ATLAS [72], with the value obtained in a similar analysis complementing the DIS data of the HERAPDF2.0 fit with HERA jet data [55], and with the world average of \(\alpha _S (M_\mathrm {Z}) =0.1181\pm 0.0011\) [73]. In contrast to the other CMS results, this analysis is mainly focused on PDF constraints. The running of the strong coupling constant was tested only indirectly via the renormalisation group equations. No explicit test of the running was carried out by subdividing the phase space into regions corresponding to different values of the renormalisation scale.
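As a rough cross-check of the stated agreement with the world average, one can form a pull, i.e. the difference of the two values divided by their combined uncertainty. The sketch below assumes that all uncertainties are independent and Gaussian, and that the downward uncertainty of this measurement is the relevant one because the measured value lies above the world average. These are simplifying assumptions for illustration, not the procedure of the analysis, and the function name `pull` is hypothetical.

```python
import math

def pull(value_a, sigma_a, value_b, sigma_b):
    """Difference of two values in units of their combined (quadrature) uncertainty."""
    return (value_a - value_b) / math.sqrt(sigma_a ** 2 + sigma_b ** 2)

# Values quoted in the text
alpha_s_fit = 0.1199
sigma_down_non_scale = 0.0016  # "all except scale", downward
sigma_down_scale = 0.0019      # scale (alpha_S series), downward
alpha_s_world = 0.1181
sigma_world = 0.0011

# Assumption: the downward components combine in quadrature
sigma_fit_down = math.hypot(sigma_down_non_scale, sigma_down_scale)

p = pull(alpha_s_fit, sigma_fit_down, alpha_s_world, sigma_world)
print(f"pull = {p:.2f}")
```

Under these assumptions the pull is below one, in line with the agreement stated in the text.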

8 Summary

A measurement of the triple-differential dijet cross section is presented for \(\sqrt{s}=8\,\text {TeV} \). The data are found to be well described by NLO predictions corrected for nonperturbative and electroweak effects, except for highly boosted event topologies that suffer from large uncertainties in parton distribution functions (PDFs).

The precise data constrain the PDFs, especially in the highly boosted regime that probes the highest fractions x of the proton momentum carried by a parton. The impact of the data on the PDFs is demonstrated by performing a simultaneous fit to cross sections of deep-inelastic scattering obtained by the HERA experiments and the dijet cross section measured in this analysis. When including the dijet data, an increased gluon PDF at high x is obtained and the overall uncertainties of the PDFs, especially those of the gluon distribution, are significantly reduced. In contrast to a fit that uses inclusive jet data, this measurement carries more information on the valence-quark content of the proton such that a more flexible parameterisation is needed to describe the low-x behaviour of the u and d valence quark PDFs. This higher sensitivity is accompanied by slightly larger uncertainties in the valence quark distributions as a consequence of the greater flexibility in the parameterisation of the PDFs.

In a simultaneous fit, the strong coupling constant \(\alpha _S (M_\mathrm {Z})\) is extracted together with the PDFs. The value obtained at the mass of the Z boson is

$$\begin{aligned} \alpha _S (M_\mathrm {Z})&= 0.1199\, \pm {0.0015}\,(\mathrm {exp})\\&\quad \pm {0.0002}\,(\mathrm {mod}) \,{}_{-0.0004}^{+0.0002}\,(\mathrm {par})\, {}_{-0.0019}^{+0.0031}\,(\mathrm {scale})\\&= 0.1199\, \pm {0.0015}\,(\mathrm {exp})\, _{-0.0020}^{+0.0031}\,(\mathrm {theo}), \end{aligned}$$

and is in agreement with previous measurements at the LHC by CMS [53, 61, 69,70,71] and ATLAS [72], and with the world average value of \(\alpha _S (M_\mathrm {Z}) = 0.1181 \,\pm \, 0.0011\) [73]. The dominant uncertainty is theoretical in nature and is expected to be reduced significantly in the future using pQCD predictions at next-to-next-to-leading order [74].
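The combined theoretical (theo) uncertainty in the result above is numerically consistent with adding the model, parameterisation, and scale components in quadrature, separately for each direction. Assuming this combination, which the text does not spell out, the arithmetic reads

$$\begin{aligned} \varDelta _{\mathrm {theo}}^{+}&= \sqrt{0.0002^2 + 0.0002^2 + 0.0031^2} \approx 0.0031,\\ \varDelta _{\mathrm {theo}}^{-}&= \sqrt{0.0002^2 + 0.0004^2 + 0.0019^2} \approx 0.0020. \end{aligned}$$

The scale term dominates both directions, so the model and parameterisation components change the total only in the last quoted digit.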