1 Introduction

Experimentally, we can study partons (quarks and gluons) by analyzing jets (narrow, energetic sprays of particles) whose kinematic characteristics mirror those of an initiating parton that cannot be directly measured. By employing an appropriate jet definition, it becomes possible to establish a link between jet measurements obtained from clusters of hadrons and calculations performed on clusters of partons. In a more ambitious approach, it is conceivable to attempt jet tagging with a well-defined flavour label, thereby increasing the proportion of, for instance, gluon-tagged jets compared to quark-tagged jets. The capacity to differentiate quark jets from gluon jets on an event-by-event basis has the potential to considerably increase the scope and sensitivity of numerous new-physics studies at the Large Hadron Collider (LHC) [1,2,3,4,5,6]. This is because Beyond the Standard Model signals are often dominated by quarks while the corresponding Standard Model backgrounds are dominated by gluons [7, 8].

As well as proposing an observable that can distinguish quark jets and gluon jets [9,10,11,12,13,14,15,16,17,18,19,20], any quantitative analysis must also propose how to calibrate that observable by independently tagging quark and gluon jet samples. In some studies, this has been done by calibrating against Monte Carlo samples in which the “truth” flavour of the jet is known. However, one might worry about whether event generators make sufficiently reliable predictions of these flavour-dependent properties [21,22,23] and, indeed, this is something one would like to test against the data. In other studies, another method is used to tag the jet flavour, for example the hard process dependence [24, 25], and used to calibrate the measurement of the proposed observable. Here, one would worry that the two tagging methods are correlated, yielding a biased measurement of the jet property.

In this paper, we study a variety of angularity observables as measures of quark/gluon jet differences. The main new ingredient we propose is the calibration of those differences using the dependence on the centre-of-mass energy of the Large Hadron Collider. The idea is that the properties of jets of a given flavour and transverse momentum, if suitably defined, are almost entirely independent of the jet’s production mechanism, i.e. its rapidity, the energy of the collision, the colliding beam types, parton distributions, etc. [26], whereas the fraction of jets of a given flavour at fixed transverse momentum does depend on all those factors, in particular the collision energy. Those jet fractions can, however, be reliably predicted. Thus, the energy dependence can be used to extract the flavour-dependent properties on a statistical basis.

This paper is organised as follows: in Sect. 2, we present the measurement strategy; then, in Sect. 3, we discuss important systematic effects; in Sect. 4, we present measures that determine the quality of observables; in Sect. 5 we provide the main results, and finally in Sect. 6 we summarise the study.

2 Measurement strategy

The LHC has collected data at many different energies: 900 GeV, 2.36 TeV, 5.02 TeV, 7 TeV, 8 TeV and 13 TeV, and will also take data at 14 TeV. New experimental strategies that exploit this unique situation could significantly extend the research potential of the LHC. A measurement strategy based on the flexibility of the LHC to run at variable beam energies has already been successfully studied in the case of measuring the mass of the W boson [27, 28] and the difference in the mass of the \(W^+\) and \(W^-\) bosons [29]. In these publications, this flexibility was shown to be helpful in defining observables that are insensitive to ambiguities in the modelling, as well as in minimising the impact of systematic errors in the W-boson mass measurement. This was achieved through the construction of observables that included the ratio of physical quantities measured at different energies. A second example of this type of measurement that we are aware of is [30], in which the authors used both the ratios of cross sections and the ratios of cross-section ratios between different centre-of-mass energies at the LHC to study the possibilities for precise measurements and BSM sensitivity. Despite its many advantages, the idea of using LHC data collected at different energies to construct new robust observables has hardly been exploited at the LHC. The aim of this study is to change this situation and use this unique opportunity to construct new observables that are sensitive to the differences between quark and gluon jets.

The Tevatron collider also ran at different energies: 630 GeV and 1.8 TeV. The Tevatron experiments, CDF and D0, exploited this to some extent for parton distribution function measurements, but to our knowledge, only one analysis of final-state jet properties that combined measurements at the two energies was published [31, 32]. This followed essentially the same method we will apply to LHC events below, to extract the distribution of subjets within quark and gluon jets.

In leading-order QCD, the fraction of final-state jets that are of gluon origin increases with decreasing

$$\begin{aligned} x\sim ~p_T/\sqrt{s}, \end{aligned}$$
(1)

where \(p_T\) is the transverse momentum of a jet, \(\sqrt{s}\) is the proton–proton collision energy, and x is the momentum fraction of the initial-state partons within the proton. This is mainly due to the x dependence of the parton distribution function (PDF). For fixed \(p_T,\) the gluon jet fraction therefore increases when \(\sqrt{s}\) is increased. This suggests an experimentally accessible way to define jet samples with different mixtures of quarks and gluons by varying \(\sqrt{s}.\) The main advantage of this approach is that the construction of an observable at different energies allows a single set of experimental cuts to be used to select jets, keeping all detector parameters unchanged, and, in this way, reducing many systematic errors. Let us provide one example of how the measurement can be biased when we use different selection criteria for quark and gluon samples [24, 33, 34]. Based on the colour factor we could naively expect the “quark” jets to be much narrower than the “gluon” jets. However, the Monte Carlo simulations show that a high \(p_T\) quark jet is narrower than a low \(p_T\) jet, biasing the entire measurement if we define quark- and gluon-enriched samples using different \(p_T\) ranges. At a more subtle level, even for jets at a given value of \(p_T\) and rapidity, it might be thought that a cut on the rapidity of the recoiling jet in the dijet pair could be used to vary the quark to gluon mix (the so-called “same-side opposite-side” method). However, as shown in [35], colour-coherence effects in the hard process mean that the properties of a quark or gluon jet of fixed kinematics (specifically, the amount of soft radiation into it) respond to the rapidity of the other jet and only in an inclusive sample are the jet properties of a given flavour independent of the collider energy, type, etc. The strategy we present here, by construction, will be almost free from such bias.
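As a rough numerical illustration of Eq. 1, the following sketch evaluates \(x\sim p_T/\sqrt{s}\) for a fixed jet \(p_T\) at the four collision energies studied in this paper. The function name `momentum_fraction` and the example value \(p_T = 100\) GeV are illustrative assumptions, and the estimate is only the leading-order scaling, not a full kinematic calculation.

```python
# Order-of-magnitude estimate x ~ pT / sqrt(s) from Eq. 1.
# The function name and the 100 GeV example value are illustrative assumptions.

def momentum_fraction(pt_gev, sqrt_s_gev):
    """Return the rough parton momentum fraction probed by a jet of given pT."""
    if sqrt_s_gev <= 0:
        raise ValueError("sqrt(s) must be positive")
    return pt_gev / sqrt_s_gev

ENERGIES_GEV = [900.0, 2360.0, 7000.0, 13000.0]
X_VALUES = {energy: momentum_fraction(100.0, energy) for energy in ENERGIES_GEV}

for energy, x in X_VALUES.items():
    print(f"sqrt(s) = {energy:7.0f} GeV -> x ~ {x:.4f}")
```

The probed x falls by more than an order of magnitude between 900 GeV and 13 TeV, which is the lever arm that changes the quark/gluon mixture at fixed \(p_T\).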

There are many ways to define quark and gluon-jet discrimination observables [9,10,11,12,13,14,15,16,17,18]; however, we follow [21] and use five generalised angularities \(\lambda ^{\kappa }_{\beta }\) [36]:

$$\begin{aligned} \begin{array}{cccccc} (\kappa ,\beta )&{}(0,0) &{} (2,0) &{} (1,0.5) &{} (1,1) &{} (1,2) \\ \lambda ^{\kappa }_{\beta }: &{} \text {multiplicity} &{} p_T^D &{} \text {LHA} &{} \text {width} &{} \text {mass}. \end{array} \end{aligned}$$

Here, multiplicity is the hadron multiplicity within the jet, \(p_T^D\) was defined in [37, 38] (see footnote 1), LHA refers to the “Les Houches Angularity” (named after the workshop venue where this study was initiated [39]), width is closely related to jet broadening [40,41,42], and mass is closely related to jet thrust [43]. In general, an angularity is defined as \(\lambda ^{\kappa }_{\beta } = \sum _{i \in \text {jet}} z_i^\kappa \theta _i^\beta ,\) where i runs over the constituent particles of the jet, \(z_i \equiv \frac{p_{Ti}}{\sum _{j \in \text {jet}} p_{Tj}} \in [0,1]\) is a transverse momentum fraction, and \(\theta _i \equiv \frac{R_{i \hat{n}}}{R} \in [0,1]\) is a normalised angle, where \(R_{i \hat{n}}\) is the rapidity-azimuth distance to the jet axis and R is the jet-radius parameter.
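The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code: the input format (a list of \((p_T, y, \phi)\) tuples), the function name `angularity` and the toy jet are assumptions made for the example, and the jet axis is supplied by hand rather than taken from a jet algorithm.

```python
import math

def angularity(constituents, jet_axis, radius, kappa, beta):
    """Generalised angularity: sum_i z_i^kappa * theta_i^beta.

    constituents: list of (pt, rapidity, phi) tuples (assumed input format).
    jet_axis:     (rapidity, phi) of the jet axis.
    radius:       jet-radius parameter R.
    """
    total_pt = sum(pt for pt, _, _ in constituents)
    if total_pt <= 0:
        raise ValueError("jet has no transverse momentum")
    axis_y, axis_phi = jet_axis
    value = 0.0
    for pt, y, phi in constituents:
        # Wrap the azimuthal difference into [-pi, pi).
        dphi = (phi - axis_phi + math.pi) % (2.0 * math.pi) - math.pi
        theta = math.hypot(y - axis_y, dphi) / radius
        z = pt / total_pt
        # Python evaluates x ** 0 as 1.0, so (kappa, beta) = (0, 0) counts particles.
        value += z ** kappa * theta ** beta
    return value

# The five (kappa, beta) choices listed in the table above.
ANGULARITIES = {
    "multiplicity": (0, 0),
    "pTD": (2, 0),
    "LHA": (1, 0.5),
    "width": (1, 1),
    "mass": (1, 2),
}

# Toy jet with two constituents (made-up numbers, for illustration only).
toy_jet = [(30.0, 0.10, 0.00), (10.0, -0.10, 0.20)]
toy_axis = (0.05, 0.05)
toy_values = {name: angularity(toy_jet, toy_axis, 0.4, k, b)
              for name, (k, b) in ANGULARITIES.items()}
```

For the toy jet the momentum fractions are 0.75 and 0.25, so the multiplicity is 2 and \(p_T^D\) evaluates to \(0.75^2 + 0.25^2 = 0.625\).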

Let \(\lambda \) denote an angularity in a jet from a mixed sample of quark and gluon jets and \(\lambda _i\) denote the value of one bin of a normalised histogram of \(\lambda ,\) i.e. \(\lambda _i\) is the value of \(\frac{1}{n}\,\frac{{\textrm{d}}n}{{\textrm{d}}\lambda }\) averaged over the ith bin, where n is the number of jets. We can express \(\lambda _i\) as a linear combination of the angularity distributions in gluon jets, \(\lambda _{gi},\) and quark jets, \(\lambda _{qi}\):

$$\begin{aligned} \lambda _i = f\lambda _{gi}+(1-f)\lambda _{qi}. \end{aligned}$$
(2)

The coefficients are the fractions of gluon jets, f, and quark jets, \((1-f),\) in the mixed sample. Suppose that we measure two similar samples of jets at two different energies \(\sqrt{s_1}\) and \(\sqrt{s_2},\) and assume that \(\lambda _{gi}\) and \(\lambda _{qi}\) are independent of \(\sqrt{s}\) (we will return to this assumption later). We then obtain:

$$\begin{aligned} \lambda _{qi}=\frac{f^{s_1}\lambda ^{s_2}_i-f^{s_2}\lambda ^{s_1}_i}{f^{s_1}-f^{s_2}} \end{aligned}$$
(3)

and

$$\begin{aligned} \lambda _{gi}=\frac{(1-f^{s_2})\lambda ^{s_1}_i-(1-f^{s_1})\lambda ^{s_2}_i}{f^{s_1}-f^{s_2}}, \end{aligned}$$
(4)

where \(\lambda ^{s_1}_i\) and \(\lambda ^{s_2}_i\) are experimental measurements in mixed jet samples at \(\sqrt{s_1}\) and \(\sqrt{s_2},\) and \(f^{s_1}\) and \(f^{s_2}\) are the fractions of gluon jets in the two samples. The gluon fractions are provided by Monte Carlo simulation.
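Equations 3 and 4 amount to inverting a two-by-two linear system in every bin. The sketch below implements them directly and performs a closure check on made-up quark and gluon templates; the function name `extract_quark_gluon`, the templates and the gluon fractions 0.7 and 0.3 are illustrative assumptions, not values from this study.

```python
def extract_quark_gluon(hist_s1, hist_s2, f_s1, f_s2):
    """Solve Eq. 2 at two energies for the quark and gluon bin contents.

    hist_s1, hist_s2: normalised histograms (lists of bin contents) at the two energies.
    f_s1, f_s2:       gluon fractions of the two samples.
    Returns (quark_hist, gluon_hist) following Eqs. 3 and 4.
    """
    if len(hist_s1) != len(hist_s2):
        raise ValueError("histograms must share the same binning")
    denom = f_s1 - f_s2
    if abs(denom) < 1e-12:
        raise ValueError("gluon fractions must differ between the two samples")
    quark = [(f_s1 * b2 - f_s2 * b1) / denom
             for b1, b2 in zip(hist_s1, hist_s2)]
    gluon = [((1.0 - f_s2) * b1 - (1.0 - f_s1) * b2) / denom
             for b1, b2 in zip(hist_s1, hist_s2)]
    return quark, gluon

# Closure check with made-up templates (illustration only).
true_quark = [0.5, 0.3, 0.2]
true_gluon = [0.2, 0.3, 0.5]

def mix(f):
    """Build a mixed histogram with gluon fraction f, as in Eq. 2."""
    return [f * g + (1.0 - f) * q for q, g in zip(true_quark, true_gluon)]

quark_out, gluon_out = extract_quark_gluon(mix(0.7), mix(0.3), 0.7, 0.3)
```

Because the denominator is \(f^{s_1}-f^{s_2},\) the extraction becomes unstable when the two samples have similar gluon fractions, which is why widely separated collision energies are advantageous.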

2.1 Event selection

To prepare the most efficient measurement strategy, we study the production of dijets at the LHC at different energies: 900 GeV, 2.36 TeV, 7 TeV and 13 TeV. The samples were generated using two different Monte Carlo generators: Herwig 7.2.2 [44, 45] with its default settings and the PDF set MMHT2014lo68cl [46], and Pythia 8.240 [47, 48] with its default settings and the PDF set NNPDF2.3 QCD+QED LO [49]. The jets were reconstructed using the anti-\(k_{T}\) algorithm [50] implemented in the FastJet package [51, 52]. We require exactly two jets that satisfy the following criteria:

$$\begin{aligned} p_{T~{\textrm{sublead}}} / p_{T~{\textrm{lead}}} > 0.8 \end{aligned}$$
(5)

and

$$\begin{aligned} (p_{T~{\textrm{lead}}}+p_{T~{\textrm{sublead}}})/2> p_T^{\textrm{cut}}, \end{aligned}$$
(6)

where \(p_{T~{\textrm{lead}}}\) is the transverse momentum of the leading jet and \(p_{T~{\textrm{sublead}}}\) is the transverse momentum of the subleading jet. We investigated five different jet radii, \(R = 0.2, 0.4, 0.6, 0.8, 1.0,\) and four different transverse momentum cuts, \(p_T^{\textrm{cut}} = 50,\) 100, 200 and 400 GeV. In addition to directly measuring the angularities, we also want to test the impact of jet grooming (see e.g. [53,54,55,56]). As one grooming example, we use the modified mass drop tagger (MMDT) with \(\mu = 1\) [53, 57] (equivalently, soft drop declustering with \(\beta = 0\) [58]) and \(z_{\textrm{cut}} = 0.1.\)
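The selection of Eqs. 5 and 6 can be written as a simple predicate on the jet transverse momenta of an event. The sketch below assumes that "exactly two jets" refers to the number of reconstructed jets in the event; the function name `passes_dijet_selection` and the input format are hypothetical.

```python
def passes_dijet_selection(jet_pts, pt_cut):
    """Apply the dijet selection of Eqs. 5 and 6.

    jet_pts: transverse momenta (GeV) of all reconstructed jets in the event.
    pt_cut:  threshold on the average pT of the two jets (GeV).
    """
    if len(jet_pts) != 2:          # exactly two jets are required (assumed reading)
        return False
    lead, sublead = max(jet_pts), min(jet_pts)
    if lead <= 0:
        return False
    balanced = sublead / lead > 0.8                 # Eq. 5
    hard_enough = 0.5 * (lead + sublead) > pt_cut   # Eq. 6
    return balanced and hard_enough
```

For example, an event with jets of 120 and 110 GeV passes the \(p_T^{\textrm{cut}} = 100\) GeV selection, whereas an event with jets of 120 and 80 GeV fails the balance requirement of Eq. 5.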

2.2 Example of deriving q/g multiplicity \(\lambda _{0}^{0}\) \((R = 0.4,\) \(p_T^{\textrm{cut}} = 100 \) GeV) using \(\sqrt{s}=\) 900–13000 GeV

To determine the multiplicity of charged particles in gluon and quark jets we perform the following steps:

Fig. 1

Left panel: Gluon fractions obtained from the Herwig simulation of the proton–proton dijet process without hadronization and parton showering at \(\sqrt{s}= 900\) GeV \(f^{900}\) (blue solid line) and \(\sqrt{s}= 13000\) GeV \(f^{13000}\) (red solid line). Dashed lines show the chosen values \(f^{900}\) and \(f^{13000}\) for the point at the mean of the jet \(p_{T}\) distributions. Right panel: Normalised transverse momentum of the leading and subleading jets at energy 900 and 13000 GeV. Dashed lines represent the mean of the distributions used to evaluate the coefficients of the gluon fraction

2.3 Step 1: Derive gluon fraction – parton level simulation (i.e. without hadronization and parton shower)

With hadronization and parton showering disabled in the Monte Carlo generators Herwig 7 and Pythia 8, the gluon fraction is defined as a function of \(p_{T}\):

$$\begin{aligned} f(p_{T}) = \frac{N_{\textrm{gluons}}(p_{T})}{N_{\textrm{gluons}}(p_{T}) + N_{\textrm{quarks}}(p_{T})} \end{aligned}$$
(7)

where N represents the number of partons (quarks or gluons). In the left panel of Fig. 1 we show examples of gluon fractions as a function of transverse momentum, \(f(p_{T}),\) at \(\sqrt{s}=900\) GeV and 13000 GeV obtained with Herwig (solid lines). We have also tried a more sophisticated approach in which the gluon fraction is mapped in two dimensions, \(f(p_{T}, \eta ),\) as a function of the jet \(p_{T}\) and pseudorapidity \(\eta .\) However, no significant differences were observed in the resulting quark and gluon angularities. Therefore, for simplicity, the strategy of evaluating the gluon fraction at the mean of the jet \(p_{T}\) distribution is used to obtain numerical results in the following sections.
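A binned version of Eq. 7 can be sketched as follows. The input format (pairs of parton \(p_T\) and particle identifier), the function name `gluon_fraction` and the toy event record are assumptions made for illustration; the identifiers follow the PDG Monte Carlo numbering scheme, in which the gluon is 21 and the quarks are 1 to 6, with negative values for antiquarks.

```python
from bisect import bisect_right

GLUON_ID = 21                      # PDG Monte Carlo numbering scheme
QUARK_IDS = {1, 2, 3, 4, 5, 6}     # d, u, s, c, b, t (antiquarks carry a minus sign)

def gluon_fraction(partons, bin_edges):
    """Return f(pT) per bin, following Eq. 7.

    partons:   iterable of (pt, pdg_id) pairs for outgoing partons (assumed input format).
    bin_edges: increasing list of pT bin edges in GeV.
    Bins with no entries yield None.
    """
    n_bins = len(bin_edges) - 1
    n_gluons = [0] * n_bins
    n_quarks = [0] * n_bins
    for pt, pdg_id in partons:
        index = bisect_right(bin_edges, pt) - 1
        if index < 0 or index >= n_bins:
            continue               # outside the histogram range
        if pdg_id == GLUON_ID:
            n_gluons[index] += 1
        elif abs(pdg_id) in QUARK_IDS:
            n_quarks[index] += 1
    return [g / (g + q) if (g + q) > 0 else None
            for g, q in zip(n_gluons, n_quarks)]

# Made-up parton record, for illustration only.
toy_partons = [(60.0, 21), (70.0, 2), (80.0, -1), (90.0, 21), (150.0, 21), (160.0, 1)]
toy_fractions = gluon_fraction(toy_partons, [50.0, 100.0, 200.0, 400.0])
```

The same counting extends to the two-dimensional map \(f(p_{T}, \eta )\) by binning in both variables.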

2.4 Step 2: Evaluate the scaling coefficients \(f^{900}\) and \(f^{13000}\)

In the right panel of Fig. 1 we show the \(p_{T}\) distributions of jets \((R = 0.4)\) that passed the event selection cuts defined in Sect. 2.1 obtained by running a complete Monte Carlo simulation (including hadronization and parton shower) at two different collision energies 900 and 13000 GeV. The transverse momentum mean \(\langle p_{T} \rangle \) of the jet distribution for the two energies is as follows:

  • jet \(p_{T}\) \((\sqrt{s}=900\) GeV) \(\rightarrow \) \(\langle p_{T} \rangle = 114.57\) GeV,

  • jet \(p_{T}\) \((\sqrt{s}=13\) TeV) \(\rightarrow \langle p_{T} \rangle = 125.63\) GeV.

The scaling coefficients \(f^{900}\) and \(f^{13000},\) as illustrated by the dashed lines in Fig. 1, are obtained using the gluon fractions of the left panel at the \(\langle p_{T} \rangle \) derived from the right panel of Fig. 1:

$$\begin{aligned} f^{900}= & {} f^{900}(\langle p_{T} \rangle ) = f^{900}(114.57~{\textrm{GeV}})=0.33, \end{aligned}$$
(8)
$$\begin{aligned} \!f^{13000}= & {} f^{13000}(\langle p_{T} \rangle ) = f^{13000}(125.63~{\textrm{GeV}})=0.73. \end{aligned}$$
(9)
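Step 2 can be sketched as a mean followed by a histogram lookup. The text does not specify how the gluon-fraction curve is read off at \(\langle p_{T} \rangle \) (bin lookup or interpolation), so the bin lookup used here is an assumption, as are the function names and all numbers in the toy input.

```python
from bisect import bisect_right

def mean_pt(jet_pts):
    """Arithmetic mean of the selected-jet pT values."""
    if not jet_pts:
        raise ValueError("no jets selected")
    return sum(jet_pts) / len(jet_pts)

def fraction_at(pt, bin_edges, fractions):
    """Look up the binned gluon fraction f(pT) in the bin containing pt."""
    index = bisect_right(bin_edges, pt) - 1
    if index < 0 or index >= len(fractions):
        raise ValueError("pT outside the range of the gluon-fraction histogram")
    return fractions[index]

# Made-up binned gluon fractions and jet pT values, for illustration only.
edges = [100.0, 110.0, 120.0, 130.0]
toy_f = [0.35, 0.33, 0.31]
toy_jets = [104.0, 112.0, 118.0, 124.0]

avg = mean_pt(toy_jets)                      # mean of the toy jet pT values
f_scaling = fraction_at(avg, edges, toy_f)   # gluon fraction in the bin containing avg
```

In this toy input the mean is 114.5 GeV, which falls in the 110–120 GeV bin.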

2.5 Step 3: Derive q/g angularities

Since jet angularities require the simulation of Monte Carlo events at the hadron level, we store them while generating the events needed to obtain the mean transverse momentum in Step 2. Then the jet angularities are normalised to the number of jets (entries of the distribution). An example of jet angularity \(\lambda _{0}^{0}\) (multiplicity) at \(\sqrt{s}=\) 900 (green dashed line) and 13000 GeV (black solid line) is shown in Fig. 2. Having all ingredients \(f^{900},\) \(f^{13000},\) \(\lambda _{0i}^{0}(900~{\textrm{GeV}}),\) and \(\lambda _{0i}^{0}(13000~{\textrm{GeV}})\) we are able, using Eqs. 3 and 4, to derive quark and gluon multiplicities \(\lambda _{0}^{0}\):

$$\begin{aligned} \lambda _{qi}&=\frac{f^{s_1}\lambda _i^{s_2}-f^{s_2}\lambda _i^{s_1}}{f^{s_1}-f^{s_2}} =\frac{f^{13000}\lambda _i^{900}-f^{900}\lambda _i^{13000}}{f^{13000}-f^{900}} \nonumber \\&=\frac{0.73\lambda _i^{900}-0.33\lambda _i^{13000}}{0.73-0.33} =1.83\lambda _i^{900}-0.83\lambda _i^{13000}, \end{aligned}$$
(10)
$$\begin{aligned} \lambda _{gi}&=\frac{(1-f^{s_2})\lambda _i^{s_1}-(1-f^{s_1})\lambda _i^{s_2}}{f^{s_1}-f^{s_2}} \nonumber \\&=\frac{(1-f^{900})\lambda _i^{13000}-(1-f^{13000})\lambda _i^{900}}{f^{13000}-f^{900}} \nonumber \\&=\frac{(1-0.33)\lambda _i^{13000}-(1-0.73)\lambda _i^{900}}{0.73-0.33} \nonumber \\&=\frac{(0.67)\lambda _i^{13000}-(0.27)\lambda _i^{900}}{0.73-0.33} \nonumber \\&=1.68\lambda _i^{13000}-0.68\lambda _i^{900}. \end{aligned}$$
(11)

Figure 2 illustrates that the resulting \(\lambda _{qi}\) (red line) and \(\lambda _{gi}\) (blue line) are linear combinations of the jet angularities measured at different energies.
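The numerical coefficients in Eqs. 10 and 11 can be cross-checked with a few lines of arithmetic. The helper name `unmixing_coefficients` is hypothetical; the inputs are the gluon fractions of Eqs. 8 and 9 with \(s_1\) taken as 13000 GeV and \(s_2\) as 900 GeV.

```python
def unmixing_coefficients(f_s1, f_s2):
    """Coefficients multiplying (lambda^{s1}, lambda^{s2}) in Eqs. 3 and 4."""
    denom = f_s1 - f_s2
    if denom == 0:
        raise ValueError("the two gluon fractions must differ")
    quark = (-f_s2 / denom, f_s1 / denom)
    gluon = ((1.0 - f_s2) / denom, -(1.0 - f_s1) / denom)
    return quark, gluon

# s1 = 13000 GeV, s2 = 900 GeV, with the fractions from Eqs. 8 and 9.
quark_coeffs, gluon_coeffs = unmixing_coefficients(0.73, 0.33)
```

The unrounded values are \(-0.825\) and \(1.825\) for the quark combination and \(1.675\) and \(-0.675\) for the gluon combination, which Eqs. 10 and 11 quote to two decimal places. Each pair sums to one, so the derived distributions stay normalised.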

3 Robustness of observables to systematic effects

Fig. 2

Derived distributions of quark and gluon angularity (multiplicity) \(\lambda _q\) (red line) and \(\lambda _g\) (blue line) as linear combinations of those measured at different energies (green and black lines)

Fig. 3

Quark and gluon multiplicities \(\lambda _{0}^{0}\) \((R = 0.4,\) \(p_T^{\textrm{cut}} = 100 \) GeV) for all six energy combinations (above) and averaged plot showing the envelopes of the different energy combinations as filled areas and their statistical uncertainties as ticks (below)

Fig. 4

Classifier separation \(\varDelta _{[q,g]}\) as a function of the jet radius (left panel) and as a function of \(p_{T}^{\textrm{cut}}\) (right)

An important question to consider is whether the q/g angularities obtained in our study are robust to the impact of multiparton interactions (MPI) and initial state radiation (ISR). To address this, we performed supplementary Monte Carlo simulations for each observable examined, in which MPI and ISR were switched off. Another crucial aspect to consider is the extent to which the q/g angularities remain independent of the energy. To test the energy independence, we use the six possible energy combinations (900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV) to derive angularities in the same way as described in the example of Sect. 2.2. There are therefore 24 distributions in total (12 for quark and 12 for gluon jet angularities) that we need to analyse per plot. In the top panel of Fig. 3 we present an example of such a plot showing, similarly to Fig. 2, the multiplicity \(\lambda _{0}^{0}\) \((R = 0.4,\) \(p_T^{\textrm{cut}} = 100 \) GeV) of the quark jets (red lines) and the gluon jets (blue lines). This time, the display includes angularities obtained from all energy combinations using the full simulation, which are denoted by various types of lines, while the dots represent the angularities derived with MPI and ISR turned off. To simplify this plot, the bottom panel shows the same observable, but with solid lines representing the q/g angularities averaged across the different energy combinations, the filled area depicting the envelope of the different energy combinations, and the ticks representing the envelope of the statistical uncertainties of the angularities. By comparing this graph with Fig. 2, we can gain additional insight into how robust the observables are to these important systematic effects. The averaging not only simplifies the detailed plots but also allows us to define measures (shown in the subhistogram with 7 bins), which help to rank angularities according to their performance. These measures are discussed in the next section.
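The averaging and envelope construction can be sketched as follows. We assume here that the average is a simple per-bin arithmetic mean and that the envelope is the per-bin minimum and maximum over the energy combinations; the function name `average_and_envelope` and the toy histograms are illustrative only.

```python
from itertools import combinations

ENERGIES_GEV = [900, 2360, 7000, 13000]
ENERGY_PAIRS = list(combinations(ENERGIES_GEV, 2))   # six pairs

def average_and_envelope(histograms):
    """Per-bin mean, minimum and maximum over a list of equally binned histograms."""
    if not histograms:
        raise ValueError("need at least one histogram")
    n_bins = len(histograms[0])
    if any(len(h) != n_bins for h in histograms):
        raise ValueError("all histograms must share the same binning")
    columns = list(zip(*histograms))
    mean = [sum(col) / len(col) for col in columns]
    lower = [min(col) for col in columns]
    upper = [max(col) for col in columns]
    return mean, lower, upper

# Made-up quark histograms, one per energy pair, for illustration only.
toy = [[0.50 + 0.01 * k, 0.30, 0.20 - 0.01 * k] for k in range(len(ENERGY_PAIRS))]
toy_mean, toy_lower, toy_upper = average_and_envelope(toy)
```

A narrow gap between the lower and upper lists in a given bin indicates that the derived angularity is nearly independent of the energy combination used.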

Fig. 5

Classifier separation \(\varDelta _{[q,g]}\) as a function of angularities

4 Quantifying the quark/gluon separation power

Since we will be testing many variants of observables, we need a way to quantify the quark/gluon separation power in a robust way that can be easily summarised by a single number. For example, in [36], the authors quantify discrimination performance in the context of quark/gluon jets using classifier separation \(\varDelta _{[q,g]}\) (a default output of the TMVA package [59]):

$$\begin{aligned} \varDelta _{[q,g]} =\frac{1}{2} \sum _{i=1}^{N} \frac{(\lambda _{q_{i}}-\lambda _{g_{i}})^2}{\lambda _{q_{i}}+\lambda _{g_{i}}}. \end{aligned}$$
(12)
Fig. 6

Quark and gluon averaged angularities \(\lambda _{0}^{0},\) \(R = 1.0\) with score \(\varDelta _{\textrm{comb}}=0.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 50 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV. In the subpad, the columns from left to right show Delta (1st column), DelQ NM and DelG NM (2nd and 3rd columns), \({\mathrm {NegQ~DN}}\) and \({\mathrm {NegG~DN}}\) (4th and 5th columns), DelQ UD and DelG UD (6th and 7th columns)

Fig. 7

Quark and gluon averaged angularities \(\lambda _{2}^{1},\) \(R = 0.4\) with score \(\varDelta _{\textrm{comb}}=24769.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 400 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV. In the subpad, the columns from left to right show Delta (1st column), DelQ NM and DelG NM (2nd and 3rd columns), \({\mathrm {NegQ~DN}}\) and \({\mathrm {NegG~DN}}\) (4th and 5th columns), DelQ UD and DelG UD (6th and 7th columns)

Here N denotes the number of bins, and \(\lambda _{q_{i}}\) \((\lambda _{g_{i}})\) is the i-th bin content (see footnote 2) of the probability distribution for the quark jet (gluon jet) sample as a function of the classifier \(\lambda .\) A value \(\varDelta _{[q,g]} = 0\) corresponds to no discrimination power (the distributions are exactly the same) while \(\varDelta _{[q,g]} = 1\) corresponds to perfect discrimination power. In Fig. 4 (left panel), we show the classifier separation \(\varDelta _{[q,g]}\) as a function of the jet radius for all the energy-averaged angularities studied. We see that the separation power increases with increasing jet radius. This can be intuitively understood, since larger jets contain more information about the radiation pattern, which should be different for quark and gluon jets. Similarly, in Fig. 4 (right panel) we see that the separation power decreases with increasing \(p_T^{\textrm{cut}}\). Finally, it is clear from Fig. 5 that although several individual cases of other angularities have high separation, it is multiplicity that scores the highest in most cases. According to this measure, the best observable is the multiplicity \(\lambda _0^0\) with \(R=1.0\) for \(p_T^{\textrm{cut}} = 50 \) GeV, which is shown in Fig. 6. As we can see from the figure, despite its large separation power, this observable suffers from various problems: it is not robust to MPI and ISR effects, and it is also very energy dependent, which leads to unphysical negative bins in the probability distributions. This single measure is therefore not sufficient for our problem, and we introduce additional measures to evaluate the robustness to the important systematic effects.
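Equation 12 is straightforward to evaluate on binned distributions. In the sketch below the histograms are assumed to be normalised to unit sum, and bins in which \(\lambda _{q_{i}}+\lambda _{g_{i}}\) vanishes are skipped; the latter is a choice made for this example, and the function name is hypothetical.

```python
def classifier_separation(quark_hist, gluon_hist):
    """Eq. 12: 0.5 * sum_i (q_i - g_i)^2 / (q_i + g_i).

    Inputs are bin contents of histograms normalised to unit sum.
    Bins where q_i + g_i vanishes are skipped (an assumption of this sketch).
    """
    if len(quark_hist) != len(gluon_hist):
        raise ValueError("histograms must share the same binning")
    total = 0.0
    for q, g in zip(quark_hist, gluon_hist):
        if q + g == 0:
            continue
        total += (q - g) ** 2 / (q + g)
    return 0.5 * total

identical = classifier_separation([0.5, 0.5], [0.5, 0.5])   # no discrimination
disjoint = classifier_separation([1.0, 0.0], [0.0, 1.0])    # perfect discrimination
```

The two limiting cases reproduce the values 0 and 1 quoted in the text. For derived distributions with negative bins the measure can leave the interval [0, 1], which is one reason why additional quality measures are needed.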
To check the robustness of an observable to MPI and ISR effects, we calculate the separation power between the quark (gluon) angularity obtained with and without MPI and ISR, i.e. \(\varDelta _{[q, \;q{\mathrm {~no~MPI~no~ISR}}]}\) \((\varDelta _{[g,\;g{\mathrm {~no~MPI~no~ISR}}]}).\) If these values are close to zero, then the observable is not sensitive to MPI and ISR effects. Similarly, to determine the extent to which the observable is energy independent, we calculate the separation power between the upper boundary and the lower boundary of the energy envelope of an angularity (see, for example, the energy envelope in Fig. 6). For the quark (gluon) angularity, we denote this measure by \(\varDelta _{[q(s)_{\textrm{UP}}, \;q(s)_{\textrm{DOWN}}]}\) \((\varDelta _{[g(s)_{\textrm{UP}}, \;g(s)_{\textrm{DOWN}}]});\) a value close to zero means that the observable is energy independent. Finally, to measure the extent to which the reconstructed observable is negative, we calculate the percentage of the area of the downward variation of the uncertainty band of the quark (gluon) angularity that is negative, and call it the quark (gluon) negativity. We show the distributions of all of these measures in the Appendix (including a repetition of Figs. 4 and 5 for completeness). We see that MPI and ISR affect larger-radius jets more than smaller ones (Fig. 41), as would be expected, and multiplicity much more than the other angularities (Fig. 42). Their effect becomes somewhat less important at higher \(p_T^{\textrm{cut}}\) (Fig. 43). The energy dependence shows rather similar behaviour (Figs. 47, 48 and 49), except that the smallest jet radius, 0.2, shows a strong dependence (Fig. 47). The energy dependence of the multiplicity distributions is again strong (Fig. 48). A strong energy dependence or MPI and ISR dependence also leads to a significant amount of negativity in the distributions, and the negativity tends to follow these same patterns (Figs. 44, 45 and 46).
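One possible reading of the negativity measure is sketched below. The text does not give an explicit formula, so the normalisation used here (negative area divided by the total absolute area, with equal bin widths) is an assumption, as are the function name and the toy input.

```python
def negativity_percent(bin_contents):
    """Percentage of the total absolute area carried by negative bins.

    Equal bin widths are assumed, so the widths cancel in the ratio.
    The normalisation to the absolute area is an assumption of this sketch.
    """
    total = sum(abs(b) for b in bin_contents)
    if total == 0:
        return 0.0
    negative = sum(-b for b in bin_contents if b < 0)
    return 100.0 * negative / total

# Made-up lower edge of an uncertainty band, for illustration only.
toy_down_variation = [-0.05, 0.15, 0.45, 0.35]
toy_negativity = negativity_percent(toy_down_variation)
```

A distribution without negative bins returns zero, and larger values flag observables whose derived quark or gluon distributions become unphysical.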

Fig. 8

Quark and gluon averaged angularities \(\lambda _{0}^{2},\) \(R = 0.8\) with highest score \(\varDelta _{\textrm{comb}}=37979.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 400 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

Fig. 9

Quark and gluon averaged angularities \(\lambda _{0}^{0},\) \(R = 0.2\) with score \(\varDelta _{\textrm{comb}}=36875.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 200 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

Fig. 10

Quark and gluon averaged angularities \(\lambda _{2}^{1},\) \(R = 0.8\) with score \(\varDelta _{\textrm{comb}} = 36291.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 400 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

Fig. 11

Quark and gluon averaged angularities MMDT \(\lambda _{0.5}^{1},\) \(R = 0.8\) with score \(\varDelta _{\textrm{comb}} = 36185.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 400 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

The scores of a given observable on each metric, including \(\varDelta _{[q,g]}\), are shown as red columns in the subpad; see, for example, Fig. 6. Each column represents the percentile of a measure among the angularities in all \(p_T^{\textrm{cut}}\) regions (50, 100, 200, and 400 GeV). The higher the column, the better the performance; for example, if the first column, denoted by Delta in Fig. 6, is the highest (100%), it means that no other angularity has a higher separation power \(\varDelta _{[q,g]}.\) Similarly, if the other columns are high, it means that the corresponding measures are good, i.e. have low values. From left to right, the columns show the percentiles of \(\varDelta _{[q,g]}\) (1st column – Delta), the percentiles of \(\varDelta _{[q,~q~{\mathrm {no~MPI~no~ISR}}]}\) and \(\varDelta _{[g,~g~{\mathrm {no~MPI~no~ISR}}]}\) (2nd – DelQ NM and 3rd – DelG NM columns), the percentiles of quark and gluon negativity (4th – NegQ DN and 5th – NegG DN columns), and the percentiles of \(\varDelta _{[q(s)_{\textrm{UP}}, \;q(s)_{\textrm{DOWN}}]}\) and \(\varDelta _{[g(s)_{\textrm{UP}}, \;g(s)_{\textrm{DOWN}}]}\) (6th – DelQ UD and 7th – DelG UD columns). From the subpanel of Fig. 6, we can read that the multiplicity \(\lambda _0^0\) with \(R=1.0\) for \(p_T^{\textrm{cut}} = 50 \) GeV has the best separation power (1st column), but the fact that all other columns (2–7) are very low shows that it is amongst the worst observables on these metrics. On the other hand, there are also observables that are very robust to all systematic effects but have minimal separation power; see, for example, Fig. 7. Therefore, in the next section, we provide a selection of plots that combine strong robustness to systematic effects with high separation power.
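The conversion of raw measures into percentile columns can be sketched as below. The exact percentile convention is not specified in the text; here we assume a rank-based score in which the worst observable receives 0 and the best receives 100, with the direction reversed for measures where low values are good. The function name and all input values are hypothetical.

```python
def percentile_scores(values, higher_is_better):
    """Map raw measure values to percentile scores between 0 and 100.

    The worst observable gets 0 and the best gets 100; tied values receive
    the same score (a convention chosen for this sketch).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observables to rank")
    scores = []
    for v in values:
        if higher_is_better:
            n_worse = sum(1 for other in values if other < v)
        else:
            n_worse = sum(1 for other in values if other > v)
        scores.append(100.0 * n_worse / (n - 1))
    return scores

# Made-up raw values for four hypothetical observables.
delta_raw = [0.10, 0.40, 0.25, 0.05]          # separation: larger is better
energy_dep_raw = [0.02, 0.30, 0.01, 0.05]     # energy dependence: smaller is better

delta_pct = percentile_scores(delta_raw, higher_is_better=True)
energy_pct = percentile_scores(energy_dep_raw, higher_is_better=False)
```

In this toy input the second observable has the best separation but the worst energy dependence, mirroring the situation described for Fig. 6.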

5 Results

In studying these different observables, we have generated a huge number of distributions, since we consider all combinations of:

  • 5 – angularities \(\lambda _0^0,\) \(\lambda _{0.5}^1,\) \(\lambda _1^1,\) \(\lambda _0^2,\) \(\lambda _2^1\)

  • 2 – using groomed (MMDT)/not groomed jets

  • 5 – jet radii \(R = 0.2, 0.4, 0.6, 0.8, 1.0\)

  • 4 – dijet average \(p_T\) regions: \(p_T^{\textrm{cut}} = 50,\) 100, 200, and 400 GeV

  • 2 – quark/gluon

  • 2 – MPI and ISR switched on/off

  • 6 – energy combinations: 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

  • 2 – event generators Herwig and Pythia.

Fig. 12

Quark and gluon averaged angularities \(\lambda _{0}^{0},\) \(R = 0.2\) with score \(\varDelta _{\textrm{comb}}=35699.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 200 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

Fig. 13

Quark and gluon averaged angularities \(\lambda _{2}^{1},\) \(R = 0.4\) with score \(\varDelta _{\textrm{comb}} = 34188.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 100 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

Fig. 14

Quark and gluon averaged angularities MMDT \(\lambda _{0.5}^{1},\) \(R = 0.4\) with score \(\varDelta _{\textrm{comb}} = 34533.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 100 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

Fig. 15

Quark and gluon averaged angularities \(\lambda _{1}^{1},\) \(R = 0.2\) with score \(\varDelta _{\textrm{comb}} = 32746.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 100 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

This results in 200 plots for each of the two generators, each containing four distributions, each with an envelope of six different energy combinations, or 9600 distributions in total. Our aim is to sort through these to find the best performing combinations. Clearly to do so, we need a quantitative quality measure.
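The counting can be verified by enumerating the combinations listed above. The variable names and string labels in this sketch are illustrative.

```python
from itertools import product

angularities = ["lambda_0^0", "lambda_0.5^1", "lambda_1^1", "lambda_0^2", "lambda_2^1"]
grooming = ["ungroomed", "MMDT"]
radii = [0.2, 0.4, 0.6, 0.8, 1.0]
pt_cuts_gev = [50, 100, 200, 400]
flavours = ["quark", "gluon"]
mpi_isr = ["on", "off"]
energy_pairs = ["900-2360", "900-7000", "900-13000",
                "2360-7000", "2360-13000", "7000-13000"]
generators = ["Herwig", "Pythia"]

# One plot per (angularity, grooming, radius, pT cut) for each generator.
plots_per_generator = len(list(product(angularities, grooming, radii, pt_cuts_gev)))
# Each plot holds quark and gluon curves, with MPI and ISR on and off.
curves_per_plot = len(list(product(flavours, mpi_isr)))
total_distributions = (plots_per_generator * curves_per_plot
                       * len(energy_pairs) * len(generators))

print(plots_per_generator, curves_per_plot, total_distributions)
```

This gives 200 plots per generator, four curves per plot and 9600 distributions overall, in agreement with the numbers quoted in the text.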

With the aim of picking the best candidates, maximising the separation power while also considering the other quality measures, we define the combined measure \(\varDelta _{\textrm{comb}}\) as:

$$\begin{aligned} \varDelta _{\textrm{comb}}&= 1000 \cdot \ln \Big [ 1 +({\textrm{Delta}})^{3}\cdot ({\mathrm {DelQ~NM}}) \nonumber \\&\cdot \,({\mathrm {DelG~NM}}) \cdot ({\mathrm {NegQ~DN}}) \cdot ({\mathrm {NegG~DN}}) \nonumber \\&\cdot \,({\mathrm {DelQ~UD}}) \cdot ({\mathrm {DelG~UD}}) \Big ] \end{aligned}$$
(13)

where the power of three enhances the weight of the separation power \(\varDelta _{[q,g]}.\) Each of the inputs to this formula is the percentile of the corresponding variable, i.e. the red bars in the inset. The addition of 1 serves only to ensure that an observable that is the worst on any single measure is given an overall score of zero, and is otherwise unimportant. If there were one observable that was the best on every criterion, it would score the maximum possible value (see footnote 3) of \(\varDelta _{\textrm{comb}}=41447.\) We focus initially on the results from Herwig and return to the comparison with Pythia in Sect. 5.1.
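Equation 13 can be evaluated directly once the seven percentiles are known. The sketch below assumes percentiles expressed on a scale from 0 to 100, which is consistent with the quoted maximum; the function and argument names are hypothetical.

```python
import math

def delta_comb(delta, del_q_nm, del_g_nm, neg_q_dn, neg_g_dn, del_q_ud, del_g_ud):
    """Combined score of Eq. 13; every argument is a percentile in [0, 100]."""
    product = (delta ** 3 * del_q_nm * del_g_nm
               * neg_q_dn * neg_g_dn * del_q_ud * del_g_ud)
    # log1p(x) evaluates ln(1 + x)
    return 1000.0 * math.log1p(product)

best_possible = delta_comb(100, 100, 100, 100, 100, 100, 100)
worst_on_one = delta_comb(100, 100, 0, 100, 100, 100, 100)
```

With all seven percentiles at 100 the product is \(10^{18},\) so the score is \(1000\ln (1+10^{18}) \approx 41447,\) while a single percentile of zero drives the score to zero.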

Figures 8, 9, 10, 11 and 16 show the best selection based on the \(\varDelta _{\textrm{comb}}\) score for each type of angularity.

Fig. 16

Quark and gluon averaged angularities \(\lambda _{1}^{1},\) \(R = 0.6\) with score \(\varDelta _{\textrm{comb}} = 35785.\) Using Herwig event generator, with \(p_T^{\textrm{cut}} = 400 \) GeV, using the average of 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

In addition to these best of each type, we have also selected four others, shown in Figs. 12, 13, 14 and 15, chosen to represent a typical range of good results, even though their \(\varDelta _{\textrm{comb}}\) scores are not the highest. They generally show a good quark/gluon jet separation, even if they suffer from lower robustness to switching off MPI and ISR, from negativity, etc.

The high-scored angularities presented in the plots provide compelling evidence supporting the assumption that quark and gluon angularities remain independent of collision energy. This conclusion is drawn from the relatively narrow envelope of the filled area, which represents angularities derived at different energy combinations. The consistency observed across these plots strongly suggests that quark and gluon angularities are not significantly affected by changes in collision energy. However, it is essential to acknowledge that this assumption may not be entirely valid for all angularities, as shown in Fig. 6 where the filled area is broader. Therefore, it is important to consider this uncertainty when interpreting the results.

5.1 Comparison with Pythia

We have rerun the preceding analysis using the Pythia event generator in place of Herwig. The results are very similar in almost all cases. As an example, we show the angularity that has the highest \(\varDelta _{\textrm{comb}}\) score in Fig. 17.

Fig. 17

Quark and gluon averaged angularities \(\lambda _{0}^{2},\) \(R = 0.8,\) with the highest score, \(\varDelta _{\textrm{comb}}=37979\) using the Herwig event generator (left) and \(\varDelta _{\textrm{comb}}=37031\) using the Pythia event generator (right), with \(p_T^{\textrm{cut}} = 400\) GeV, averaged over the 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

We also show, in Fig. 18, the example with the biggest difference between Herwig and Pythia among all those we have studied. In general, and most noticeably in Fig. 18, Pythia shows slightly more energy- and MPI/ISR-dependence than Herwig, and as a result slightly more negativity. Although this is a small effect for most of the robust observables, it could also be used to constrain the underlying models, by studying observables where it is significant.

Fig. 18

Quark and gluon averaged angularities MMDT \(\lambda _{0.5}^{1},\) \(R = 0.8,\) with score \(\varDelta _{\textrm{comb}} = 36185\) using the Herwig event generator (left) and \(\varDelta _{\textrm{comb}} = 34916\) using the Pythia event generator (right), with \(p_T^{\textrm{cut}} = 400\) GeV, averaged over the 6 energy combinations 900–2360, 900–7000, 900–13000, 2360–7000, 2360–13000, 7000–13000 GeV

6 Conclusion

We have shown that several jet angularity observables can be robust in measuring the properties of jets and successful in yielding significantly different distributions for quark and gluon jets, and, moreover, that the energy-dependence of the distributions can be used at the LHC to separate the two on a statistical basis. The method relies crucially on the assumption that the angularities for quark and gluon jets are separately independent of \(\sqrt{s},\) and we have shown this to be the case, particularly for higher \(p_T\) jets. Only for multiplicity in large-radius jets are the uncertainties too high to be useful.

Of course, the LHC will not make special runs at different energies only to conduct the proposed measurement. Therefore, we should use data already recorded, or data that will soon be measured at the LHC. During its early startup phase, CMS recorded some minimum-bias events of proton–proton collisions at energies \(\sqrt{s} = 900\) GeV and \(\sqrt{s} = 2360\) GeV. In the publication [60], they presented properties of inclusive jets and dijet events measured in these samples. However, the number of events with jet \(p_T > 8\) GeV or 10 GeV was less than 1000 or 200, at \(\sqrt{s} = 900\) GeV and \(\sqrt{s} = 2360\) GeV respectively. Due to the low statistics and the low \(p_T\) jet cut, these data samples are not optimal for using our method. For this reason, carrying out the proposed measurement at the LHC at energies of 7 and 13 TeV, where low statistics will not be an issue, would be a much better strategy. It is also worth mentioning that the ALICE experiment has measured jets at energies \(\sqrt{s} = 2360\) GeV and 5.02 TeV. Jet measurements at these energies, and in particular at \(\sqrt{s} = 5.02\) TeV where the number of jets measured is high, could serve to check the energy independence of the quark/gluon jet measurement. Moreover, ALICE recently published results [61, 62] on jet substructure, including jet angularities, using data recorded at the LHC from pp collisions at \(\sqrt{s} = 5.02\) or 13 TeV. It is possible that these data could be reanalysed to obtain the first results using the method proposed here. Another possibility could be to use CERN Open Data [63], similarly to what was done for jet topics [20].

One interesting extension of this research would be to use the recently developed IRC-safe flavoured jet algorithms [64,65,66,67,68,69]. Knowing the flavour of jets could be used to construct the fraction of jets after the evolution of the parton shower, which could help to trace the origin of the jets through hadronisation and parton showering, and finally be used to develop a Machine Learning method to optimise the q/g classification strategy. Another interesting extension would be to study different jet production processes, such as vector boson plus jet, at different collision energies. Not only is the flavour mix different in these processes, but also the colour structure. Yet another intriguing possibility would be to apply the variable collision energy samples to the above-mentioned jet topics.