1 Introduction

One of the most important and, at the same time, difficult challenges that we have to tackle in order to accurately describe fundamental particles and their interactions in high-energy collisions is posed by strong interactions. To date, the most effective way we have to address this issue relies on factorisation, i.e. on the ability to separate the long-distance, non-perturbative physics of both initial- and final-state hadrons from the hard interaction, which can be tackled using perturbative quantum field theory. Although very successful, this approach inevitably introduces complications. For instance, we have to map, in a quantitative way, the measurable degrees of freedom, e.g., the colliding protons or the final-state hadrons which are reconstructed by the experimental apparatuses, to the partonic degrees of freedom, i.e., quarks and gluons, that we use to describe hard interactions.

While it is clear that quarks and gluons are not measurable degrees of freedom, many physics analyses at the CERN Large Hadron Collider (LHC) are designed with the imprint left by a particular partonic flavour on a measurable final-state object, such as a hadron or a jet, in mind. Furthermore, the issue of jet flavour acquires particular relevance when discussing heavy quarks. In this case, one can meaningfully assign a flavour label to a jet by exploiting kinematic features of D and B meson decays, such as displaced vertices.

Naively one would be tempted to call jet flavour the net flavour of the jet after the generalised \(k_T\)-clustering [1,2,3,4,5,6], i.e., simply to compute the total number of quarks minus anti-quarks for each quark flavour. However, this procedure is not infrared and collinear (IRC) safe at next-to-next-to-leading order (NNLO), as pointed out in [7] (BSZ). The problematic configuration is depicted in Fig. 1, where we show an \({\mathcal {O}}(\alpha _s^2)\) configuration characterised by the emission of a soft gluon, which splits into a quark–anti-quark pair, \(q{\bar{q}}\). In this configuration, the jet algorithm clusters together the hard quark Q and q and so the jet flavour is determined by the soft gluon splitting, rendering the flavour assignment IRC unsafe. BSZ solved this problem by modifying the metric of the clustering algorithm so that clustering of soft pairs is favoured only if the softer parton is flavoured. This so-called flavour-\(k_T\) algorithm has been used in precision calculations [8, 9] (see also [10]).

However, the use of the flavour-\(k_T\) algorithm in experimental analyses is far from straightforward. The main complication arises from the fact that the clustering metric requires knowledge of the flavour of the objects combined at every stage of the clustering. This clashes with the experimental procedure of assigning the flavour label to the jet after clustering, rather than to the jet's constituents. Furthermore, the LHC experimental collaborations have put a lot of effort into standard jet calibration and may be somewhat reluctant to dramatically change their jet definition strategies. Although some practical solutions to these problems have been suggested in [11], comparisons of high-precision QCD calculations with LHC data nowadays typically involve unfolding corrections that bridge the gap between the theoretical and experimental jet definitions, see for instance [8] for the case of b-jets. These corrections are typically derived using Monte Carlo simulations and, while in some cases they turn out to be modest, they put us in an uneasy situation because they degrade the theoretical accuracy of the calculation.

In this study we suggest an alternative approach to jet flavour assignment, which is IRC safe through NNLO, while maintaining experimental viability. Our suggestion stems from the realisation that the problematic configuration depicted in Fig. 1 is analogous to the configuration responsible for non-global logarithms [12] in jet shape observables. In fact, as with non-global logarithms, it is possible to eliminate the infrared ambiguities of jet flavour with Soft Drop (SD) grooming [13]. There are, however, two restrictions or modifications to SD that we must implement to ensure IRC safety of jet flavour through NNLO. First, the angular exponent \(\beta \) in the grooming constraint must be greater than 0 for IRC safety of jet flavour at next-to-leading order (NLO). This restriction rules out the modified Mass Drop Tagging Groomer (mMDT) [14, 15] for use in defining IRC safe flavour. It is also known that the energy, or the transverse momentum with respect to the beam, of an mMDT-groomed jet is not IRC safe, for similar reasons [13, 16].

Secondly, the JADE clustering algorithm [17, 18] must be used with SD to properly order and groom soft emissions that can render the jet flavour ambiguous. SD or mMDT was originally introduced to groom emissions in a jet ordered in relative angle, with the Cambridge/Aachen algorithm [3,4,5]. When typical IRC safe observables are measured, like the mass, any generalised \(k_T\) algorithm can be used to order emissions in the jet and then groom with SD and produce IRC safe results upon grooming. However, we will explicitly demonstrate that there exist orderings of partons in a jet at NNLO with generalised \(k_T\) whose flavour cannot be made IRC safe through SD grooming. JADE cures this problem by always clustering soft quarks together first, which then can be eliminated from the jet by SD. We note that it has been observed that jet flavour is IRC safe through NNLO for jets clustered with the JADE algorithm [7]. However, the way that JADE is used here is distinct: a jet is defined by whatever algorithm the user desires. Then, the emissions in the identified jet are reclustered with JADE and SD groomed. We prove that the jet flavour defined by the sum of particles that remain after this grooming procedure is IRC safe through NNLO. In a companion paper [19], we exploit perturbative fragmentation functions to introduce a definition of jet flavour that is soft safe, but not collinear safe, for jets with arbitrary numbers of constituents.

In this paper, we explicitly consider jet production in \(e^+e^-\) collisions. However, our analytic calculations are performed in the collinear limit, which is universal. Therefore, our results can be applied to hadron-hadron collisions, with appropriate changes of detector coordinates.

The outline of this paper is as follows. In Sect. 2, we study jet flavour after grooming with SD and generalised \(k_T\) reclustering at NLO and NNLO and show that IRC safety fails at NNLO. In Sect. 3, we modify the SD groomer with JADE reclustering, and argue that the flavour defined by this algorithm is IRC safe through NNLO. Section 4 reports fixed-order numerical calculations obtained using [20], validating the IRC safety of the flavour algorithm. We conclude in Sect. 5. Explicit definitions of the SD grooming algorithm, the Durham \(k_T\) jet algorithm [21], and the BSZ flavour-\(k_T\) algorithm are presented in appendices.

2 Jet flavour from grooming

We start our discussion of SD flavour by considering the leading-order (LO) and NLO situations first. Through NLO, even standard jet algorithms provide an IRC safe definition of jet flavour, but it is still interesting to work through the calculation in order to understand the role of grooming. Let us consider, for definiteness, a jet initiated by a quark. For simplicity, we will assume that the jet radius R is small, so we can work in the collinear limit. We will calculate the jet flavour as defined by the sum of the partonic flavours in the more energetic jet.

At LO, relative \(\mathcal{O}(\alpha _s^0)\), there is only one particle in the jet, the initiating quark, which trivially passes the SD condition and so the jet has quark flavour. Thus, there is 0 probability that the jet to this order is gluon flavour and the jet flavour fractions to this order are

$$\begin{aligned} P_q = 1+\mathcal{O}(\alpha _s), \quad P_g = 0+\mathcal{O}(\alpha _s)\,. \end{aligned}$$
(1)

2.1 Soft drop flavour at NLO

At NLO, relative \(\mathcal{O}(\alpha _s)\), the quark can emit a real gluon through \(q \rightarrow q g\) splitting. It is easier to first determine the configurations that lead to assigning gluon flavour to the jets. This can happen for two reasons: (a) either the quark and the gluon are not recombined in the same jet (i.e., the angle \(\theta \) between them is bigger than R) and the gluon jet is more energetic (i.e., the gluon momentum fraction \(z>\frac{1}{2}\)) or (b) the two partons are recombined in the same jet but the quark fails the SD condition and it is groomed away. In the former case we have

$$\begin{aligned} P_g^{(a)}&=\frac{\alpha _s C_F}{2\pi } \int _{R^2}^1\frac{d\theta ^2}{\theta ^{2}} \int _{1/2}^1 dz\, \frac{1+(1-z)^2}{z} \nonumber \\&=\frac{\alpha _s C_F}{2\pi }\, \log R^2 \left( \frac{5}{8}-2\log 2 \right) \,. \end{aligned}$$
(2)

In the evaluation of this result, we only keep the leading terms in the \(R^2\ll 1\) limit.
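As a quick cross-check of the momentum-fraction integral in Eq. (2), the following minimal sketch integrates the collinear \(q \rightarrow qg\) kernel numerically over \(z \in [1/2, 1]\) and compares it with \(2\log 2 - 5/8\). The function names and the midpoint rule are illustrative choices, not part of the original calculation. The angular integral contributes \(-\log R^2\), which yields the overall sign shown in Eq. (2).

```python
import math

def splitting_qg(z):
    """Collinear q -> q g kernel without the colour factor; z is the gluon momentum fraction."""
    return (1.0 + (1.0 - z) ** 2) / z

def midpoint_integral(f, a, b, n=100_000):
    """Plain midpoint rule, sufficient for a smooth integrand (illustrative helper)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Gluon carries more than half of the energy: z in [1/2, 1]
numeric = midpoint_integral(splitting_qg, 0.5, 1.0)
analytic = 2.0 * math.log(2.0) - 5.0 / 8.0

print(numeric, analytic)
```

Since \(\log R^2 < 0\) for \(R \ll 1\) and \(5/8 - 2\log 2 < 0\), the resulting \(P_g^{(a)}\) is positive, as a probability should be.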

For the latter case, the quark and the gluon live in the same jet but the quark is groomed away by SD. A review of the SD grooming algorithm is presented in Appendix A.1. To fail SD, the quark’s energy fraction \(1-z\) and its splitting angle from the gluon \(\theta \) must satisfy

$$\begin{aligned} z_{\mathrm{cut}}\left( \frac{\theta ^2}{R^2} \right) ^\beta >1-z\,, \end{aligned}$$
(3)

where we have assumed that \(\theta ,R\ll 1\). Then, the probability that the quark is groomed away is

$$\begin{aligned} P_g^{(b)}= & {} \frac{\alpha _s C_F}{2\pi }\int _0^{R^2}\frac{d\theta ^2}{\theta ^{2}} \int _0^1dz\, \frac{1+(1-z)^2}{z}\,\Theta \left( z_{\mathrm{cut}}\left( \frac{\theta ^2}{R^2} \right) ^\beta \right. \nonumber \\&\quad \left. -(1-z)\right) \nonumber \\= & {} \frac{\alpha _s C_F}{2\pi }\frac{z_{\mathrm{cut}}}{\beta }, \end{aligned}$$
(4)

to leading order in \(z_{\mathrm{cut}}\ll 1\).
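The leading behaviour \(z_{\mathrm{cut}}/\beta \) of Eq. (4) can be checked numerically with a short sketch. Here the angular integral is done analytically for fixed quark energy fraction \(y = 1-z\), giving \(\log (z_{\mathrm{cut}}/y)/\beta \) for \(y < z_{\mathrm{cut}}\), and the remaining integral over y is evaluated with a midpoint rule. The function name and the discretisation are assumptions made for illustration only.

```python
import math

def groomed_quark_coefficient(zcut, beta, n=200_000):
    """Coefficient of alpha_s C_F / (2 pi) in Eq. (4), keeping the full splitting kernel."""
    h = zcut / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * h                      # y = 1 - z, the quark energy fraction
        kernel = (1.0 + y * y) / (1.0 - y)     # (1 + (1-z)^2) / z written in terms of y
        angular = math.log(zcut / y) / beta    # integral of d(theta^2)/theta^2 over the failing region
        total += kernel * angular * h
    return total

zcut, beta = 0.01, 2.0
ratio = groomed_quark_coefficient(zcut, beta) / (zcut / beta)
print(ratio)
```

For small \(z_{\mathrm{cut}}\) the ratio is close to 1, with deviations of relative order \(z_{\mathrm{cut}}\). The explicit \(1/\beta \) makes the divergence as \(\beta \rightarrow 0\) manifest.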

In total, the gluon and quark flavour fractions defined by SD grooming through \(\mathcal{O}(\alpha _s)\) are:

$$\begin{aligned} P_g&= P_g^{(a)}+ P_g^{(b)}=\frac{\alpha _s C_F}{2\pi }\, \left( \left( \frac{5}{8}-2\log 2 \right) \log R^2 + \frac{z_{\mathrm{cut}}}{\beta } \right) \nonumber \\&\quad +\mathcal{O}(\alpha _s^2)\,, \nonumber \\ P_q&= 1-P_g= 1+\frac{\alpha _s C_F}{2\pi }\left( \left( 2\log 2 - \frac{5}{8} \right) \log R^2-\frac{z_{\mathrm{cut}}}{\beta } \right) \nonumber \\&\quad +\mathcal{O}(\alpha _s^2)\, . \end{aligned}$$
(5)

This is finite for \(\beta > 0\), but is not for \(\beta = 0\). Thus, jet flavour, even at \(\mathcal{O}(\alpha _s)\), is only IRC safe for SD with \(\beta > 0\). This is not a surprise, because when used as a groomer, SD with \(\beta =0\), i.e., the mMDT groomer, is only IRC safe when the substructure of the jet is resolved.

2.2 Soft drop flavour at NNLO

At \({\mathcal {O}}(\alpha _s^2)\) there are several configurations of particles that must be considered. In fact, both \(q\rightarrow qg\rightarrow qgg\) and \(q\rightarrow qg\rightarrow q\,q'\,{\bar{q}}\,'\) splittings contribute, where in principle \(q'\,({\bar{q}}\,')\) is a (anti)quark with a different flavour with respect to the initial quark q. The problematic configuration, pointed out by BSZ and represented by Fig. 1, belongs to the second case. In this picture we labeled the hard initial quark as Q while the soft quark–anti-quark pair has been labeled as \(q\bar{q}\), and we keep this naming convention in what follows. The configuration represented in Fig. 1 spoils naive IRC safety of jet flavour because only one member of the soft quark–anti-quark pair emitted by the soft gluon ends up in the partonic content of the final jet, thereby changing its net flavour. In this section, we will discuss how default SD cures this issue, but still fails to be IRC safe through NNLO when emissions in the jet are ordered with a generalised \(k_T\) algorithm. This will motivate the introduction of a modification to SD in the next section, in which emissions are reclustered with the JADE algorithm, and we will prove that it results in IRC safe jet flavour through NNLO.

2.2.1 Elimination of soft quark ambiguities

The configuration in Fig. 1, in which the dashed oval represents the jet boundary, is essentially the same configuration of particles that gives the leading contribution to non-global logarithms (NGLs) [12]. At NNLO, however, the jet consists of only two particles, and so the implementation of SD on the jet is identical to that at NLO. The softer of the two constituents of the jet is eliminated by the groomer if it fails the SD constraint. With a finite value of \(z_{\mathrm{cut}}\) and \(\beta \), an arbitrarily soft quark q will always fail the SD constraint, and so after grooming the jet will consist exclusively of the hard quark Q.

Fig. 1 The configuration that renders the jet flavour definition infrared unsafe at NNLO: a quark Q emits an intermediate soft gluon that subsequently splits into a quark–anti-quark \(q{\bar{q}}\) pair. Only one of the gluon’s decay products, say q, is clustered with the original quark Q and so the jet flavour is determined by soft physics. Note that the dashed oval can represent either the boundary of the original jet or the effective boundary induced by SD

Thus, in the soft limit, the jet flavour would be identified as the flavour of Q, which is also the flavour of the jet from the corresponding virtual corrections. This configuration therefore has no infrared ambiguities.

Further, because of the relationship to NGLs, all-orders statements about the jet flavour from this configuration can be made. It has been proven that SD and mMDT grooming eliminate NGLs of observables like the jet mass to all orders in perturbation theory [13, 14, 22]. NGLs arise from soft particles that are sensitive to the boundary of the jet. Correspondingly, the jet flavour as defined by application of SD has no infrared divergences arising from soft emissions near the boundary of the jet. By contrast, SD is inclusive over collinear emissions at the jet center, and we will demonstrate that this feature is problematic for jet flavour.

2.2.2 Failure of IRC safety of SD with \(k_T\) clustering

In the original and most widely-studied definitions of SD grooming, emissions in the jet are re-clustered with a generalised \(k_T\) algorithm, typically the Cambridge/Aachen (C/A) algorithm [3,4,5] in which emissions are ordered by their relative angle. While this prescription does eliminate the NGL-like infrared ambiguities in jet flavour, reclustering with a \(k_T\)-like algorithm means that the emission that first passes the groomer sets an effective jet radius below which all emissions are still included in the jet. Thus, a configuration as illustrated in Fig. 1 can still exist, where now the dashed oval represents the effective groomed jet region. That is, grooming can eliminate a soft, wider-angle anti-quark from the jet, but render the jet flavour ambiguous because a soft quark passes the groomer. In this section, we will make this precise, and explicitly demonstrate that a default implementation of the SD groomer still fails to give an IRC safe jet flavour at NNLO.

For simplicity of expressions, we will restrict our analysis here to consideration of C/A clustering of emissions in the jet. Our jet of interest will initially consist of a hard quark Q and a soft quark–anti-quark pair \(q{\bar{q}}\) from intermediate gluon emission. Then, on this collection of particles we will groom with SD, necessarily assuming that the angular exponent \(\beta > 0\) to ensure IRC safety at NLO. Then, the SD constraint represented by Fig. 1 in which the anti-quark is at a wider angle than the quark to Q and the anti-quark fails the groomer while the quark passes is

$$\begin{aligned}&\Theta _\text {SD}^\text {C/A} \nonumber \\&\quad = \Theta (\theta ^2_{Q{\bar{q}}}-\theta ^2_{Qq})\Theta (\theta ^2_{q{\bar{q}}}-\theta ^2_{Qq})\nonumber \\&\qquad \Theta \left( z_q-z_{\mathrm{cut}}\left( \frac{\theta _{Qq}^2}{R^2}\right) ^\beta \right) \Theta \left( z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta -z_{{\bar{q}}}\right) ,\nonumber \\ \end{aligned}$$
(6)

where we assume that the soft quark and anti-quark energy fractions \(z_q,z_{{\bar{q}}}\ll 1\). Pairwise angles between particles are labeled; i.e., \(\theta _{Qq}\) is the angle between the hard quark Q and the soft quark q. The first two \(\Theta \) functions are the implementation of the C/A clustering, while the latter two \(\Theta \) functions are the SD groomer constraints on the soft quark and anti-quark.

To isolate the problematic, IRC-unsafe configuration, we can rescale the energy fractions in the collinear limit as

$$\begin{aligned}&z_q = x_q z_{\mathrm{cut}}\left( \frac{\theta _{Qq}^2}{R^2}\right) ^\beta \,,&z_{{\bar{q}}}= x_{{\bar{q}}} z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta \,, \end{aligned}$$
(7)

for some new quantities \(x_q,x_{{\bar{q}}}\). Assuming that the ratio of the angles with respect to the hard quark stays constant in approaching the collinear limit, \(\theta _{Qq}\lesssim \theta _{Q{\bar{q}}}\ll R\), this change of variables exposes the collinear singularity. In this limit, the matrix element \(\vert \mathcal{M}(z_q,z_{{\bar{q}}})\vert ^2\) can be thought of as essentially the triple collinear splitting function, for which we give an explicit expression in Appendix B. Under the above rescaling the matrix element and differential phase space measure \(d\Pi _3\) are only modified by an order-1 amount set by the ratio \(\theta _{Qq}/\theta _{Q{\bar{q}}}\) in the soft and collinear limit:

$$\begin{aligned}&d\Pi _3\, \vert \mathcal{M}(z_q,z_{{\bar{q}}})\vert ^2\, \Theta _\text {SD}\nonumber \\&\quad \simeq d\Pi _3\, \vert \mathcal{M}(x_q,x_{{\bar{q}}})\vert ^2\, \Theta (\theta ^2_{Q{\bar{q}}}-\theta ^2_{Qq})\Theta (\theta ^2_{q{\bar{q}}}-\theta ^2_{Qq})\nonumber \\&\qquad \Theta \left( x_q-1\right) \Theta \left( 1-x_{{\bar{q}}}\right) \,. \end{aligned}$$
(8)

Here, \(\simeq \) means equal up to order-1 factors. This change of variables decouples the energy fractions and splitting angles to leading power, and exposes the collinear divergence of the matrix element, rendering SD flavour with the Cambridge/Aachen algorithm IRC unsafe at NNLO. This IRC unsafe argument extends to SD with general \(k_T\) reclustering because the constraint on the orderings of the branches is homogeneous in the energy fraction of the soft quarks and so the rescaling of Eq. (7) does not dominantly change the branching structure.
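The statement that the rescaled constraint of Eq. (8) leaves the collinear limit unregulated can be illustrated with a small sketch. The function below encodes the four \(\Theta \) functions of Eq. (6), with the energy fractions set through Eq. (7), and evaluates them while all squared angles are shrunk by a common factor at fixed \(x_q\) and \(x_{{\bar{q}}}\). The function name, argument order, and numerical values are illustrative assumptions.

```python
def ca_sd_constraint(x_q, x_qbar, th2_Qq, th2_Qqbar, th2_qqbar,
                     zcut=0.1, beta=2.0, R=0.4):
    """Product of the Theta functions in Eq. (6), with z_q and z_qbar set via Eq. (7)."""
    cut_q = zcut * (th2_Qq / R**2) ** beta        # SD threshold at the angle of q
    cut_qbar = zcut * (th2_Qqbar / R**2) ** beta  # SD threshold at the angle of qbar
    z_q, z_qbar = x_q * cut_q, x_qbar * cut_qbar  # rescaling of Eq. (7)
    return (th2_Qqbar > th2_Qq        # C/A: qbar is at wider angle to Q than q
            and th2_qqbar > th2_Qq    # C/A: q is closer to Q than to qbar
            and z_q > cut_q           # soft quark passes SD
            and cut_qbar > z_qbar)    # soft anti-quark fails SD

# Squared angles (theta_Qq^2, theta_Qqbar^2, theta_qqbar^2), all well inside R^2
angles = (0.01, 0.04, 0.02)
results = [ca_sd_constraint(2.0, 0.5, *(s * t for t in angles))
           for s in (1.0, 1e-2, 1e-4, 1e-6)]
print(results)
```

The constraint stays satisfied however small the common angular scale becomes, so nothing in \(\Theta _\text {SD}^\text {C/A}\) cuts off the collinear region of the matrix element.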

3 Soft drop flavour with JADE reclustering

The key issue with the IRC safety of SD flavour was due to the features of \(k_T\) reclustering. As observed in [7], the \(k_T\) class of algorithms does not favour clustering two soft particles together first, if there is a hard particle around at smaller angle. However, if the soft quark–anti-quark pair were clustered together first, then SD would simply groom them away, which would produce no effect on the jet flavour as simply defined from the hard quark Q. Therefore, we will modify the SD grooming procedure to ensure that the softest pair of particles is clustered first. This can be accomplished through NNLO with the JADE algorithm [17, 18].

Our procedure for achieving an IRC safe definition of jet flavour, through at least NNLO accuracy, for an arbitrary collection of particles in a pre-defined jet is as follows. We express the procedure in phase space coordinates appropriate for jets produced in \(e^+e^-\) collisions; for jets in hadron collisions, one exchanges energies for momentum transverse to the beam and angles for longitudinal boost-invariant angles.

  1. Recluster the jet with the JADE algorithm, which has a metric \(d_{ij}\) corresponding to the pairwise mass of particles:

    $$\begin{aligned} d_{ij} = 2E_iE_j\left( 1-\cos \theta _{ij}\right) \,. \end{aligned}$$
    (9)

  2. At each stage of the clustering, require that particles i and j pass the SD grooming requirement, where:

    $$\begin{aligned} \frac{\min [E_i,E_j]}{E_i+E_j} > z_{\mathrm{cut}}\left( \frac{\theta _{ij}^2}{R^2} \right) ^{\beta }, \end{aligned}$$
    (10)

    with the initial jet radius R, angular exponent \(\beta > 0\), and energy scale parameter \(0< z_{\mathrm{cut}}< 1/2\).

  3. If the stage in the clustering passes the grooming requirement, terminate and return the sum of flavours of particles in the jet. If the grooming requirement fails, then remove the softer of the two branches, and continue to the next stage of the clustering along the harder branch.
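The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the numerical results: particles are massless and stored as plain dictionaries, flavour is a tuple of net quark counts such as \((n_u, n_d)\), the pairwise-mass metric is taken to be \(2E_iE_j(1-\cos \theta _{ij})\), and \(\theta _{ij}^2\) is approximated by \(2(1-\cos \theta _{ij})\). All function names are hypothetical.

```python
import math
from itertools import combinations

def particle(E, theta, phi, flavour):
    """Massless particle with a tuple of net flavour counts, e.g. (n_u, n_d)."""
    s = math.sin(theta)
    p = (E, E * s * math.cos(phi), E * s * math.sin(phi), E * math.cos(theta))
    return {"p": p, "flavour": tuple(flavour), "children": None}

def one_minus_cos(a, b):
    """1 - cos(theta_ab) from the three-momenta of two (pseudo)particles."""
    va, vb = a["p"][1:], b["p"][1:]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va) * sum(y * y for y in vb))
    return max(0.0, 1.0 - dot / norm)  # clamp tiny negative rounding errors

def jade_metric(a, b):
    """Pairwise-mass metric d_ij = 2 E_i E_j (1 - cos theta_ij)."""
    return 2.0 * a["p"][0] * b["p"][0] * one_minus_cos(a, b)

def merge(a, b):
    """Combine two branches: add four-momenta and net flavours."""
    return {"p": tuple(x + y for x, y in zip(a["p"], b["p"])),
            "flavour": tuple(x + y for x, y in zip(a["flavour"], b["flavour"])),
            "children": (a, b)}

def jade_tree(particles):
    """Step 1: recluster the jet constituents into a single binary tree."""
    nodes = list(particles)
    while len(nodes) > 1:
        i, j = min(combinations(range(len(nodes)), 2),
                   key=lambda ij: jade_metric(nodes[ij[0]], nodes[ij[1]]))
        merged = merge(nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]

def sd_flavour(particles, zcut=0.1, beta=2.0, R=0.5):
    """Steps 2 and 3: walk down the tree and return the groomed net flavour."""
    node = jade_tree(particles)
    while node["children"] is not None:
        a, b = node["children"]
        e_a, e_b = a["p"][0], b["p"][0]
        theta2 = 2.0 * one_minus_cos(a, b)   # theta^2 in the small-angle limit
        if min(e_a, e_b) / (e_a + e_b) > zcut * (theta2 / R**2) ** beta:
            return node["flavour"]           # branching passes: stop here
        node = a if e_a >= e_b else b        # branching fails: drop softer branch
    return node["flavour"]

u, d, dbar = (1, 0), (0, 1), (0, -1)
Q = particle(45.0, 0.0, 0.0, u)
# Fig. 1: only the soft quark of the pair lands in the jet
fig1 = sd_flavour([Q, particle(0.001, 0.30, 0.0, d)])
# Both members of the soft pair are in the jet and cluster together first
pair = sd_flavour([Q, particle(0.002, 0.30, 0.0, d), particle(0.001, 0.35, 0.2, dbar)])
# Two hard quarks pass the groomer and both flavours are kept
hard = sd_flavour([particle(30.0, 0.0, 0.0, u), particle(15.0, 0.20, 0.0, d)])
print(fig1, pair, hard)
```

In the first two examples the soft flavoured partons are groomed away and the jet carries the flavour of Q alone, while a genuinely hard second quark survives and gives a multi-flavoured jet. The exhaustive pair search scales poorly with multiplicity, which is acceptable for a sketch but not for production use.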

3.1 Argument for IRC safety through NNLO

With this new flavour algorithm, we return to the configuration of Fig. 1 and explicitly show that its contribution to the jet flavour is IRC safe. With JADE clustering for SD, this problematic configuration has phase space constraints of the form:

$$\begin{aligned} \Theta _\text {SD}^\text {JADE}&= \Theta (m^2_{Q{\bar{q}}}-m^2_{Qq}) \Theta (m^2_{q{\bar{q}}}-m^2_{Qq}) \nonumber \\&\quad \times \Theta \left( z_q-z_{\mathrm{cut}}\left( \frac{\theta _{Qq}^2}{R^2}\right) ^\beta \right) \Theta \left( z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta -z_{{\bar{q}}}\right) \,,\nonumber \\ \end{aligned}$$
(11)

where now pairwise particle invariant masses are compared in the first two \(\Theta \) functions. Under the same change of variables as Eq. (7), the mass orderings take a different form where

$$\begin{aligned}&\Theta (m^2_{Q{\bar{q}}}-m^2_{Qq}) \nonumber \\&\quad = \Theta \left( z_Qx_{{\bar{q}}} z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta \theta ^2_{Q{\bar{q}}}-z_Qx_{q} z_{\mathrm{cut}}\left( \frac{\theta _{Q q}^2}{R^2}\right) ^\beta \theta ^2_{Qq}\right) \nonumber \\&\quad =\Theta \left( x_{{\bar{q}}} \theta _{Q{\bar{q}}}^{2(\beta +1)}-x_{q} \theta _{Qq}^{2(\beta +1)}\right) \,,\end{aligned}$$
(12)
$$\begin{aligned}&\Theta (m^2_{q{\bar{q}}}-m^2_{Qq}) \nonumber \\&\quad = \Theta \left( x_q z_{\mathrm{cut}}\left( \frac{\theta _{Q q}^2}{R^2}\right) ^\beta x_{{\bar{q}}} z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta \theta ^2_{q{\bar{q}}} \right. \nonumber \\&\qquad \quad \left. -z_Qx_{q} z_{\mathrm{cut}}\left( \frac{\theta _{Q q}^2}{R^2}\right) ^\beta \theta ^2_{Qq}\right) \nonumber \\&\quad =\Theta \left( x_{{\bar{q}}} z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta \theta ^2_{q{\bar{q}}}-\theta ^2_{Qq}\right) \,. \end{aligned}$$
(13)

In writing these expressions, we are working in the collinear limit for all pairwise masses and assume that the hard quark Q takes (nearly) all of the energy, \(z_Q \rightarrow 1\). In these coordinates, the SD constraint with reclustering becomes:

$$\begin{aligned} \Theta _\text {SD}^\text {JADE}&=\Theta \left( x_{{\bar{q}}} \theta _{Q{\bar{q}}}^{2(\beta +1)}-x_{q} \theta _{Qq}^{2(\beta +1)}\right) \nonumber \\&\quad \Theta \left( x_{{\bar{q}}} z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta \theta ^2_{q{\bar{q}}}-\theta ^2_{Qq}\right) \nonumber \\&\quad \times \Theta \left( x_q-1\right) \Theta \left( 1-x_{{\bar{q}}}\right) . \end{aligned}$$
(14)

Now we see immediately that the ordering of emissions that JADE imposes regulates the divergent regions. For example, if the anti-quark \({\bar{q}}\) that fails SD becomes arbitrarily soft, \(x_{{\bar{q}}}\rightarrow 0\), the constraint that \(m_{q{\bar{q}}}^2 > m_{Qq}^2\) fails. Instead, we could take the collinear limit, where all angles \(\theta ^2\rightarrow 0\) at a similar rate. However, with \(\beta > 0\), we observe again that the constraint \(m_{q{\bar{q}}}^2 > m_{Qq}^2\) fails. Finally, we can consider a correlated soft/collinear limit such that the constraint \(m_{q{\bar{q}}}^2 > m_{Qq}^2\) is satisfied. We can isolate this limit by introducing the scaling parameter \(\lambda >0\) and requiring that

$$\begin{aligned} x_{{\bar{q}}}\rightarrow \lambda x_{{\bar{q}}}, \quad \theta ^2 \rightarrow \lambda ^{-\frac{1}{\beta }} \theta ^2, \end{aligned}$$
(15)

for any pairwise angle \(\theta ^2\). This scaling preserves the constraint that

$$\begin{aligned} x_{{\bar{q}}} z_{\mathrm{cut}}\left( \frac{\theta _{Q{\bar{q}}}^2}{R^2}\right) ^\beta \theta ^2_{q{\bar{q}}}>\theta ^2_{Qq}\,, \end{aligned}$$
(16)

by construction. However, the constraint that \(m^2_{Q{\bar{q}}}>m^2_{Qq}\) is rescaled to

$$\begin{aligned} x_{{\bar{q}}} \theta _{Q{\bar{q}}}^{2(\beta +1)}>x_{q} \theta _{Qq}^{2(\beta +1)}\qquad \rightarrow \qquad \lambda x_{{\bar{q}}} \theta _{Q{\bar{q}}}^{2(\beta +1)}>x_{q} \theta _{Qq}^{2(\beta +1)}\,, \end{aligned}$$
(17)

which is clearly violated for sufficiently small \(\lambda \). Therefore, the jet flavour defined as the sum of flavours that remain in a jet after SD with JADE reclustering is IRC safe, through NNLO.
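The power counting behind Eqs. (15) to (17) can be made explicit with a short sketch that tracks the exponent of \(\lambda \) carried by each side of the two mass-ordering constraints. The weights assigned below follow Eq. (15): \(x_{{\bar{q}}}\) carries weight 1, every squared angle carries weight \(-1/\beta \), and \(x_q\), \(z_{\mathrm{cut}}\) and R carry weight 0. The function name is an illustrative assumption.

```python
from fractions import Fraction

def lambda_weights(beta):
    """Return the (lhs, rhs) powers of lambda for Eq. (16) and Eq. (17) under Eq. (15)."""
    beta = Fraction(beta)
    w_x = Fraction(1)      # x_qbar -> lambda * x_qbar
    w_th2 = -1 / beta      # theta^2 -> lambda^(-1/beta) * theta^2
    # Eq. (16): x_qbar * zcut * (theta_Qqbar^2 / R^2)^beta * theta_qqbar^2  >  theta_Qq^2
    eq16 = (w_x + beta * w_th2 + w_th2, w_th2)
    # Eq. (17): x_qbar * theta_Qqbar^(2(beta+1))  >  x_q * theta_Qq^(2(beta+1))
    eq17 = (w_x + (beta + 1) * w_th2, (beta + 1) * w_th2)
    return eq16, eq17

for beta in (Fraction(1, 2), 1, 2):
    eq16, eq17 = lambda_weights(beta)
    print(beta, eq16[0] - eq16[1], eq17[0] - eq17[1])
```

For any \(\beta > 0\) the two sides of Eq. (16) scale identically, while the left-hand side of Eq. (17) picks up one extra power of \(\lambda \) relative to the right-hand side, so it cannot remain the larger side as \(\lambda \rightarrow 0\).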

However, we do not expect this jet flavour definition to be IRC safe at higher perturbative orders. We illustrate one configuration at next-to-next-to-next-to-leading order (NNNLO) in Fig. 2 that demonstrates the problem. The jet boundary is illustrated by the dashed oval, and the particles in the jet consist of a hard quark Q, a hard gluon g, and a soft quark q. The partner soft anti-quark \({\bar{q}}\) is not clustered into the jet. We assume that the hard quark and gluon are sufficiently collinear and have the largest pairwise mass, and are therefore de-clustered first with JADE. With \(\beta > 0\), collinear particles always pass SD, and so the soft quark q is not groomed and necessarily remains in the jet. This remains true for an arbitrarily low energy of the quark, and so this definition of jet flavour will not be IRC safe at NNNLO.

However, several modifications to SD have been proposed that may solve this IRC unsafety issue. In particular, techniques that continue to apply soft drop after the first emission passes, e.g., Refs. [23,24,25], may eliminate flavour ambiguities at higher orders when combined with reclustering. We leave a detailed study of this possibility and the necessary features of such a groomer to future work.

Fig. 2 Illustration of a configuration of particles that renders SD flavour with JADE reclustering IRC unsafe at NNNLO. The jet boundary is illustrated by the dashed oval. The hard quark Q and hard gluon g have the largest pairwise mass and pass SD. An arbitrarily soft quark q lands in the jet and is therefore never groomed away, rendering the jet flavour ambiguous

4 Numerical results

We now perform numerical tests to validate the IRC safety of jet flavour from SD with reclustering. We consider jet production in \(e^+ e^-\) collisions, with jets defined using the Durham clustering algorithm [21] with resolution parameter \(y_\text {cut}\), see Appendix A.2 for details. Two exclusive jets are found, and we determine the flavour in each jet separately. We will perform the tests with SD parameters \(\beta =2\) and \(z_{\mathrm{cut}}=0.1\). Following BSZ, we now introduce the 3-jet resolution parameter \(y_3\), i.e. the maximum value of \(y_\text {cut}\) for which the event has 3 jets. We perform the calculations within the [20] framework. A version of the BSZ algorithm for flavour identification has been implemented and tested in [26,27,28]. We modified this to include our flavour definition, based on the [29] implementation of SD grooming. The fixed order matrix elements are calculated using [30], and one loop virtual corrections are obtained from [31], relying on the [32] library. Infrared divergences between the two are regularised using the Catani–Seymour subtraction method [33, 34]. For concreteness, we perform all calculations at a center of mass energy corresponding approximately to the Z pole, \(\sqrt{s} = 91.2~\text {GeV}\), and set the renormalisation scale \(\mu _R\) to that value.

We can use \(y_3\) as a slicing parameter and write the inclusive NNLO cross section as the sum of two contributions, above and below the cut:

$$\begin{aligned} \sigma ^\text {NNLO}= \int _0^{y_3} d y_3'\, \frac{d \sigma }{d y_{3}'}+\int _{y_3}^{y_\text {max}} d y_3' \, \frac{d \sigma }{d y_{3}'}, \end{aligned}$$
(18)

where in the first contribution we would have to include the 2-loop virtual corrections, while the second has an extra emission and so can be evaluated at 1-loop. In order to establish IRC safety it is enough to study the behaviour of the NLO distribution \(\frac{d \sigma }{d y_3}\) at small \(y_3\). We will find logarithmic divergences, but if IRC safety holds these are all cancelled by the below-the-cut contribution, i.e. the first term in Eq. (18). In order for this cancellation to take place, the flavour assignment of the two contributions must coincide in the singular \(y_3\rightarrow 0\) limit. In turn, this implies that all flavour assignments that do not exist at Born level give a vanishing contribution to \(y_3 \frac{d \sigma }{d y_3}\), in the \(y_3 \rightarrow 0\) limit.

Let us first perform the test for the \({\mathcal {O}}(\alpha _{s})\) contribution, i.e., the lowest order real correction to inclusive jet production. This is shown in the left of Fig. 3. As in the case of the flavour of either plain Durham jets or BSZ jets, there are hard configurations where the quark and anti-quark are clustered together, and hence the event is identified as having two gluon flavour jets. This contribution vanishes for any flavour definition as \(y_3\rightarrow 0\) because there are no singularities associated with clustering the hard quark–anti-quark pair together. In addition, since SD can remove flavoured objects from the event, we now also have contributions where one jet is identified as a gluon jet and the other as a quark jet. This contribution also vanishes as \(y_3\rightarrow 0\) for finite \(\beta > 0\). For \(\beta =0\), however, it does not, which we additionally illustrate explicitly, consistent with the analysis of Sect. 2.1.

Fig. 3 NLO (left) and NNLO (right) contributions to the cross section as a function of the \(y_3\) jet resolution of the event, for different assignments of flavour to the two jets obtained from Durham clustering, according to the jet constituents after SD grooming with \(\beta = 2\) reclustered with JADE (solid). Two IRC-unsafe flavour definitions are also shown (dashed): on the left, jet flavour with \(\beta = 0\) SD/mMDT grooming, and on the right, \(\beta = 2\) SD grooming but with C/A reclustering

Next, we can perform the numerical test for the \({\mathcal {O}}(\alpha _{s}^2)\) contribution. The result is shown in the right hand plot in Fig. 3. The same configurations as at the first order appear. In addition, with the higher multiplicity it is now possible that jets contain multiple flavours. This includes quark–anti-quark pairs of different flavours, as well as multiple quarks without matching anti-quarks. There are configurations where both jets are like this, or one of them could still be identified as a quark. All of them have to vanish, so we here collect them into one common contribution. We again illustrate that this test fails for the last contribution if the jets are reclustered with C/A instead of JADE.

We have performed the same tests for colour singlet gluon production from Higgs decays. The explicit results can be found in Appendix C. They confirm that all channels, apart from the gluon–gluon one in this case, vanish in the soft limit. This numerically validates the results of the previous sections for all possible configurations at NNLO.

5 Conclusions

We introduced a novel definition of jet flavour that is IRC safe through NNLO. No modification to the jet clustering in the event is required, and properties of jet grooming are exploited to ensure IRC safety. However, to ensure that soft quark pairs that can render jet flavour ambiguous are always clustered together and then groomed away, we must recluster the jet with the JADE algorithm. Typically, SD grooming uses reclustering with a \(k_T\)-class algorithm, which is vital for calculability and factorization of observables measured on the groomed jet. The JADE algorithm is known to violate soft-collinear factorization, but here we demonstrate that this flaw is actually a necessary feature for a jet flavour definition that is insensitive to partonic flavour during reclustering.

Our jet flavour definition is limited to NNLO accuracy, which is sufficient for implementation with the highest fixed-order predictions available at present. However, one would desire an all-orders, IRC safe, flavour definition that could be implemented on a jet with an arbitrary number of constituents. Such a jet flavour definition could then potentially be calculated in resummed perturbation theory, or evolution equations could be derived for it. We have argued that naive extensions of the algorithm presented here fail at NNNLO and, further, that the use of JADE reclustering may present obstructions for resummation. Iterative, recursive, or dynamical Soft Drop grooming algorithms [23,24,25] may potentially address all of these theoretical issues in one fell swoop, but may be more challenging for experimental implementation. We leave a more detailed study of the vices and virtues of modifications to this SD definition of jet flavour to future work.

Ultimately, we would want to implement this flavour definition into an NNLO prediction matched to a parton shower, just as the BSZ flavour algorithm was used in [8, 9]. Both the JADE algorithm and SD grooming are natively included in [29], and so can be easily, and universally, implemented in any fixed-order numerical code. In this context, it would be important to assess the size of non-perturbative corrections, as well as the impact that IRC unsafety beyond NNLO has on this type of theoretical prediction. Non-perturbative corrections are known to be large for JADE, although our intuition tells us that they should be somewhat reduced in our case, because JADE is used only to recluster the constituents of an existing jet. Further, while we only presented expressions relevant for jets in \(e^+e^-\) collisions, the algorithms can be modified for jets in a hadron collider with simple changes of coordinates. This could then be the first realization of jet predictions on experimentally-preferred flavourless anti-\(k_T\) jets [6] with a theoretically-necessary IRC safe flavour definition. We look forward to exploring the new, flavourful frontiers that this inspires in studying QCD.