Jet flavour tagging for future colliders with fast simulation

Bedeschi, Franco; Gouskos, Loukas; Selvaggi, Michele

doi:10.1140/epjc/s10052-022-10609-1

Jet flavour tagging for future colliders with fast simulation

Regular Article - Experimental Physics
Open access
Published: 26 July 2022

Volume 82, article number 646, (2022)
Cite this article

Download PDF

You have full access to this open access article

The European Physical Journal C Aims and scope Submit manuscript

Jet flavour tagging for future colliders with fast simulation

Download PDF

Franco Bedeschi¹,
Loukas Gouskos² &
Michele Selvaggi²

4086 Accesses
5 Citations
1 Altmetric
Explore all metrics

Abstract

Jet flavour identification algorithms are of paramount importance to maximise the physics potential of future collider experiments. This work describes a novel set of tools allowing for a realistic simulation and reconstruction of particle level observables that are necessary ingredients to jet flavour identification. An algorithm for reconstructing the track parameters and covariance matrix of charged particles for an arbitrary tracking sub-detector geometries has been developed. Additional modules allowing for particle identification using time-of-flight and ionizing energy loss information have been implemented. A jet flavour identification algorithm based on a graph neural network architecture and exploiting all available particle level information has been developed. The impact of different detector design assumptions on the flavour tagging performance is assessed using the FCC-ee IDEA detector prototype.

ATLAS flavour-tagging algorithms for the LHC Run 2 pp collision dataset

Article Open access 31 July 2023

Jet tagging in the Lund plane with graph networks

Article Open access 04 March 2021

Jet classification using high-level features from anatomy of top jets

Article Open access 16 July 2024

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Precision measurements of standard model (SM) parameters are key objectives of the physics program of future lepton and hadron machines [1,2,3,4,5,6]. In particular, the measurement of the Higgs couplings to bottom (b) and charm (c) quarks, and gluons (g) [7,8,9,10,11,12,13], the Higgs self-coupling [14] and the precise characterisation of top quark properties, such as the top quark mass [15] and its electroweak couplings [16, 17] require an efficient reconstruction and identification of hadronic final states. Being able to efficiently identify the flavour of the parton that initiated the formation of a jet, known as jet flavour tagging, is therefore critical for the success of the physics program of future electroweak factories [18]. The large statistics of hadronic Z boson decays ($>10^{11}$) at future lepton and hadron machines would provide copious control samples to calibrate jet tagging algorithms in data.

Jets originating from b and c quark decays contain a b or c hadron that typically travels a macroscopic distance before decaying into lighter hadrons. Compared to b, c, up or down jets (collectively referred to as ud or light jets in what follows), strange quark (s) jets contain a larger fraction of s mesons and baryons. Gluons (g) carry a larger colour charge than quarks and thus tend to produce jets with a large particle multiplicity. Quarks have a harder fragmentation function compared to g, which results in a larger fraction of the jet momentum carried by a smaller fraction of the constituents. Jet flavour tagging algorithms aim at identifying these characteristic end-products of the fragmentation and hadronization of the initial parton.

The first b and c quark tagging algorithms were developed at LEP [19, 20] and the Tevatron [21, 22]. These algorithms typically rely on the detector capability to identify and measure charged tracks with a significant displacement ($c\tau \sim 500~(150)~\upmu {\mathrm {m}}$) from the beam axis originated from long lived B (D) meson weak decays. On the other hand, tracks from the charged hadrons produced in ud quark decays feature a small distance at closest approach to the interaction point. Therefore, in the case of b and c tagging, tracks are typically clustered to reconstruct possible secondary vertices (SVs). However, c tagging is more challenging than b tagging, due to its properties laying between the b and ud or g jets. The track multiplicity and the mass of the SV (expected to be large for heavy flavour jets), together with the presence of a non-isolated electron or muon indicating a semi-leptonic heavy flavour decay, are also used as discriminating variables in traditional heavy quark tagging algorithms. Such taggers, widely used in the early days of current LHC experiments [23,24,25,26] and future $e^+e^-$ experiments [27, 28] are implemented by directly applying a selection on a combination of the tracks and SV properties, by constructing a likelihood ratio or a multi-variate discriminant based on a set of jet-level properties.

Recently, a new generation of advanced machine learning based jet tagging algorithms has been developed [29,30,31,32], bringing more than an order of magnitude improvement in background rejection compared to the traditional approaches in heavy flavour and g tagging. Three are the primary reasons for this success. First, significant advancements in the architecture of the neural networks used, as well as new jet representations that allow to better capture the jet properties have been achieved. Second, these algorithms exploit directly low-level information, e.g., from reconstructed particles (as in the Particle-Flow algorithm [33]) or even reconstructed hits, compared to traditional methods. This allows to explore in much more depth the true potential of the detectors and the event reconstruction, and also better capture the jet properties compared to algorithms relying on jet-level observables. Moreover, the nature of each of the jet constituents, via particle identification techniques (PID), is expected to provide an additional useful handle in discriminating between different jet species. Powerful particle identification capabilities based on ionisation energy loss (via dE/dx or cluster counting), or via precise time-of-flight measurements, are expected to be highly beneficial for jet flavour tagging, in particular for s tagging where the identification of charged kaons is crucial [34,35,36]. Finally, the developments in computing, e.g., graphics processing units, and the availability of very large Monte Carlo simulated and collision data samples, were critical for the development of these advanced methods.

In this paper we present a general framework for building a jet flavour tagging at future colliders using fast detector simulation and state-of-the art machine learning techniques. A major goal of the present work has been to allow for the evaluation of the impact of specific detector design options on the jet flavour tagging performance (and in turn on the physics potential) in an efficient yet precise way. To this end, we have implemented two key additions to the official $\textsc {Delphes}$ fast simulation framework. The $\texttt {TrackCovariance}$ [37], described in Sect. 2, which allows for a simple definition of a tracker geometry, and the fast simulation and the reconstruction of the parameters and covariance matrix of charged particles tracks. The $\texttt {TimeOfFlight}$ [38] and $\texttt {ClusterCounting}$ [39], described in Sect. 3, open up the possibility to model particle identification in $\textsc {Delphes}$. Section 4 describes the input observables and the implementation of the jet flavour identification algorithm. The tagging algorithm is based on $\textsc {ParticleNet}$ [40], using state-of-the-art jet representation and a graph neural network (GNN) architecture. The performance of the algorithm is evaluated using one of the FCC-ee/CEPC baseline detector concepts, the IDEA [1, 41, 42] detector. Variations around the baseline using Higgs decays taken from a Higgsstrahlung sample at $\sqrt{s} = 240$ GeV are discussed. Finally, a discussion of the results, together with limitations of the current approach and perspectives for future work are presented in Sect. 5.

2 Fast tracking simulation

The tracking system is a major part of modern detectors for high energy physics experiments and arguably the most relevant for jet flavour tagging since it is responsible for reconstructing and identifying charged particles. The design of this system, its optimization, and the evaluation of its performance on many specific physics benchmarks is a fundamental step in the planning of future experiments. To this end, we have developed and included in $\textsc {Delphes}$, a versatile and modular framework to easily study different detector configurations, and provide for each of those a fast simulation of the tracking performance. The corresponding module is named $\texttt {TrackCovariance}$ [37]. In this section we present the general implementation of the algorithm, while technical details on the speed optimisation and randomisation can be found in Appendix A1.

While various attempts to calculate the track resolution analytically have been made (see for instance [43]), they usually make highly simplifying assumptions such as equal spacing and equal detector resolution, that make them unsuitable to use for a realistic combined tracking system. The tracking system geometry is described in terms of layers. Only two types of layer geometries are considered: cylinders coaxial with the beam axis and planar disks orthogonal to the beam axis (z-direction). Each layer can be either associated to a measurement with a given resolution or else be just included to describe passive material in the system. An accurate description of the material inside the tracking volume is important to estimate appropriately the contribution of multiple coulomb scattering to the track resolution. Several measurement geometries are allowed: axial or stereo strips and wires, and pixels.

The tracking system is located inside the solenoid magnet generating a constant field, B, directed parallel to the z-direction. With these assumptions, charged tracks follow a helix trajectory that is described with a set of five parameters: $\vec {\alpha }=(D,\, \varphi _0,\,C,\,z_0,\,\lambda )$. These parameters are defined in the point of closest approach (PCA) of the track to the z-axis; D is the signed transverse distance of the PCA from the z-axis, $\varphi _0$ is the track azimutal angle, C is the signed half curvature, $z_0$ the z-coordinate of the PCA and $\lambda $ the cotangent of the track polar angle. Given a charged particle originating at $\vec {x}$ with momentum $\vec {p}$ and charge Q, the parameters $\vec {\alpha }$ are uniquely defined, as is the associated trajectory.

During its motion the charged particle will cross some of the layers described in the geometry and is reconstructed as a track provided that it produces at least 6 hits.^{Footnote 1} At each crossing the particle will undergo small random changes of its direction due to multiple scattering and, in the case of measurement layers, a generalized coordinate, $d^*$, will be measured with an uncertainty given by the specific detector resolution. The track parameters are reconstructed from the measured coordinates by minimizing the following $\chi ^2$ with respect to the track parameters $\vec {\alpha }$. The $\chi ^2$ is defined as:

$$\begin{aligned} \chi ^2 = ~(\vec {d}-\vec {d}^*)^t\,S^{-1}(\vec {d}-\vec {d}^*), \end{aligned}$$

(2.1)

where $\vec {d}^*$ is the array of measured coordinates and $\vec {d}$ that of predicted coordinates, that can be computed from the track parameters $\vec {\alpha }$ and the geometry of each measurement layer. S is the covariance matrix of all the measurements and includes contributions from the detector resolution and from the multiple scattering. The superscript t indicates the transpose of a vector or a matrix. Assuming $\vec {d}_0$ to be the array of predicted coordinates for which the $\chi ^2$ is minimized, then for small variations of the track parameters relative to this minimum, $\delta \vec {\alpha }$, we have:

$$\begin{aligned} \vec {d} \simeq \vec {d}_0+\frac{\partial \vec {d}}{\partial \vec {\alpha }}\delta \vec {\alpha } = \vec {d}_0+A\delta \vec {\alpha }, \end{aligned}$$

(2.2)

where A is the derivative matrix. Including Eq. (2.2) in Eq. (2.1) we obtain:

$$\begin{aligned} \chi ^2 = (\vec {d}_0-\vec {d}^*+A\delta \vec {\alpha })^t\,S^{-1}(\vec {d}_0-\vec {d}^*+A\delta \vec {\alpha }). \end{aligned}$$

(2.3)

Differentiating with respect to the track parameters we obtain the track parameter covariance matrix, C:

$$\begin{aligned} C^{-1}=\frac{1}{2}\frac{\partial ^2 \chi ^2}{\partial \vec {\alpha } \partial \vec {\alpha }} = A^t\,S^{-1}A. \end{aligned}$$

(2.4)

This equation highlights the key ingredients to estimate the track covariance matrix; the derivative matrix and the covariance matrix of the measurements. The former is straightforward and can be derived for every type of measurement given the track equation. The latter requires the combination of two elements: the intrinsic detector resolution and the multiple scattering contribution, as shown in the following equation:

$$\begin{aligned} S_{ij} = \sigma _i^2\,\delta _{ij}+M_{ij}, \end{aligned}$$

(2.5)

where the indices i and j identify the measurement layers, $\sigma _i$ is the detector resolution for layer i and $M_{ij}$ is the multiple scattering contribution. The $M_{ij}$ includes contributions from all scattering layers below the smallest of the two indices, as shown in the following equation:

$$\begin{aligned} M_{ij} = \sum _{1\le k<\text {min}(i,j)} (L_i-L_k)(L_j-L_k) \theta _k^2(i,j), \end{aligned}$$

(2.6)

where $L_i$ is the distance traveled by the track to the layer i and $\theta _k(i,j)$ the standard deviation of the multiple scattering angle generated by layer k after correcting for projection factors specific for layers i and j.

Once A and S have been determined for a given track, the parameter covariance matrix can be computed analytically [44] using Eq. (2.4). The obtained resolution as a function of momentum of the track parameters ($p_{\text {T}}$, D, $z_0$, $\theta $) for two different reference detector configurations proposed for a future $e^+e^-$ collider is shown in Fig. 1. In this case, it clearly appears that the detector transparency is more important than the single point detector resolution, in particular for heavy flavor tagging task, that involve reconstructing and identifying mostly few GeV tracks.

3 Particle identification

Particle identification techniques can play a major role in the identification of the jet flavour. In particular, as will be discussed in Sect. 4, s jets contain a significant fraction of charged kaons ($K^{\pm }$) compared to u or d jets that are mostly composed of charged pions ($\pi ^{\pm }$). Given that the performance of such algorithms heavily depends on explicit detector design choices, it is crucial to be able to first simulate appropriately the detector response and then to implement such particle identification algorithms.

Two complementary particle identification techniques have been included in the $\textsc {Delphes}$ fast-simulation. The precise measurement of the time of arrival of tracks in the outermost part of the tracking volume, together with the momentum and the path length, provide an indirect measurement of the particle mass via the well known time-of-flight method. This method has been implemented in the $\texttt {TimeOfFlight}$ module [38]. The cluster counting method, dN/dx, implemented in the $\texttt {ClusterCounting}$ module [39], consists in counting the multiplicity of the primary ionization clusters produced along the track in gaseous detectors, which together with the particle momentum, can also be used to infer the particle mass. In this section we discuss the implementation of these two methods within the simulation framework.

3.1 Time-of-flight

The time-of-flight ($t_\text {flight}$) of a particle can be expressed as:

$$\begin{aligned} t_\text {flight} \equiv t_\text {F}- t_\text {V} = \frac{L}{\beta } = \frac{L \sqrt{p^2 + m^2}}{p} = \frac{L E}{\sqrt{E^2 - m^2}}, \end{aligned}$$

(3.1)

where $t_\text {F}$ is the measured time after propagation, $t_\text {V}$ is the particle time of production at vertex, L is total path length, and p, E and m are the momentum, energy and mass of the particle, respectively. Provided that the quantities L and p (or E) and $t_\text {V}$ can be measured, the measurement of $t_\text {flight}$ provides an estimate for the particle mass and thus a powerful handle for particle identification.

For charged particles the reconstructed mass is given by:

$$\begin{aligned} m_{\text {t.o.f.}} ^{(c)} = p \sqrt{\left( \frac{t_\text {flight}}{L}\right) ^2 - 1}. \end{aligned}$$

(3.2)

The initial position (and therefore L) and the particle momentum p are reconstructed by means of the inner/outer tracking system, and simulated with the procedure described in Sect. 2. The time of a particle production at vertex $t_\text {V}$ can be estimated indirectly, with the following procedure. Assuming that the beamspot has a small time ($\sigma _\text {B,t}$) and longitudinal ($\sigma _\text {B,z}$) spread compared to the precision of the timing measurement device, the time of the primary vertex can be simply taken as $t_\text {PV} =0$. However, if the particle originates from a highly displaced vertex (e.g. from $K_\text {S}$ or $\Lambda $.), assuming $t_\text {V} =0$ can lead to a severe over-estimate of $t_\text {flight}$. A more accurate estimate for the vertex time corresponds to $t_\text {V} = \frac{r_\text {V}}{\beta _\text {V}}$, where $r_\text {V}$ is the distance of the vertex to the origin and $\beta _\text {V}$ is the vertex velocity, computed from its outgoing particles. In the current study we assume we are able to reconstruct the initial time at vertex perfectly and therefore we take the initial time from Monte Carlo simulation. We define the significance (in number of standard deviations) of $K/\pi $ hypothesis separation as:

$$\begin{aligned} N_\sigma = 2 \; \frac{\mu _\pi - \mu _K}{\sigma _\pi + \sigma _K}. \end{aligned}$$

(3.3)

The time of flight distribution of charged Kaons and Pions emitted at 90$\,^\circ $ is shown in Fig. 2a, assuming a 30 ps timing resolution, which allows for an efficient 3$\sigma $ $K/\pi $ discrimination for momenta $p<3.5$ GeV. For reference, a 3 ps timing resolution leads to 3$\sigma $ separation for momenta $p<10$ GeV.

For neutral particles the mass can be reconstructed from the energy measurement provided by the calorimeters:

$$\begin{aligned} m_{\text {t.o.f.}} ^{(n)} = E \sqrt{ 1 - \left( \frac{L}{t_\text {flight}}\right) ^2}. \end{aligned}$$

(3.4)

At low momenta, where the time-of-flight method is expected to provide good identification capabilities, the calorimetric energy measurement is sub-optimal and leads to poor $m_{\text {t.o.f.}}$ resolution for neutral particles compared to charged particles. Moreover, the vertex time determination is inaccessible for neutral particles. The assumption $t_\text {PV} =0$ for all neutral particles leads to an additional uncertainty on the $m_{\text {t.o.f.}}$ estimate. As an example, the reconstructed $m_{\text {t.o.f.}}$ for $K^{\pm }$, $\pi ^{\pm }$, $K_{L}$, protons and neutrons with momenta $p=1$ GeV, where separation is close to optimal,^{Footnote 2} is shown in Fig. 2b.

3.2 Cluster counting

The cluster counting technique is expected to provide improved particle identification relative to the more commonly used dE/dx methods in large drift chambers or TPCs [45, 46]. In addition it does not require the tuning of truncated mean algorithms to suppress the large Landau tails present in the dE/dx distribution. The number of ionization clusters per unit length is obtained from a very detailed simulation program, $\textsc {Heed++}$ [47], now fully integrated into $\textsc {Garfield++}$ [48]. An array of number of ionization clusters per unit length for several values of $\beta \gamma $ is obtained from $\textsc {Garfield++}$ and used to interpolate the average cluster density. The total mean number of clusters is found by multiplying for the track length in the chamber. Finally the observed cluster number is obtained by extraction over a Poisson distribution with that mean. Four common gas options are available: pure Helium or Argon, He 90% + Isobutane 10%, Argon 50% + Ethane 50%. This library can be easily extended if needed to a larger collection of gas mixtures.

In Fig. 3a the potential for $K/\pi $ separation is shown for a He 90% + Isobutane 10% mixture over a wide range of momenta ($2<p<30$ GeV) . The combination of the cluster counting and time-of-flight techniques is displayed in Fig. 3b and shows an efficient separation of $K^{\pm }$ / $\pi ^{\pm }$ separation ($\ge 3\sigma $) for momenta $p<30$ GeV.

4 Jet flavour identification

In this section a novel jet tagging algorithm is presented. The jet flavour discrimination uses reconstructed observables at the level of the jet constituents. For simplicity, the jet flavour discriminant is built and evaluated using $e^+e^-$ collisions reconstructed with the IDEA detector concept and will thus be referred as $\textsc {ParticleNetIdea}$. While the obtained performance is specific to the clean $e^+e^-$ environment and the explicit detector specifications, the inputs and the construction of the discriminant itself are general. We first discuss the event generation and reconstruction details, then introduce the particle-level input observables and the architecture of the neural network discriminant. Finally we address the tagger performance and its robustness with respect to different detector choices.

Table 1 Set of input variables

Full size table

4.1 Simulated data

The simulated sample consists of $e^+e^- \rightarrow ZH$ events produced at a center of mass energy $\sqrt{s}={240 \hbox {GeV}}$. The Higgs bosons decay to $H \rightarrow g g$ or $H \rightarrow q \bar{q}$, where $q=(u,d), s, c, b$ with relative fraction as expected for a SM Higgs boson with $m=125$ GeV, whereas the Z bosons always decay to a pair of neutrinos. The hard scattering process is generated with $\textsc {MadGraph5}$_aMC@NLO [49], while $\textsc {Pythia8}$ [50] is used for modeling the decay, parton-shower and hadronisation processes. Five different samples, corresponding to each jet flavour category (ud, s, c, b, g) containing $10^6$ events each (or equivalently $2 \times 10^6$ jets) are used for the training. Final state particles are reconstructed with the $\textsc {Delphes}$ PF algorithm. In particular, charged particles are reconstructed using the latest $\texttt {TrackCovariance}$ module described in Sect. 2, and the time-of-flight and number of ionisation clusters per unit length ($dN/dx$), are reconstructed using the $\texttt {TimeOfFlight}$ and $\texttt {ClusterCounting}$ modules, described in Sects. 3.1 and 3.2, respectively. A charged particle is reconstructed provided that it produces at least 6 hits within the tracking volume. Neutral particles (photons and neutral hadrons) are reconstructed by the PF algorithm implemented in the $\texttt {DualReadoutCalorimeter}$ module [51]. The time-of-flight (and corresponding reconstructed mass $m_{\text {t.o.f.}}$) of neutral hadrons is also included and assumes a 100 ps resolution, as opposed to 30 ps assumed for charged particles. The baseline simulation setup assumes the nominal IDEA detector concept [41, 42]. Jets are clustered with the $\textsc {FastJet-3.3.4}$ [52] package using the $e^+e^-$ generalized $k_{\text {T}}$ algorithm [53, 54] with parameter $p=-1$ (for infrared safety) and $R=1.5$ to maximise the energy collected in the jet. This set of parameters leads to an optimal Higgs di-jet invariant mass resolution.

4.2 Input features

The jet constituents in the form of PF candidates are used as inputs to the $\textsc {ParticleNetIdea}$ algorithm. For each PF candidate we define a set of input observables (features) that are summarized in Table 1. The first set of inputs, denoted as kinematics, uses features derived from the 4-momentum of each jet constituent. These include the energy measurement of the constituent relative to the jet energy and the direction of the jet constituents relative to the jet momentum. The second set of features, labelled as displacement, includes observables related to the longitudinal and transverse displacement of the jet constituents which are more relevant to identify jets originating from the hadronization of the b and c quarks. Finally, the third set of inputs, labelled as identification, refers to the nature of each particle using the PF reconstruction and the particle identification (PID) algorithms presented in Sect. 3.

The total number of reconstructed jet constituents, shown in Fig. 4a, is typically larger for g jets compared to quark jets due to their different color factor. We note that the particle multiplicity is shown here for illustrative purposes only as it is not used directly as input to $\textsc {ParticleNetIdea}$ since it is a jet-based variable, while only particle-level observables are used. The remaining distributions of Fig. 4 correspond to particle-level observables and are calculated using the charged constituent with the largest displacement. Figure 4b displays the relative energy of the jet constituent with respect to the jet energy. Gluon jets populate lower values of this observables, indicating that the jet energy is more democratically distributed among the constituents. Figure 4b display observables relevant for b and c quark identification, such as $\text {SIP}_{\text {2D}}$ (left) and its significance $\text {SIP}_{\text {2D}}/ \sigma _{\text {2D}}$ (right) as defined in Table 1. As expected, in b jets, and to smaller extent in c jets, a significantly larger displacement is observed compared to the other jet flavours. Displaced particles can also be present in other jet flavours, e.g. from long-lived $K_{\text {S}}^{0}$ or $\Lambda $ hadrons decays, but represent a much smaller fraction.

4.3 The flavour tagging algorithm

The $\textsc {ParticleNetIdea}$ algorithm is based on the $\textsc {ParticleNet}$ jet tagging algorithm [40]. $\textsc {ParticleNet}$ uses an advanced network architecture, based on Graph Neural Networks (GNN) that first developed in the context of proton–proton collisions at the LHC. A novel jet representation was utilized in $\textsc {ParticleNet}$, where jets are represented as an un-ordered set of particles. As shown in Refs. [40, 55,56,57,58,59,60,61,62,63,64,65,66], this provides a more natural jet representation compared to alternative approaches based on jet images [67,68,69,70,71,72,73,74,75,76] or ordered lists of jet constituents [77,78,79,80,81,82,83,84,85] and translates to an improved tagging performance. A hierarchical learning approach using convolution operations [86] is adopted. Different convolutional layers are used to learn features at different scales: the shallower layers explore local neighborhood information, whereas more global structures are learned by deeper layers. The jet constituents are represented as a graph, where each node of the graph is a jet constituent, and relationships between the particles are the edges of the graph. Each node has a set of features related to constituent properties. However, the graph is not static, rather it is updated after each convolutional operation. The ultimate goal is to group jet constituents according to their proximity in the multi-dimensional space defined by the learned features.

The current $\textsc {ParticleNetIdea}$ implementation uses up to 75 constituents for each jet, sorted by the highest momentum, which typically correspond to more than 99% of the total jet momentum. The algorithm is designed to discriminate between five orthogonal jet classes: ud, s, c, b, and g jets. The training is performed using the $\textsc {Weaver}$ package [87] on 10M jets (2M per category) over 30 epochs on a NVIDIA GTX 1080Ti GPUs. The network outputs 5 real numbers $D_{i}$ ($i = g,\,\ell (ud),\, s,\,c,\,b$) between 0 and 1 (discriminants), one for each jet category. Approximately 1M jets are used to evaluate the $\textsc {ParticleNetIdea}$ performance. For every jet flavour pair (i, j), the binary discriminant is constructed as:

$$\begin{aligned} D_{i,j} = \frac{D_{i}}{D_{i} + D_{j}}, \end{aligned}$$

(4.1)

where $D_{i(j)} $ are the output scores of the classes i and j. For example, $D_{b,c} $ represents the binary discriminant for tagging b quark jets against c quark jets. The efficiency of tagging flavour i as function of the probability of mis-identifying the jet as flavour j (mistag rate) can be constructed by computing the probability of selecting jets that satisfy $D_{i,j} > \alpha $, for $\alpha \in [0,1]$. The receiver operating characteristic (ROC) curve, i.e. the mistag rate as a function of the tagging efficiency (for every $\alpha $), is used as a figure of merit for evaluating the tagger performance for every jet flavour.

4.4 Results

The nominal $\textsc {ParticleNetIdea}$ flavour tagging performance is shown in Fig. 5 for different jet flavours. The b tagging performance is shown in Fig. 5a. The most effective discrimination is observed against ud jets since these contain mostly tracks with no displacement. For high b tagging efficiency, g jet rejection is more effective than c jet (with both being less effective than u, d or s jet rejection). Conversely at small tagging efficiencies (i.e. for high tagging purity), c jet rejection becomes more effective than g jet rejection due to a sizeable probability for g to produce $b \bar{b}$ splittings. For c tagging, at high efficiencies, b jet discrimination is the most effective (due to a large difference of lifetime between B and D mesons), followed by ud and g jet rejection. For large c tagging purity (i.e. at low efficiency and high background rejection), we observe that b jet rejection becomes more challenging than ud jet rejection, which is expected since a fraction of B meson have inherently a comparable decay length to D mesons. We also observe that in this regime $g \rightarrow c \bar{c}$ splittings result into more challenging g rejection. In Fig. 5c the s tagging performance is shown. The most effective discrimination is observed against b jets followed by c jets due to the large displacement of their tracks. The mistag rate against g and ud jets is substantially larger since displacement observables are not discriminating, and the algorithm relies mainly on PID-related variables. Rejection of g jet is more effective than ud jet one since s and ud jets have similar particle multiplicities. Finally, the g tagging performance is displayed in Fig. 5d. Rejection of ud jets is the most challenging, due to similar particle displacement and nature, followed by s, c and b jet rejection.

The modularity of the framework enables the study of the algorithm performance for different detector design choices. In this work, we report two representative examples of possible detector design variations. Figure 6a shows the importance of particle identification information in discriminating s jets from other jet flavours. Exploiting PID information with the nominal $dN/dx$ and $t_\text {flight}$ resolutions yields to approximately an order of magnitude reduced ud jet mistag rate for the same s tagging efficiency. A timing detector providing an improved $t_\text {flight}$ resolution of 3 ps for charged particles, yields a small, but detectable improvement compared to the more realistic scenario of 30 ps. The performance obtained using MC truth information for PID (“ideal PID”) is also shown for reference. In that case only a marginal improvement in performance is observed, suggesting that the existing detector configurations and PID algorithms are very close to optimal. We also note that the improvement brought by neutral particle timing is limited due a much worse nominal timing resolution (100 ps) from the calorimetric timing measurement and most importantly because the contribution coming from neutral massive particles is at most 10% of the total jet energy. We also observe that the usage of PID information brings only modest improvement in other jet flavour tasks.

The distance of the first vertex detector layer to the interaction point is the most important parameter for achieving optimal transverse impact parameter resolution and hence b and c tagging performance. While the nominal IDEA vertex detector provides already an excellent resolution (three layers, with the innermost layer located at 1.5 cm), we study the impact of introducing an additional fourth layer in the pixel detector, closer to the beam pipe, located at 1 cm from the interaction point, on c jet identification. The corresponding performance is displayed in Fig. 6b. The largest improvement is observed in the discrimination against ud jets, where for the same $\epsilon _{\mathrm {S}}$, $\epsilon _{\mathrm {B}}$ is reduced by almost a factor of two. Smaller, yet important improvement, is seen in the discrimination against other jet flavours. The impact of an additional pixel layer was studied for other jet flavours treated as signal without significant improvement in the performance.

5 Conclusion and perspectives

Jet flavour tagging will be a crucial tool for maximising the physics potential at future colliders. This work builds on the design of a fast detector simulation framework, and provides an efficient way to study the impact of different detector design options to the jet flavour tagging problem. A fast tracking module was developed, which allows to easily configure a full tracking geometry including material effects and compute both the charge particle track parameters and the track covariance matrix. Two algorithms that allow for particle identification, the time-of-flight and cluster counting with respectively configurable time resolution and gas composition have also been added. The framework is designed to provide flexibility for further studies, such as the exploration of alternative clustering algorithms, beam energies and final states.

Deep learning techniques based on GNNs have proven very effective for classification problems such as jet flavour tagging and boosted jet tagging at the LHC, and have not been explored yet in the context of future experiments. This paper presents the first algorithm for jet flavour tagging at future $e^+e^-$ colliders using state-of-the-art jet representation and a GNN architecture. At such future machines where statistics for Higgs processes are moderate, flavour taggers will be required to perform well in the high tagging efficiency regime while still providing excellent background rejection. In this study we have investigated the impact of MIP timing resolution and of an additional inner tracking layer on the tagging performance. More studies are possible and should be pursued: the interplay of MIP and calorimeter timing on PID performance, the impact of the tracker design on displaced tracks performance, $K_{S}$ and $\Lambda $ reconstruction and hence on s tagging, secondary vertex reconstruction on b, c and s tagging. Another area for future studies is the calibration of the algorithm. The algorithm is designed to have very little dependence on the jet kinematics and therefore a calibration strategy relying on a Z boson sample of unprecedented statistical power expected to be obtained in $e^+e^-$ experiments seems a promising avenue.

We stress that this study has possible limitations given the inherent optimistic nature of fast simulation. In particular, this tracking simulation include a simplistic particle-matter description where multiple scattering is taken into account to derive track parameter resolutions but no secondary emissions are simulated (i.e. electron bremsstrahlung, photon conversions and hadronic interactions are neglected). A natural next step is to assess the limitation of the fast detector simulation framework by validating the results with events produced using Full Simulation. Nevertheless, the set of tools presented in this article should provide robust means for assessing an upper limit of the achievable tagging performance and the relative performance of alternative detector design choices at future $e^+e^-$ colliders. We also point out that the presented framework should allow for similar optimisations at any future machine, including high energy proton–proton or Muon colliders, acknowledging however that further caution is required due to the lack of simulation of larger background levels.

Data Availability

This manuscript has no associated data or the data will not be deposited. [Authors’ comment: This study is based on large simulated Montecarlo samples and as such the corresponding data cannot be provided easily. However all the necessary information needed to produce the samples is provided in the article.]

Notes

The minimum number of hits for a track to be reconstructed is a configurable parameter in $\textsc {Delphes}$.
We note that in the case of the FCC-ee the typical momenta of charged particles in hadronic Higgs decays are in the range of a few GeV.

References

FCC Collaboration, A. Abada et al., FCC-ee: the lepton collider: future circular collider conceptual design report volume 2. Eur. Phys. J. ST 228, 261 (2019)
CEPC Study Group Collaboration, M. Dong et al., CEPC conceptual design report: volume 2—physics & detector. arXiv:1811.10545 [hep-ex]
The international linear collider technical design report—volume 2: physics. arXiv:1306.6352 [hep-ph]
J. Tian, K. Fujii, Summary of Higgs coupling measurements with staged running of ILC at 250 GeV, 500 GeV and 1 TeV. Technical Report. LC-REP-2013-021, DESY (2013)
CLICdp, CLIC Collaboration, T.K. Charles et al., The compact linear collider (CLIC)—2018 summary report. arXiv:1812.06018 [physics.acc-ph]
M. Benedikt, M. Capeans Garrido, F. Cerutti, B. Goddard, J. Gutleber, J.M. Jimenez et al., Future circular collider study. Volume 3: the hadron collider (FCC-hh). Technical Report. CERN-ACC-2018-0058, CERN, Geneva (2018). https://cds.cern.ch/record/2651300
D.M. Asner et al., ILC Higgs white paper, in Community Summer Study 2013: Snowmass on the Mississippi (2013). arXiv:1310.0763 [hep-ph]
M. Thomson, Model-independent measurement of the e$^{{+}}$ e$^{-}$$\rightarrow $ HZ cross section at a future e$^{{+}}$ e$^{-}$ linear collider using hadronic Z decays. Eur. Phys. J. C 76, 72 (2016). arXiv:1509.02853 [hep-ex]
Article ADS Google Scholar
H. Abramowicz et al., Higgs physics at the CLIC electron–positron linear collider. Eur. Phys. J. C 77, 475 (2017). arXiv:1608.07538 [hep-ex]
Article ADS Google Scholar
J. de Blas et al., Higgs Boson studies at future particle colliders. JHEP 01, 139 (2020). arXiv:1905.03764 [hep-ph]
Article ADS Google Scholar
F. An et al., Precision Higgs physics at the CEPC. Chin. Phys. C 43, 043002 (2019). arXiv:1810.09037 [hep-ex]
Article ADS Google Scholar
L. Borgonovi, S. Braibant, B. Di Micco, E. Fontanesi, P. Harris, C. Helsens et al., Higgs measurements at FCC-hh. Technical Report. CERN-ACC-2018-0045, CERN, Geneva (2018). https://cds.cern.ch/record/2642471
M. Koratzinos et al., TLEP: a high-performance circular $e^+e^-$ collider to study the Higgs boson, in 4th International Particle Accelerator Conference (2013), p. TUPME040. arXiv:1305.6498 [physics.acc-ph]
M.L. Mangano, G. Ortona, M. Selvaggi, Measuring the Higgs self-coupling via Higgs-pair production at a 100 TeV p-p collider. Eur. Phys. J. C 80, 1030 (2020). arXiv:2004.03505 [hep-ph]
Article ADS Google Scholar
K. Seidel, F. Simon, M. Tesar, S. Poss, Top quark mass measurements at and above threshold at CLIC. Eur. Phys. J. C 73, 2530 (2013). arXiv:1303.3758 [hep-ex]
Article ADS Google Scholar
P. Janot, Top-quark electroweak couplings at the FCC-ee. JHEP 04, 182 (2015). arXiv:1503.01325 [hep-ph]
Article ADS Google Scholar
M.L. Mangano, T. Plehn, P. Reimitz, T. Schell, H.-S. Shao, Measuring the top Yukawa coupling at 100 TeV. J. Phys. G 43, 035001 (2016). arXiv:1507.08169 [hep-ph]
Article ADS Google Scholar
P. Azzi, L. Gouskos, M. Selvaggi, F. Simon, Higgs and top physics reconstruction challenges and opportunities at FCC-ee. Eur. Phys. J. Plus 137, 39 (2022). arXiv:2107.05003 [hep-ex]
Article Google Scholar
DELPHI Collaboration, J. Abdallah et al., b tagging in DELPHI at LEP. Eur. Phys. J. C 32, 185 (2004). arXiv:hep-ex/0311003
J. Proriol, A. Falvard, P. Henrard, J. Jousset, B. Brandl, Tagging B quark events in ALEPH with neural networks: comparison of different methods. Int. J. Neural Syst. 3(Supp.), 267 (1991)
Google Scholar
D0 Collaboration, V.M. Abazov et al., $b$-Jet identification in the D0 experiment. Nucl. Instrum. Methods A 620, 490 (2010). arXiv:1002.4224 [hep-ex]
J. Freeman, T. Junk, M. Kirby, Y. Oksuzian, T.J. Phillips, F.D. Snider et al., Introduction to HOBIT, a b-jet identification tagger at the CDF experiment optimized for light Higgs boson searches. Nucl. Instrum. Methods A 697, 64 (2013). arXiv:1205.1812 [hep-ex]
Article ADS Google Scholar
ATLAS Collaboration, Performance of the ATLAS secondary vertex b-tagging algorithm in 7 TeV collision data. Technical Report. ATLAS-CONF-2010-042, CERN, Geneva (2010). https://cds.cern.ch/record/1277682
ATLAS Collaboration, Tracking studies for $b$-tagging with 7 TeV collision data with the ATLAS detector. Technical Report. ATLAS-CONF-2010-070, CERN, Geneva (2010). https://cds.cern.ch/record/1281352
ATLAS Collaboration, Performance of impact parameter-based b-tagging algorithms with the ATLAS detector using proton–proton collisions at $\sqrt{s} = 7$ TeV. Technical Report. ATLAS-CONF-2010-091, CERN, Geneva (2010). https://cds.cern.ch/record/1299106
CMS Collaboration, b-Jet identification in the CMS experiment. Technical Report. CMS-PAS-BTV-11-004, CERN, Geneva (2012). http://cds.cern.ch/record/1427247
M. Battaglia, Jet flavor identification at the CLIC multi TeV e+ e$-$ collider. AIP Conf. Proc. 578, 813 (2001). arXiv:hep-ex/0011099
Article ADS Google Scholar
T. Suehara, T. Tanabe, LCFIPlus: a framework for jet analysis in linear collider studies. Nucl. Instrum. Methods A 808, 109 (2016). arXiv:1506.08371 [physics.ins-det]
Article ADS Google Scholar
Shape ATLAS Collaboration, Performance of b-jet identification in the ATLAS experiment. JINST 11, P04008 (2016)
Article Google Scholar
Shape CMS Collaboration, Identification of heavy, energetic, hadronically decaying particles using machine-learning techniques. J. Instrum. 15, P06005 (2020). arXiv:arXiv:2004.08262
Article Google Scholar
ATLAS Collaboration, Identification of jets containing $b$-hadrons with recurrent neural networks at the ATLAS experiment. Technical Report. ATL-PHYS-PUB-2017-003, CERN, Geneva (2017). https://cds.cern.ch/record/2255226
E. Bols, J. Kieseler, M. Verzetti, M. Stoye, A. Stakia, Jet flavour classification using DeepJet. JINST 15, P12012 (2020). arXiv:2008.10519 [hep-ex]
Article ADS Google Scholar
CMS Collaboration, A.M. Sirunyan et al., Particle-flow reconstruction and global event description with the CMS detector. JINST 12, P10003 (2017). arXiv:1706.04965 [physics.ins-det]
J. Duarte-Campderros, G. Perez, M. Schlaffer, A. Soffer, Probing the Higgs–strange-quark coupling at $e^+e^-$ colliders using light-jet flavor tagging. Phys. Rev. D 101, 115005 (2020). arXiv:1811.09636 [hep-ph]
Article ADS Google Scholar
Y. Nakai, D. Shih, S. Thomas, Strange jet tagging. arXiv:2003.09517 [hep-ph]
SLD Collaboration, K. Abe et al., First direct measurement of the parity violating coupling of the Z0 to the s quark. Phys. Rev. Lett. 85, 5059 (2000). arXiv:hep-ex/0006019
TrackCovariance module in Delphes. https://github.com/delphes/delphes/blob/master/modules/TrackCovariance.cc
TimeOfFlight module in Delphes. https://github.com/delphes/delphes/blob/master/modules/TimeOfFlight.cc
ClusterCounting module in Delphes. https://github.com/delphes/delphes/blob/master/modules/ClusterCounting.cc
H. Qu, L. Gouskos, ParticleNet: jet tagging via particle clouds. Phys. Rev. D 101, 056019 (2020). arXiv:1902.08570 [hep-ph]
Article ADS Google Scholar
F. Bedeschi, A detector concept proposal for a circular $e^+e^-$ collider. PoS ICHEP2020, 819 (2021)
FCC-ee IDEA detector Delphes card. https://github.com/delphes/delphes/blob/master/cards/delphes_card_IDEA.tcl
Z. Drasal, W. Riegler, An extension of the Gluckstern formulae for multiple scattering: analytic expressions for track parameter resolution using optimum weights. Nucl. Instrum. Methods A 910, 127 (2018). arXiv:1805.12014 [physics.ins-det]
Article ADS Google Scholar
F. Bedeschi, Fast tracking simulation. https://indico.cern.ch/event/783429/contributions/3376675/attachments/1829951/3712651/Oxford_April2019_V1.pdf
A.H. Walenta, The time expansion chamber and single ionization cluster measurement. IEEE Trans. Nucl. Sci. 26, 73 (1979)
Article ADS Google Scholar
J.-F. Caron et al., Improved particle identification using cluster counting in a full-length drift chamber prototype. Nucl. Instrum. Methods A 735, 169 (2014). arXiv:1307.8101 [physics.ins-det]
Article ADS Google Scholar
I.B. Smirnov, Modeling of ionization produced by fast charged particles in gases. Nucl. Instrum. Methods A 554, 474 (2005)
Article ADS Google Scholar
R. Veenhof, GARFIELD, recent developments. Nucl. Instrum. Methods A 419, 726 (1998)
Article ADS Google Scholar
J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer et al., The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. JHEP 07, 079 (2014). arXiv:1405.0301 [hep-ph]
Article ADS MATH Google Scholar
T. Sjöstrand, S. Ask, J.R. Christiansen, R. Corke, N. Desai, P. Ilten et al., An introduction to PYTHIA 8.2. Comput. Phys. Commun. 191, 159 (2015). arXiv:1410.3012 [hep-ph]
DualReadoutCalorimeter module in Delphes. https://github.com/delphes/delphes/blob/master/modules/DualReadoutCalorimeter.cc
M. Cacciari, G.P. Salam, G. Soyez, FastJet user manual. Eur. Phys. J. C72, 1896 (2012). arXiv:1111.6097 [hep-ph]
Article ADS MATH Google Scholar
M. Cacciari, G.P. Salam, G. Soyez, The anti-$k_t$ jet clustering algorithm. JHEP 04, 063 (2008). arXiv:0802.1189 [hep-ph]
Article ADS MATH Google Scholar
S. Catani, Y. Dokshitzer, M. Olsson, G. Turnock, B. Webber, New clustering algorithm for multijet cross sections in ${e^{+}e^{-}}$annihilation. Phys. Lett. B 269, 432 (1991)
Article ADS Google Scholar
V. Mikuni, F. Canelli, ABCNet: an attention-based method for particle tagging. Eur. Phys. J. Plus 135, 463 (2020). arXiv:2001.05311 [physics.data-an]
Article Google Scholar
V. Mikuni, F. Canelli, Point cloud transformers applied to collider physics. Mach. Learn. Sci. Technol. 2, 035027 (2021). arXiv:2102.05073 [physics.data-an]
Article Google Scholar
E.A. Moreno, O. Cerri, J.M. Duarte, H.B. Newman, T.Q. Nguyen, A. Periwal et al., JEDI-net: a jet identification algorithm based on interaction networks. Eur. Phys. J. C 80, 58 (2020). arXiv:1908.05318 [hep-ex]
Article ADS Google Scholar
E.A. Moreno, T.Q. Nguyen, J.-R. Vlimant, O. Cerri, H.B. Newman, A. Periwal et al., Interaction networks for the identification of boosted $H \rightarrow b\overline{b}$ decays. Phys. Rev. D 102, 012010 (2020). arXiv:1909.12285 [hep-ex]
Article ADS Google Scholar
E. Bernreuther, T. Finke, F. Kahlhoefer, M. Krämer, A. Mück, Casting a graph net to catch dark showers. SciPost Phys. 10, 046 (2021). arXiv:2006.08639 [hep-ph]
Article ADS Google Scholar
J. Guo, J. Li, T. Li, R. Zhang, Boosted Higgs boson jet reconstruction via a graph neural network. Phys. Rev. D 103, 116025 (2021). arXiv:2010.05464 [hep-ph]
Article ADS Google Scholar
F.A. Dreyer, H. Qu, Jet tagging in the Lund plane with graph networks. JHEP 03, 052 (2021). arXiv:2012.08526 [hep-ph]
Article ADS Google Scholar
P. Konar, V.S. Ngairangbam, M. Spannowsky, Energy-weighted message passing: an infra-red and collinear safe graph neural network algorithm. arXiv:2109.14636 [hep-ph]
M.J. Dolan, A. Ore, Equivariant energy flow networks for jet tagging. Phys. Rev. D 103, 074022 (2021). arXiv:2012.00964 [hep-ph]
Article ADS Google Scholar
P.T. Komiske, E.M. Metodiev, J. Thaler, Energy flow networks: deep sets for particle jets. JHEP 01, 121 (2019). arXiv:1810.05165 [hep-ph]
Article ADS Google Scholar
H. Serviansky, N. Segol, J. Shlomi, K. Cranmer, E. Gross, H. Maron et al., Set2Graph: learning graphs from sets. arXiv:2002.08772 [cs.LG]
J. Shlomi, S. Ganguly, E. Gross, K. Cranmer, Y. Lipman, H. Serviansky et al., Secondary vertex finding in jets with neural networks. Eur. Phys. J. C 81, 540 (2021). arXiv:2008.02831 [hep-ex]
Article ADS Google Scholar
J. Cogan, M. Kagan, E. Strauss, A. Schwarztman, Jet-images: computer vision inspired techniques for jet tagging. JHEP 02, 118 (2015). arXiv:1407.5675 [hep-ph]
Article ADS Google Scholar
L.G. Almeida, M. Backović, M. Cliche, S.J. Lee, M. Perelstein, Playing tag with ANN: boosted top identification with pattern recognition. JHEP 07, 086 (2015). arXiv:1501.05968 [hep-ph]
Article ADS Google Scholar
L. de Oliveira, M. Kagan, L. Mackey, B. Nachman, A. Schwartzman, Jet-images—deep learning edition. JHEP 07, 069 (2016). arXiv:1511.05190 [hep-ph]
P. Baldi, K. Bauer, C. Eng, P. Sadowski, D. Whiteson, Jet substructure classification in high-energy physics with deep neural networks. Phys. Rev. D 93, 094034 (2016). arXiv:1603.09349 [hep-ex]
Article ADS Google Scholar
J. Lin, M. Freytsis, I. Moult, B. Nachman, Boosting $H\rightarrow b\bar{b}$ with machine learning. JHEP 10, 101 (2018). arXiv:1807.10768 [hep-ph]
Article ADS Google Scholar
J. Barnard, E.N. Dawe, M.J. Dolan, N. Rajcic, Parton shower uncertainties in jet substructure analyses with deep neural networks. Phys. Rev. D 95, 014018 (2017). arXiv:1609.00607 [hep-ph]
Article ADS Google Scholar
P.T. Komiske, E.M. Metodiev, M.D. Schwartz, Deep learning in color: towards automated quark/gluon jet discrimination. JHEP 01, 110 (2017). arXiv:1612.01551 [hep-ph]
Article ADS MATH Google Scholar
G. Kasieczka, T. Plehn, M. Russell, T. Schell, Deep-learning top taggers or the end of QCD? JHEP 05, 006 (2017). arXiv:1701.08784 [hep-ph]
Article ADS Google Scholar
S. Macaluso, D. Shih, Pulling out all the tops with computer vision and deep learning. JHEP 10, 121 (2018). arXiv:1803.00107 [hep-ph]
Article ADS Google Scholar
S. Choi, S.J. Lee, M. Perelstein, Infrared safety of a neural-net top tagging algorithm. JHEP 02, 132 (2019). arXiv:1806.01263 [hep-ph]
Article ADS Google Scholar
D. Guest, J. Collado, P. Baldi, S.-C. Hsu, G. Urban, D. Whiteson, Jet flavor classification in high-energy physics with deep neural networks. Phys. Rev. D 94, 112002 (2016). arXiv:1607.08633 [hep-ex]
Article ADS Google Scholar
J. Pearkes, W. Fedorko, A. Lister, C. Gay, Jet constituents for deep neural network based top quark tagging. arXiv:1704.02124 [hep-ex]
S. Egan, W. Fedorko, A. Lister, J. Pearkes, C. Gay, Long short-term memory (LSTM) networks with jet constituents for boosted top tagging at the LHC. arXiv:1711.09059 [hep-ex]
K. Fraser, M.D. Schwartz, Jet charge and machine learning. JHEP 10, 093 (2018). arXiv:1803.08066 [hep-ph]
Article ADS Google Scholar
A. Butter, G. Kasieczka, T. Plehn, M. Russell, Deep-learned top tagging with a Lorentz layer. SciPost Phys. 5, 028 (2018). arXiv:1707.08966 [hep-ph]
Article ADS Google Scholar
G. Kasieczka, N. Kiefer, T. Plehn, J.M. Thompson, Quark-gluon tagging: machine learning vs detector. SciPost Phys. 6, 069 (2019). arXiv:1812.09223 [hep-ph]
Article ADS Google Scholar
M. Erdmann, E. Geiser, Y. Rath, M. Rieger, Lorentz boost networks: autonomous physics-inspired feature engineering. JINST 14, P06006 (2019). arXiv:1812.09722 [hep-ex]
Article ADS Google Scholar
G. Louppe, K. Cho, C. Becot, K. Cranmer, QCD-aware recursive neural networks for jet physics. JHEP 01, 057 (2019). arXiv:1702.00748 [hep-ph]
Article ADS Google Scholar
T. Cheng, Recursive neural networks in quark/gluon tagging. Comput. Softw. Big Sci. 2, 3 (2018). arXiv:1711.02633 [hep-ph]
Article Google Scholar
Y. Wang, Y. Sun, Z. Liu, S.E. Sarma, M.M. Bronstein, J.M. Solomon, Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. 38 (2019)
H. Qu, Weaver. https://github.com/hqucms/weaver
F. James, Monte Carlo theory and practice. Rep. Prog. Phys. 43, 1145 (1980)
Article ADS Google Scholar

Download references

Acknowledgements

We acknowledge the support from CERN and INFN for this work. In particular, we would like to thank our colleagues from the FCC physics, experiments and detectors group (PED). In particular we thank Patrizia Azzi, Alain Blondel, Patrick Janot, Emmanuel Perez and Gavin Salam for helpful discussions and suggestions. We are also grateful to Sylvie Braibant for the extensive testing of the $\texttt {TrackCovariance}$ module in $\textsc {Delphes}$ and to Pavel Demin and Huilin Qu for the valuable help and support on the $\textsc {Delphes}$ and $\textsc {Weaver}$ frameworks respectively.

Author information

Authors and Affiliations

INFN Sezione di Pisa, Pisa, Italy
Franco Bedeschi
CERN, 1211, Geneva 23, Switzerland
Loukas Gouskos & Michele Selvaggi

Authors

Franco Bedeschi
View author publications
You can also search for this author in PubMed Google Scholar
Loukas Gouskos
View author publications
You can also search for this author in PubMed Google Scholar
Michele Selvaggi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Michele Selvaggi.

Appendix A: Tracking speed optimisation and randomisation

1.1 A.1 Speed optimization

The method for the computation of the covariance matrix, presented in Sect. 2, involves the inversion of the covariance matrix of the measurements, S. This matrix can be of order larger than 100 in many practical applications and becomes an obvious speed bottleneck. This problem is solved by generating a set of track covariance matrices in a transverse momentum-polar angle grid during the initialization stage. This grid is loaded in memory and then the track covariance matrices are calculated by means of a bi-linear interpolation over this grid. Care is taken to choose the grid nodes so that the geometry transitions are mapped accurately. In addition, since a linear combination of positive definite symmetric matrices is not always positive definite, we correct for the loss of positive definiteness in the extremely rare cases when this happens. A similar grid mapping the number of measurement points on the track is used to define if the track has a sufficient number of hits to be reconstructed.

This method works well for tracks originating close to the primary interaction point, but clearly cannot describe correctly tracks that start inside the tracking volume such as, for instance, tracks from the decay of long lived particle (such as $K^0_S$ or $\Lambda ^0$). Since these track categories are a small fraction of the total, the full covariance calculation can be used without affecting much the overall simulation speed in these cases.

1.2 A.2 Randomization

Once the “true” track parameters and their covariance matrix have been determined, pseudo-reconstructed tracks are generated by doing a Cholesky decomposition [88] of the covariance matrix. This process transforms a symmetric positive definite matrix, C, in the product of an upper diagonal matrix, U, and its transposed: $C = ~ U^t\,\cdot U$. A vector, $\vec {r}$, of five Gaussian distributed random numbers with mean zero and standard deviation equal to one is generated and used to obtain the resolution smeared track parameters, $\vec {\alpha }'$:

$$\begin{aligned} \vec {\alpha }' = \vec {\alpha } + U^t\,\vec {r}. \end{aligned}$$

(A.1)

It is easily shown that the covariance of the parameters $\vec {\alpha }'$ is exactly C:

$$\begin{aligned}<(\vec {\alpha }'- \vec {\alpha }) (\vec {\alpha }'- \vec {\alpha })^t\,>= U^t\,<\vec {r} \, \vec {r}^t\,>U = U^t\, I U = C \end{aligned}$$

(A.2)

where the angle brackets indicate the average over many random number extractions.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Funded by SCOAP³. SCOAP³ supports the goals of the International Year of Basic Sciences for Sustainable Development.

Reprints and permissions

About this article

Cite this article

Bedeschi, F., Gouskos, L. & Selvaggi, M. Jet flavour tagging for future colliders with fast simulation. Eur. Phys. J. C 82, 646 (2022). https://doi.org/10.1140/epjc/s10052-022-10609-1

Download citation

Received: 23 February 2022
Accepted: 13 July 2022
Published: 26 July 2022
DOI: https://doi.org/10.1140/epjc/s10052-022-10609-1

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Jet flavour tagging for future colliders with fast simulation

Abstract

Similar content being viewed by others

ATLAS flavour-tagging algorithms for the LHC Run 2 pp collision dataset

Jet tagging in the Lund plane with graph networks

Jet classification using high-level features from anatomy of top jets

1 Introduction

2 Fast tracking simulation