Abstract
We present an end-to-end reconstruction algorithm to build particle candidates from detector hits in next-generation granular calorimeters similar to that foreseen for the high-luminosity upgrade of the CMS detector. The algorithm exploits a distance-weighted graph neural network, trained with object condensation, a graph segmentation technique. Through a single-shot approach, the reconstruction task is paired with energy regression. We describe the reconstruction performance in terms of efficiency as well as in terms of energy resolution. In addition, we show the jet reconstruction performance of our method and discuss its inference computational cost. To our knowledge, this work is the first-ever example of single-shot calorimetric reconstruction of \({\mathcal {O}}(1000)\) particles in high-luminosity conditions with 200 pileup.
Introduction
The high-luminosity upgrade of the Large Hadron Collider (HL-LHC) will present unprecedented computing challenges [1]. Because the processing complexity of an LHC collision event scales with the number of hits and energy deposits left in the detectors by interacting particles, the computing resource needs will increase significantly as a function of the number of simultaneous proton-proton collisions at each particle beam crossing (pileup). In addition, particle detectors with irregular geometries, motivated by the need to withstand the high-radiation environment of high-pileup events, prohibit algorithms that view the detector as a simple grid and call for more sophisticated approaches.
One of the most demanding computing tasks is the so-called local event reconstruction, i.e., the task of clustering detector hits (energy deposits left by particles on various detector sensors) into particle candidates. Similar to semantic segmentation in computer vision, particle reconstruction is more than a clustering task: reconstructing a clustered object implies associating it with one particle category (a classification task) and determining the particle energy and flight direction (a regression task). In the future, the adoption of 5D reconstruction (3D position, energy, and particle time of arrival) in both hardware and software would imply an additional regression task for the arrival time.
In traditional approaches, particle reconstruction follows a two-step strategy: first, clusters are built, and then classification and regression tasks are performed on these clusters. In some cases, the regression and classification steps employ machine learning algorithms. Boosted decision trees have been used most extensively for this purpose at the LHC. An example of this strategy is discussed in Ref. [2], in the case of the CMS electromagnetic calorimeter.
In LHC proton-proton collisions, two bunches of \(\sim 10^{11}\) protons each are brought to collision in order to increase the probability of rare and interesting interactions (e.g., the production of a Higgs boson). Because of this, a single collision event contains the particles resulting from more than one proton-proton interaction (the primary particles). These particles travel through the detector components and, when crossing a calorimeter, produce showers of other particles (secondary particles). During Run II, an average LHC collision event consisted of \(\sim 40\) pileup collisions, resulting in \({\mathcal {O}}(1000)\) primary particles. At the HL-LHC, up to 200 pileup collisions per event are expected. Traditional event reconstruction algorithms are challenged by the large primary-particle multiplicity because of the fast scaling of their memory utilization and execution time with the cardinality of the problem. Limiting the increase of computing resource consumption at large pileup is a necessary step for the success of the HL-LHC physics program [1]. The use of modern machine learning techniques to perform particle reconstruction has been advocated as a possible solution to this problem [3].
In this paper, we present an end-to-end reconstruction algorithm, which takes as input a collection of detector hits in a highly granular calorimeter, similar to that under construction by the CMS collaboration [4], in view of the HL-LHC upgrade [5]. The algorithm returns identified particle candidates and their momenta. The algorithm consists of a distance-weighted graph neural network [6, 7], trained using object condensation [8]. At training time, the loss function is built by combining the clustering and regression tasks. In this respect, the presented algorithm is an example of a single-shot graph segmentation model that could find applications beyond the domain of particle physics.
We discuss the algorithm performance in terms of accuracy and computational costs, considering in particular how its performance scales with increasing pileup. We consider both single-particle response and jet reconstruction to evaluate the reconstruction algorithm both at particle level and with higher-level objects. To our knowledge, this study represents the first demonstration of end-to-end calorimeter reconstruction at high pileup exploiting neural networks.
This paper is organized as follows: Sect. 2 discusses the existing literature related to this work; Sect. 3 describes the detector geometry and the data generation workflow; Sect. 4 describes the dataset format and its preprocessing; Sects. 5 and 6 describe the model architecture and the optimization metric, respectively; Sect. 7 discusses the inference clustering algorithm, its improvements over the original version of object condensation, and its computational costs; Sect. 8 shows the physics performance of the algorithm; conclusions are given in Sect. 9.
Related work
In recent years, GPU acceleration has been investigated as a means to speed up traditional particle-reconstruction algorithms by parallelization [9,10,11,12,13,14,15]. In view of these promising results, the LHC experimental collaborations have invested resources to migrate their computing infrastructure towards CPU+GPU heterogeneous computing [16,17,18]. This transition offers the possibility to exploit neural networks for the same task while benefiting from the impressive technological development in Artificial Intelligence (AI) inference on CPU+GPU heterogeneous platforms. Leveraging this technology trend, we are investigating the possibility of using solutions entirely based on AI algorithms to accomplish calorimetric reconstruction tasks [6, 19, 20]. A similar effort has already been established for particle tracking [21,22,23,24].
Neural networks have also been used in classification and regression tasks on portions of calorimeters [25,26,27]. The use of neural networks for end-to-end reconstruction goes beyond calorimeter reconstruction or tracking. Computer vision techniques based on convolutional and graph neural networks have been used for event-topology classification directly from the energy map of detector hits [28,29,30,31]. Deep neural networks have been exploited as a tool to cluster high-level objects, e.g., particle-flow candidates [32] and jets [33, 34]. All these studies demonstrate that modern AI algorithms give significant accuracy and computational efficiency advantages, maximally benefiting from highly parallelizable hardware. For this reason, and given the popularity of these techniques outside the domain of particle physics, it is natural to assume that AI algorithms will be at the center of future technological developments and, consequently, will play a crucial role in the evolution of the computing model at future collider experiments.
Detector
For this study, we use a dedicated simulation of a highly granular calorimeter based on GEANT4 [35]. The detector geometry is a simplified version of the CMS high-granularity calorimeter (HGCAL) [36], expected to be integrated into the CMS detector for the HL-LHC.
To describe the detector, we use a right-handed Cartesian coordinate system with the z-axis oriented along the beam axis, the x-axis pointing toward the center of the LHC, and the y-axis oriented upward, as shown in Fig. 1. The x and y axes define the transverse plane, while the z-axis identifies the longitudinal direction. The azimuth angle \(\phi \) is computed with respect to the x-axis. Its value is given in radians, in the [\(-\pi , \pi \)] range. The polar angle \(\theta \) is measured from the positive z-axis and is used to compute the pseudorapidity \(\eta = -\log (\tan (\theta /2))\). The transverse momentum (\(p_T\)) is the projection of the particle momentum on the (x, y) plane. We use natural units such that \(c=\hslash =1\), and we express energy in units of electronvolt (eV) and its prefix multipliers.
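As an illustration, the coordinate conventions above can be expressed in a few lines of code. This is a minimal sketch; the function name and interface are ours, not part of any reconstruction software:

```python
import math

def to_detector_coords(px, py, pz):
    """Convert a Cartesian momentum vector to (pT, eta, phi).

    pT is the projection on the transverse (x, y) plane, phi is the
    azimuth in [-pi, pi] measured from the x-axis, and eta is the
    pseudorapidity eta = -log(tan(theta/2)), with theta the polar
    angle measured from the positive z-axis.
    """
    pt = math.hypot(px, py)
    phi = math.atan2(py, px)          # already in [-pi, pi]
    theta = math.atan2(pt, pz)        # polar angle from the +z axis
    eta = -math.log(math.tan(theta / 2.0))
    return pt, eta, phi
```

For example, a particle flying in the transverse plane has \(\eta = 0\), while increasing the longitudinal momentum component pushes \(\eta \) toward the beam axis values covered by the endcap.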
Like the CMS HGCAL, our detector is positioned in the endcap region of a cylinder-shaped \(4\pi \) detector. The endcap calorimeter covers the \(1.5<\eta <3.0\) pseudorapidity range. Its longitudinal and transverse cross-section views are shown in Fig. 2. The detector design is based on a sampling geometry structured in three blocks:

1.
14\(\times \) Electromagnetic layers

2.
12\(\times \) Hadronic layers with thin absorbers

3.
16\(\times \) Hadronic layers with thick absorbers
A schematic representation of the longitudinal sampling structure is shown in Fig. 3, where the layer thickness in z and the materials of which each layer is composed are outlined. The electromagnetic section of the detector corresponds to 17 radiation lengths and 1.3 nuclear interaction lengths, while the hadronic section corresponds to about 10 nuclear interaction lengths.
Since each electromagnetic layer consists of two silicon-sensor planes, there are, in total, 56 silicon layers. When projected onto the (\(\eta \),\(\phi \)) plane, the first silicon layer consists of square-shaped \(0.02\times 0.02\) sensors. The sensor size increases linearly with the layer depth, reaching \(0.07\times 0.07\) for the last layer. In total, each endcap consists of 778,712 sensors. Due to cost-related considerations, the CMS HGCAL geometry is characterized by hexagon-shaped sensors. While based on simpler square-shaped sensors, our detector has a comparable granularity and is complex enough to faithfully represent the reconstruction challenges of the HGCAL (e.g., in terms of image sparsity, average occupancy, and image resolution) while making the GEANT4 simulation of the detector more tractable.
Simulation and event generation
Data generation starts with a GEANT4 simulation of individual particles or individual proton collisions, which are later combined to form more complex events, e.g., events with pileup. In each simulation, one or more primary particles are produced at the interaction point (0, 0, 0). These particles travel in empty space, since no simulation of an inner tracking detector or magnetic field is part of our simplified setup. Some of the primary particles reach the calorimeter and interact with its material, starting a showering process that creates secondary particles \(P_{\mathrm {secondary}}\). These secondary particles leave energy deposits, called hits, on the sensors of the silicon layers. Each hit on a sensor is associated with a secondary particle. Four types of interactions are simulated:

1.
Type A: Single-particle simulations for training. The particles are randomly chosen as \(e^-\), \(\gamma \), \(\pi ^{+}\), \(\pi ^{0}\) or \(\tau \), with momentum coordinates uniformly distributed in \(E \in [0.1, 200]\) GeV, \(\eta \in [1.4, 3.1]\), and \(\phi \in [0, 2\pi ]\). \(3.1\cdot 10^5\) simulations are generated, all of which are used for training.

2.
Type B: Stable single-particle simulations generated 1 mm away from the detector, as if the particles were coming from the interaction point in a straight line, used for testing the performance of the models. The particles are randomly chosen as \(e^-\), \(\gamma \), or \(\pi ^{+}\), with momentum coordinates uniformly distributed in \(E \in [0.1, 200]\) GeV, \(\eta \in [1.6, 2.9]\), and \(\phi \in [0, 2\pi ]\). 80,000 simulations are generated, all of which are used for testing.

3.
Minimum bias: Synthetic minimum bias proton-proton interactions, generated at a center-of-mass energy of \(\sqrt{s}=13\) TeV (as during Run II of the LHC), using PYTHIA8 [37]. \(3.1\cdot 10^5\) and \(2\cdot 10^5\) simulations are generated for training and testing, respectively. These simulations are used to generate pileup for both training and testing.

4.
t\(\bar{\text {t}}\): Synthetic \(q{\overline{q}} \rightarrow t{\overline{t}}\) events generated at \(\sqrt{s}=13\) TeV using PYTHIA8. This sample is used to study the jet reconstruction accuracy. The \(q {\bar{q}}\) production mechanism is selected in order to maximize the fraction of events that produce primary particles in the endcap region. 40,000 simulations are generated, all of which are used for testing.
As a preprocessing step, the raw energy deposit on a sensor returned by GEANT4 is calibrated by rescaling the hit energy according to a multiplicative factor \(c_i\), defined as:

\(c_i = \frac{w_s(i) + w_a(i)}{w_s(i)},\)
where \(w_s(i)\) is the width in mm of the relevant sensor and \(w_a(i)\) is that of the absorber layer in front of it. Starting from \(c_i\), we then apply a global calibration factor \({\hat{g}}\), which is computed by minimizing the squared difference between the energy of the incoming particle and the deposited hit energy, using a single-particle calibration dataset \(D_{\mathrm {calib}}\):

\({\hat{g}} = \mathop{\mathrm{arg\,min}}_{g} \sum _{S \in D_{\mathrm {calib}}} \Big ( E(p_S) - g \sum _{i \in {\mathbb {S}}} c_i\, d_{\mathrm {raw}}(i, p_S) \Big )^2,\)
where \(p_S\) labels the unique primary particle in the event S, \(E(p_S)\) its energy, i the sensor in the ensemble \({\mathbb {S}}\) of all sensors, and \(d_\mathrm {raw}(i,p_S)\) the raw energy deposit left on i by \(p_S\). The \(D_{\mathrm {calib}}\) sample, a subsample of the training dataset, is defined by requiring all particles to be photons with \(1.8< \eta (p_{S}) < 2.8\), and consists of 240 single-photon events with energies between 4 and 200 GeV. Figure 4 shows the calibration performance.
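Since the global calibration is a one-parameter least-squares fit, it admits a closed-form solution obtained by setting the derivative of the squared difference to zero. A minimal sketch, assuming the width-corrected per-event deposits have already been summed:

```python
def global_calibration_factor(true_energies, summed_deposits):
    """Least-squares global calibration factor g-hat.

    Minimizes sum_S (E(p_S) - g * D_S)^2 over the calibration sample,
    where D_S is the width-corrected total deposit of event S; the
    closed-form minimizer is g = sum(E * D) / sum(D^2).
    """
    num = sum(e * d for e, d in zip(true_energies, summed_deposits))
    den = sum(d * d for d in summed_deposits)
    return num / den
```

For instance, if every event deposits exactly half of its true energy, the fitted factor is 2, as expected.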
Several simulations (\({\mathbf {S}}\)) are combined to form a full event, as outlined in Procedure 1. The ultimate task on this full event is to reconstruct every particle that enters the detector. For this purpose, the ground-truth clusters (T) that the network is trained to learn are built from all the secondary particles (\(P_{\mathrm {secondary}}\)) of the combined simulations, and not only from the primary ones. The raw deposits on the sensors from different simulations are added together. In order to emulate realistic detector conditions, detector noise is added to the deposited energy at generator level, as specified in Procedure 1: 6. The detector noise model consists of the generation of spurious energy measurements in the detector sensors, distributed according to a Gaussian probability density function centered at 0 and with a variance of \(5\cdot 10^{-6}\). An example of detector noise is shown in Fig. 5 (left).
All sensors with uncalibrated deposited energy per sensor area (\({\overline{e}}/A\)) greater than \(\rho \) are considered as the reconstructed hits (rechits) in the event (Procedure 1: 12). We choose the constant \(\rho =1.3\cdot 10^{-7}\) GeV/mm\(^2\), which corresponds to an uncalibrated energy ranging from \(5.5\cdot 10^{-3}\) MeV for the smallest sensor to 3 MeV for the largest one. The set of rechits H in the event is given as input to the network. It is represented as a feature vector \(V_{\mathrm {feat}}(h) = [r, \eta , \phi , x, y, z, \mathrm {A}, e]_h\), defined \(\forall h \in H\), where \(r, \eta , \phi \) (x, y, z) are the boost-invariant cylindrical coordinates (Cartesian coordinates) of the sensor, \(\mathrm {A}\) is its area, and e is the deposited energy. Multiple particles can leave an energy deposit in a single sensor; however, the particle that leaves the highest deposit is considered as the true candidate (t(s)) for that hit (Procedure 1: 9).
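The rechit selection and feature construction can be sketched as follows. The dictionary field names are illustrative assumptions, not the actual data format used in this work:

```python
import math

RHO = 1.3e-7  # GeV/mm^2, energy-density threshold for rechits

def build_rechit_features(sensors):
    """Select rechits and build their input feature vectors.

    `sensors` is a list of dicts with uncalibrated deposited energy
    `e_raw`, area `A` (mm^2), calibrated deposited energy `e`, and
    Cartesian position (x, y, z). A sensor becomes a rechit if
    e_raw / A exceeds RHO; the feature vector is
    [r, eta, phi, x, y, z, A, e].
    """
    feats = []
    for s in sensors:
        if s["e_raw"] / s["A"] <= RHO:
            continue  # below threshold: not a rechit
        x, y, z = s["x"], s["y"], s["z"]
        r = math.hypot(x, y)
        phi = math.atan2(y, x)
        theta = math.atan2(r, z)
        eta = -math.log(math.tan(theta / 2.0))
        feats.append([r, eta, phi, x, y, z, s["A"], s["e"]])
    return feats
```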
We define the ground truth as a realistic target for the reconstruction algorithm, applying a selection to the generator-level hits in order to take into account their overlap. For instance, when two particles (\(p_1\) and \(p_2\)) are maximally overlapping, we merge them into a single particle in the ground truth, since disentangling two such clusters would be an impractical and imprecise task. The function \(f_\mathrm {close}(p_1,p_2)\) in Procedure 1: 2 considers two showers maximally overlapping if the following three conditions are met: their incident angles are closer than \(1.5 w_{\eta }\) in \(\eta \) and closer than \(1.5 w_{\phi }\) in \(\phi \), and the difference in their showering angles is less than 0.09. Here, \(w_{\eta }\) and \(w_{\phi }\) represent the maximum width in \(\eta \) and \(\phi \), respectively, of the first sensors hit by \(p_1\) and \(p_2\). The showering angle is taken from GEANT4 by selecting the momentum direction at the point where the particle starts showering.
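The merging criterion can be sketched as a boolean predicate. The dictionary fields are an illustrative assumption about how the incident direction and showering angle might be stored:

```python
import math

def f_close(p1, p2, w_eta, w_phi, angle_cut=0.09):
    """Sketch of the ground-truth merging criterion: two showers are
    considered maximally overlapping when their incident directions
    are closer than 1.5*w_eta in eta and 1.5*w_phi in phi, and their
    showering angles differ by less than `angle_cut`.

    p1 and p2 are dicts with incident `eta`, `phi`, and
    `shower_angle`; the field names are illustrative.
    """
    # wrap the phi difference into [-pi, pi] before comparing
    dphi = abs((p1["phi"] - p2["phi"] + math.pi) % (2.0 * math.pi) - math.pi)
    return (abs(p1["eta"] - p2["eta"]) < 1.5 * w_eta
            and dphi < 1.5 * w_phi
            and abs(p1["shower_angle"] - p2["shower_angle"]) < angle_cut)
```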
The set of true particles (including merged particles) T that have at least one hit assigned to them after the filtering process are taken as the reconstruction target (Procedure 1: 13).
As an example, the 3D view of a generated event is shown in Fig. 5 (right). Detector hits belonging to different incoming primary particles are shown in different colors.
In total, six datasets are generated. Table 1 shows the number of hits and true particles for different datasets as a proxy for event complexity.

1.
Training set: 5000 events, where each event is created from 200 minimum bias simulations and 60 Type A simulations. However, due to computational constraints, it is impractical to train with full 200-pileup events. To overcome this, we augment the data: for each pileup simulation, we randomly pick a point \(\phi _0 \sim U(0, 2\pi )\) and only select the \(P_{\mathrm {secondary}}\) with impact directions between \(\phi _0\) and \(\phi _0+30^{\circ }\). The particles originating from Type A simulations are left intact. Figure 6 shows an example event from the training set.

2.
Singleparticle testing set: 20,000 events, where each event is created from Type B simulation only.

3.
PU40+1 testing set: 6800 events, where each event is created from 40 minimum bias simulations and 1 Type B simulation.

4.
PU200+1 testing set: 6800 events, where each event is created from 200 minimum bias simulations and 1 Type B simulation.

5.
PU40+\(t{\overline{t}}\) testing set: 6800 events, where each event is created from 40 minimum bias simulations and 1 \(t{\overline{t}}\) simulation.

6.
PU200+\(t{\overline{t}}\) testing set: 6800 events, where each event is created from 200 minimum bias simulations and 1 \(t{\overline{t}}\) simulation.
The training set with 5000 events would require \(10^6\) minimum bias simulations, but we could only generate \(3.2\times 10^5\). Therefore, for each event we randomly sample from the simulation set without replacement, reusing simulations across events. This strategy ensures minimal overlap of the pileup composition between consecutive events and is used for generating all training and testing datasets.
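The \(\phi \)-window selection used to augment the training set (item 1 above) can be sketched as follows; the (phi, payload) pair format is an illustrative assumption, and the wrap-around at \(2\pi \) is handled with a modulo:

```python
import math
import random

def phi_window_filter(particles, phi0=None, window_deg=30.0):
    """Keep only the secondary particles whose impact azimuth falls in
    a `window_deg`-wide window starting at phi0, with wrap-around at
    2*pi. `particles` is a list of (phi, payload) pairs with phi in
    [0, 2*pi); if phi0 is None it is drawn uniformly at random, as in
    the augmentation described in the text.
    """
    if phi0 is None:
        phi0 = random.uniform(0.0, 2.0 * math.pi)
    width = math.radians(window_deg)
    return [(phi, payload) for phi, payload in particles
            if (phi - phi0) % (2.0 * math.pi) < width]
```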
Neural network and training
As discussed in Sect. 4, the input to the neural network is a set of hits. To create the edges needed to construct a graph, we use a dynamic graph neural network approach, GravNet [6]. GravNet dynamically computes the edges with the help of the k-nearest-neighbour algorithm, evaluated in a low-dimensional learnable coordinate space. With each application of a GravNet layer, this coordinate space can change and, as a consequence, a different set of neighbours can be assigned to each vertex. Figure 7 visualises the model architecture, which is inspired by Ref. [20] with minor modifications.
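To illustrate the idea, the neighbour search and distance-weighted aggregation at the core of a GravNet-style layer can be sketched in numpy. This is a schematic sketch only, not the actual GravNet implementation; the exp(-d^2) weighting and the averaging choice follow the spirit of Ref. [6], but the real layer is a trained TensorFlow operation:

```python
import numpy as np

def gravnet_aggregate(coords, feats, k=4):
    """Schematic GravNet-style aggregation: for every vertex, find its
    k nearest neighbours in the learned low-dimensional coordinate
    space and average their features with weights that fall off with
    distance (exp(-d^2)).

    coords: (N, S) learned coordinates; feats: (N, F) vertex features.
    """
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)  # (N, N)
    np.fill_diagonal(d2, np.inf)          # exclude self from neighbours
    nbrs = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest
    out = np.zeros_like(feats)
    for i in range(len(coords)):
        w = np.exp(-d2[i, nbrs[i]])       # distance-based weights
        out[i] = (w[:, None] * feats[nbrs[i]]).sum(0) / (w.sum() + 1e-9)
    return out
```

A production implementation would batch this over events and use an approximate or GPU-accelerated k-nearest-neighbour search rather than the dense \(N \times N\) distance matrix built here.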
Following the object condensation paradigm, the model makes a set of predictions per hit: the three-dimensional cluster-space coordinates \(x_h\), a condensation score \(\beta _h\), and other object properties, in this case an energy correction factor \(\psi _h \approx 1\). This factor corrects the total deposited calibrated energy in the calorimeter cells, evaluated using the truth information, to match the impact energy of the particle the hit belongs to. The total calibrated deposited energy left by a particle on all the sensors is defined as the deposited energy of that particle. In addition, we introduce a distance measure \(\varphi _h\), also per hit, that scales with the expected distance between one shower and hits from other showers or noise. This addition to object condensation allows us to keep the cluster coordinate space low-dimensional, and therefore interpretable, while introducing another degree of freedom to adapt distances to locally dense environments. In total, the neural network has a 6-dimensional output per hit. These quantities are further explained in Sects. 6 and 7.
The model is trained for 68 epochs using the Adam optimizer [38] on the training set described in Sect. 4, at which point the loss becomes stable and does not decrease further with additional epochs. Each training batch consists of one event. The training is performed within the DeepJetCore framework [39], and the models are implemented in TensorFlow [40] and Keras [41]. We run the training on the Flatiron GPU computing cluster using NVIDIA V100 GPUs.
Loss function
The object condensation method [8] aims at identifying, for each shower, a unique representative hit that accumulates all shower properties, referred to as the condensation point. Each hit is also embedded in a clustering space used to resolve ambiguities and to assign the remaining hits to showers. The loss consists of three terms: the potential loss (\(L_V\)) is responsible for creating the clustering space and embedding hits in it, the condensation score loss (\(L_{\beta }\)) trains the network to identify the condensation points, and the payload loss (\(L_{P}\)) creates gradients for the other object properties, in our case the energy correction factor. The relative contribution of these loss terms is set by a factor \(s_C\), which we take as \(s_C=1\) for this study.
Especially for hadronic showers, hits that are significantly displaced from the shower core are challenging to assign to their initiating incident particle. The misassignment of these hits to showers that are closer in space, but initiated by another incident particle, is relatively common and difficult to avoid. Typically, such hits carry low energy, so their correct shower association is less critical to the task of estimating the total shower properties. With this in mind, we adapt the original object condensation method to reduce the impact of the misassociation of this class of hits in the network. Mathematically, this also serves to reduce the maximum fluctuations in the gradients. In comparison to Ref. [8], we adapt the calculation of the clustering charge \(q_h\), and smoothen the potential terms. The clustering charge is calculated based on \(\beta _h\), with \(0\le \beta _h \le 1\). We rescale the calculation of \(q_h\) slightly to avoid strong gradients for \(\beta _h \rightarrow 1\) as follows:

\(q_h = {{\,\mathrm{arctanh}\,}}^2(\beta _h/1.002) + q_{\mathrm {min}}.\)
In addition, a new parameter \(\nu _h\) describing a spectator weight is introduced. Hits that are scattered far away from the shower core receive a smaller weight, in our case \(\nu _h=0.1\), while all other hits receive a weight of \(\nu _h=1\). To decide whether a given hit should be considered a spectator, we first perform a principal component analysis (PCA) on the truth-assigned, energy-weighted hit coordinates of the shower to identify the two principal components, which act as proxies for the shower axes. For this task, we define the shower axes in two dimensions only, the dimensionality being reduced by one due to the shower symmetry. The hits belonging to the shower are then projected onto the defined shower axes. Using the projected coordinates, we compute the Mahalanobis distance [42] for each hit. We consider a hit a spectator if its Mahalanobis distance is larger than 3. For the attractive and repulsive potential losses, the hit \(\alpha _t\) with the highest \(\beta \) score for each truth shower t, also taking into account the spectator weights, plays a special role. It is defined as:

\(\alpha _t = \mathop{\mathrm{arg\,max}}_{h \in H_t} \nu _h \beta _h,\)
where \(H_t\) is the set of hits belonging to truth shower t. Furthermore, the \(\beta \)-weighted average learned distance scale \({\overline{\varphi }}_{t}\) for a truth shower t is calculated as:

\({\overline{\varphi }}_{t} = \frac{\sum _{h \in H_t} \beta _h \varphi _h}{\sum _{h \in H_t} \beta _h}.\)
Taking the weighted average over the shower, as opposed to considering only the hit \(\alpha _t\), has the advantage of creating a more consistent gradient for \(\varphi _h\) to be learned for every hit. A similar approach is taken for the reference point, in clustering space, of the potentials that attract or repel other hits. Here, the reference point for each truth shower t is calculated as
This represents another modification of the original object condensation loss, which takes \(x_{\alpha _t}\) only. The new term in the sum serves to remove noise from the training, while keeping a large impact of the hit \(\alpha _t\), which helps to resolve the degeneracy in the beginning of the training. Based on these ingredients, the attractive potential loss, \(\breve{V}_h\), is then redefined as follows:
where \(w_t\) is the shower weight: \(w_t=1\) for \(E_{\mathrm {true}} > 10\) GeV, and it decreases linearly from 1 to 0 as the energy decreases from 10 to 0.5 GeV. \(\epsilon \) is a small number added for numerical stability. The repulsive loss is modified accordingly, as
The full potential loss function takes the form:
Here \(H \setminus H_t\) represents the set difference, i.e., all hits that are not assigned to shower t. The payload loss \(L_P\) is also weighted by the object weight \(w_t\) to reduce the impact of low-energy showers, that is
with \( \xi (h) = {{\,\mathrm{arctanh}\,}}^2(\beta _h/1.002)\). The energy loss contribution \(L_E\) is calculated as
where \(E_{\mathrm {dep}}\) is the total calibrated energy collected in the calorimeter cells associated with the truth shower t.
The beta loss term consists of two parts and is identical to that of Ref. [8]:

\(L_{\beta } = \frac{1}{|T|} \sum _{t \in T} (1 - \beta _{\alpha _t}) + s_B\, \frac{1}{|H_{\circ }|} \sum _{h \in H_{\circ }} \beta _h.\)
The first term ensures that at least one hit per truth shower is promoted to a condensation point. The second term suppresses noise. \(H_{\circ }\) represents the set of all noise hits. We choose the scaling factor \(s_B = 1\).
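As an illustration, the spectator-hit definition used above can be sketched in numpy; the interface is hypothetical and the actual implementation may differ in detail:

```python
import numpy as np

def spectator_weights(positions, energies, threshold=3.0):
    """Sketch of the spectator-hit definition: project the
    energy-weighted hit coordinates onto the two leading principal
    axes (the proxies for the shower axes), compute each hit's
    Mahalanobis distance in that plane, and down-weight hits beyond
    the threshold (nu = 0.1 instead of 1.0).

    positions: (N, 3) hit coordinates; energies: (N,) hit energies.
    """
    mu = np.average(positions, axis=0, weights=energies)
    centred = positions - mu
    cov = np.cov(centred.T, aweights=energies)
    # two leading principal components act as proxies for the axes
    eigvals, eigvecs = np.linalg.eigh(cov)
    axes = eigvecs[:, -2:]                      # top-2 eigenvectors
    proj = centred @ axes                       # (N, 2) projections
    pcov = np.cov(proj.T, aweights=energies)
    d2 = np.einsum("ni,ij,nj->n", proj, np.linalg.inv(pcov), proj)
    return np.where(np.sqrt(d2) > threshold, 0.1, 1.0)
```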
Inference
We also extend the inference algorithm of Ref. [8] to reflect the introduction of the local distance scale \(\varphi _h\). The algorithm is outlined in Procedure 2 and is applied in the learned clustering space. The algorithm starts with the hit with the highest \(\beta \) score, \(\beta _\alpha \), and assigns all hits within a radius \(t_d \cdot \varphi _\alpha \) to it, with \(t_d=1.0\). These hits are removed for the next iteration. This procedure is repeated until the highest remaining \(\beta \) score is lower than the threshold \(t_\beta = 0.3\). The remaining unassigned hits are considered noise. To determine the energy of a reconstructed cluster, we sum the energy of all hits assigned to the cluster collected around hit \(\alpha \) and multiply this sum by the learned energy correction factor \(\psi _\alpha \).
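The clustering loop can be sketched as follows. This is a simplified numpy sketch of the idea, without the binning optimization used to restrict the search to nearby hits:

```python
import numpy as np

def condensate(coords, betas, dists, t_beta=0.3, t_d=1.0):
    """Simplified sketch of the inference clustering: repeatedly take
    the unassigned hit with the highest beta score, open a cluster
    around it, and absorb every unassigned hit within t_d * varphi of
    it in the learned clustering space; stop when the highest
    remaining beta falls below t_beta. Hits left over are noise (-1).

    coords: (N, D) clustering-space coordinates; betas: (N,)
    condensation scores; dists: (N,) learned distance scales varphi.
    """
    labels = np.full(len(betas), -1)
    free = np.ones(len(betas), dtype=bool)
    cluster = 0
    while free.any():
        alpha = np.flatnonzero(free)[np.argmax(betas[free])]
        if betas[alpha] < t_beta:
            break                         # rest is treated as noise
        d = np.linalg.norm(coords - coords[alpha], axis=1)
        members = free & (d < t_d * dists[alpha])
        labels[members] = cluster
        free &= ~members
        cluster += 1
    return labels
```

The cluster energy would then be the sum of the member-hit energies multiplied by the correction factor \(\psi _\alpha \) predicted for the condensation point.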
In Fig. 8, we show the inference time and peak GPU memory required for a single particle in different pileup conditions. The inference times are evaluated on an NVIDIA V100 GPU. For single-particle events, inference takes about 200 ms. The inference time increases to 1.2 s and 7 s for 40 and 200 pileup, respectively. We expect a further significant improvement of the overall inference time from edge-contraction methods that reduce the cardinality of the hits. These refinements will be considered in future work. These values should be compared to the \({{\mathcal {O}}}(1000)\) s taken by currently adopted algorithms running on CPU, when scaled up to a 200 pileup environment.
Inference in 40 and 200 pileup allocates an average of only 500 MiB and 1300 MiB, respectively, on the GPU. This opens up the possibility of deploying our method on machines with less powerful GPUs and smaller VRAM. Note that a larger GPU is required for the training stage, since memory cannot be freed after executing a neural network layer, as it is needed for backpropagation-related computations.
The final inference algorithm was also adapted to consider only nearby hits using a binning approach, making its contribution to the execution time negligible.
Physics performance
We evaluate the physics performance by studying the reconstruction of individual particles and of jets. The individual particles, split into electromagnetic particles (\(e^-\) and \(\gamma \)) and hadronic particles (\(\pi ^+\)), are studied separately as they exhibit different behaviors. Reconstruction efficiency, energy response, and resolution are studied in different pileup environments, as is the rate of reconstructed clusters that are split off from the main shower (unmatched showers). For jets, we investigate the response and resolution in different pileup environments, assuming per-particle pileup removal procedures are in place.
The metrics are studied as a function of the \(p_T\) of the particles and jets. The neural network regresses only the particles’ energy; for the computation of their \(p_T\), we use energy-weighted mean hit positions to estimate the particles’ direction. For consistency, we use the same methodology to compute the truth-level \(p_T\).
Figure 9 visually shows the predictions of the neural network and compares them to the truth for both individual particle reconstruction and jet reconstruction.
Particle reconstruction performance
We begin by studying the performance in 0 pileup. These events contain only one probe truth particle and some detector noise. The probe particle is taken from Type B simulations as discussed in Sect. 4. We then overlay the probe particle with 40 and 200 pileup interactions to study performance in a more controlled fashion. While our method reconstructs all the particles in the event, including all the particles from the pileup, we only study the reconstruction performance of the probe particle.
First, we match the probe shower to one of the predicted showers by applying a hit-based matching procedure introduced in Ref. [20]. The procedure calculates the energy-weighted hit-intersection-over-hit-union score (EIOU) of a reconstructed cluster p and a truth shower \({\hat{t}}\). The predicted shower with the highest overlap is taken as the matched shower (\({\hat{p}}\)):
We apply a lower threshold of 0.5 to the EIOU score to study the reconstruction efficiency, which is shown in Figs. 10a and 11a for electromagnetic and hadronic particles, respectively.
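The matching step can be sketched as follows, representing clusters as sets of hit indices; this interface is an illustrative assumption, not the analysis code used here:

```python
def energy_iou(hits_pred, hits_true, energy):
    """Energy-weighted hit-intersection-over-hit-union (EIOU) score
    between a reconstructed cluster and a truth shower. `hits_pred`
    and `hits_true` are sets of hit indices; `energy` maps a hit
    index to its deposited energy.
    """
    inter = sum(energy[h] for h in hits_pred & hits_true)
    union = sum(energy[h] for h in hits_pred | hits_true)
    return inter / union if union > 0 else 0.0

def match_shower(truth_hits, predicted_clusters, energy, threshold=0.5):
    """Match the truth shower to the predicted cluster with the
    highest EIOU; return None when the best score is below threshold
    (an unmatched shower)."""
    best = max(predicted_clusters,
               key=lambda c: energy_iou(c, truth_hits, energy))
    return best if energy_iou(best, truth_hits, energy) >= threshold else None
```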
The efficiency rises steeply with increasing \(p_T\) in both the electromagnetic and hadronic cases. In 0 pileup, the efficiency reaches a plateau of almost one at \(p_T > 1\) GeV for electromagnetic particles, while it remains slightly lower for hadronic particles with \(p_T < 15\) GeV. As expected, because of the dense environment, the performance drops as the pileup is increased. In 40 pileup, the reconstruction efficiency for electromagnetic particles deteriorates to around \(80\%\) at 1–15 GeV, with a significant drop for \(p_T < 1\) GeV. For hadronic particles, the reconstruction efficiency drops to around \(70\%\) at 5–20 GeV.
The efficiency deterioration occurs when the neural network oversplits the showers and the split showers fail to satisfy the matching criterion. Unlike electromagnetic showers, hadronic showers have an inherent tendency to be oversplit, which is why the hadronic efficiency also drops at high \(p_T\) when pileup is added. Therefore, to study the oversplits, we use the energy-weighted intersection over minimum (EIOM), defined below:
Unmatched showers are all the predicted clusters with EIOM \(> 0.9\) with respect to the truth-level probe particle but with EIOU less than 0.5; they are shown in Fig. 12. The unmatched rate decreases steeply with the predicted \(p_T\). This indicates that low-\(p_T\) clusters are split off from higher-\(p_T\) showers while most of the energy is reconstructed properly. We note that, by adding tracking information and employing a suitable particle flow algorithm, these splits could be re-merged, increasing the efficiency. In addition to oversplitting, the neural network could in principle create fake showers from noise hits only. However, we observe that the fake rate is close to zero for \(p_T\) above 1 GeV. Additionally, only \(0.5\%\) of the total noise energy is assigned to a predicted cluster on average.
For the matched true showers, we compare the true \(p_T\) with the regressed \(p_T\) of the corresponding predicted shower. As a baseline, we define a truth-assisted \(p_T\) reconstruction that takes the true deposited energy as the reconstructed energy, which is also compared to the true shower \(p_T\). In Figs. 10b and 11b, we show how the \(p_T\) regression performance scales with the true \(p_T\) for electromagnetic and hadronic particles, respectively. To compute the response and resolution, we fit the distribution of the \(p_T\) response, \({p_{T}}_{\mathrm {pred}}/{p_{T}}_{\mathrm {true}}\), with a Gaussian function independently in each \(p_T\) bin. Figures 10c and 11c show the \(p_T\) response distributions and the fitted Gaussian functions for the first four bins. The mean (\(\mu \)) and the mean-corrected standard deviation (\(\sigma /\mu \)) of the fitted Gaussian are taken as the response and the resolution. The response can be corrected a posteriori, though it remains an important indicator of the algorithm's behavior, while the resolution directly reflects the \(p_T\) reconstruction performance.
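A minimal sketch of this per-bin Gaussian fit, here using SciPy's `curve_fit` on a toy response sample (the function names, histogram range, and binning are our assumptions, not the paper's code):

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, mu, sigma):
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def response_resolution(response, bins=40, rng=(0.5, 1.5)):
    """Fit a Gaussian to the pT response histogram of one pT bin and
    return (mu, sigma/mu) as the response and the resolution."""
    counts, edges = np.histogram(response, bins=bins, range=rng)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # seed the fit with the sample moments
    p0 = (counts.max(), np.mean(response), np.std(response))
    popt, _ = curve_fit(gaussian, centers, counts, p0=p0)
    _, mu, sigma = popt
    return mu, abs(sigma) / mu

# toy response sample for one pT bin: mean 0.97, width 0.10
rng_state = np.random.default_rng(0)
resp = rng_state.normal(0.97, 0.10, size=20000)
mu, res = response_resolution(resp)
```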
Due to the nature of hadronic showers, and because the cell energies are calibrated on electromagnetic showers, the baseline response for charged pions is below one, while it is compatible with one for electromagnetic showers, in particular at high energies. The reconstructed \(p_T\) response provided by the network differs only mildly from the baseline response. The deviation of the response from unity increases by about a factor of two for charged pions if the energy-correction factor is not applied (not shown), indicating that the network is capable of distinguishing different shower types.
As expected, the resolution improves with the true \(p_T\) and degrades with increasing pileup. Here, too, the reconstructed hadronic-shower resolution is close to the baseline and converges to \(\sim 15\%\) above 60 GeV, even in high pileup. For electromagnetic showers, the reconstructed energy resolution in 0 pileup is almost indistinguishable from the baseline and therefore close to the detector limitations. In 40 pileup, the electromagnetic resolution deviates from the baseline only at low \(p_T\) and approaches it at high \(p_T\); in 200 pileup, it deviates slightly more.
The neural network was trained on events with hundreds of true showers, as shown in Table 1; nevertheless, it generalizes very well to vastly different datasets, in particular those where only one particle is present (the 0 pileup environment). The fact that we do not observe the creation of fakes demonstrates that the neural network has correctly learned to cluster using only local information. This increases our confidence in the extrapolation capabilities of the network and the training method beyond the training conditions in general.
Jet reconstruction performance
Jets and their substructure are an integral ingredient in the analysis of particle collisions. In the forward region in particular, well-resolved individual jet constituents are crucial for successful pileup removal and for distinguishing, e.g., quark jets from gluon jets in vector-boson scattering or fusion processes. Since modern jet clustering algorithms are infrared and collinear safe, jets also offer a way to gauge the performance of the calorimeter clustering algorithm without a strong dependence on subtleties in the definition of the single-particle truth. Moreover, over-splitting and over-merging introduced by the reconstruction have less impact on cumulative jet quantities. Generally, a pileup-removal algorithm is applied before jet clustering to remove contributions from particles not associated with the primary collision. As discussed in Sect. 4, we generate \(q{\overline{q}} \rightarrow t{\overline{t}}\) events on top of either 40 or 200 minimum-bias events, as these events provide significant hadronic activity in the forward region. After the single-particle reconstruction is performed, a pileup-removal algorithm is simulated with the aid of truth information: we remove all showers that originate from pileup interactions unless they share more than 10% of their energy-weighted hits with a reconstructed non-pileup shower. All remaining reconstructed showers are considered for clustering the reconstructed jets. To form the truth jets, we only consider truth showers that stem from the non-pileup interaction. We also define a baseline reconstruction based on the true deposited energy of the incident non-pileup particles. Jets are then clustered using the anti-\(k_t\) algorithm [43] with a distance parameter of \(R=0.4\). Reconstructed and truth jets are matched based on \(\Delta R = \sqrt{\Delta \eta ^2 + \Delta \phi ^2}\). Among all jets with \(\Delta p_T / {p_T}_{\mathrm {true}} < 0.5\) and \(\Delta R < 0.3\), we select the best match by minimum \(\Delta R\).
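The matching step can be sketched as follows; the greedy one-to-one assignment and the use of the absolute relative \(p_T\) difference are our assumptions, as the text does not spell them out:

```python
import math

def match_jets(reco_jets, truth_jets, max_dr=0.3, max_rel_dpt=0.5):
    """Match each truth jet to the closest reco jet in Delta R, among reco
    jets with |pT_reco - pT_true| / pT_true < 0.5 and Delta R < 0.3.
    Jets are (pt, eta, phi) tuples; returns {truth_index: reco_index}.
    Each reco jet is used at most once (assumed one-to-one matching)."""
    def delta_r(a, b):
        deta = a[1] - b[1]
        # wrap the azimuthal difference into (-pi, pi]
        dphi = (a[2] - b[2] + math.pi) % (2 * math.pi) - math.pi
        return math.hypot(deta, dphi)

    matches, used = {}, set()
    for it, tj in enumerate(truth_jets):
        best, best_dr = None, max_dr
        for ir, rj in enumerate(reco_jets):
            if ir in used:
                continue
            if abs(rj[0] - tj[0]) / tj[0] >= max_rel_dpt:
                continue
            dr = delta_r(rj, tj)
            if dr < best_dr:
                best, best_dr = ir, dr
        if best is not None:
            matches[it] = best
            used.add(best)
    return matches

# toy example: two truth jets, two nearby reco jets
truth = [(100.0, 1.2, 0.5), (40.0, 2.0, -1.0)]
reco = [(95.0, 1.25, 0.52), (38.0, 2.02, -0.98)]
m = match_jets(reco, truth)
# each truth jet picks its nearby reco partner
```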
Following the procedure used for individual-particle reconstruction, we fit Gaussian functions to the \(p_T\) response distribution in each \(p_T\) bin. The mean (\(\mu \)) and the mean-corrected standard deviation (\(\sigma /\mu \)) of the Gaussian are taken as the jet response and resolution. The distributions and the fitted functions are shown in Fig. 13a, and the response and resolution are shown in Fig. 13b, c for 40 and 200 pileup, respectively.
The response falling below one is a direct consequence of the single-particle responses shown in Fig. 11b. At higher energies, the resolution approaches the baseline. At lower energies, the presence of pileup degrades the performance slightly, though much less than in the single-particle case. Since jets are less affected by truth matching and splitting effects, this supports the hypothesis that part of the single-particle performance deficit stems from the matching procedure and the splitting of showers. As shown in Fig. 13c, the performance degrades in 200 pileup compared to 40 pileup.
Conclusion and further research
In this paper, we presented the first demonstration that end-to-end reconstruction in high-granularity calorimeters using graph neural networks is a feasible method for event reconstruction in very dense particle-physics collisions. We perform this task in a single-step approach, in which the particle hits in the detector are taken as inputs and clustered showers, including their corrected energy, are output with no intermediate steps. Our model is built on the GravNet graph neural network and the object-condensation approach. We have evaluated the performance of our model on both single-particle events and physics events with pileup, and demonstrated promising energy resolution and response for single particles as well as for clustered jets. The method shows excellent generalization from single-particle events to dense jets in 200 pileup.
The proposed method also provides computational advantages: it can exploit GPU acceleration at inference, reducing the average inference time by at least two orders of magnitude with respect to high-pileup projections of the currently employed CPU-oriented algorithms.
Currently, our model does not perform particle identification on the clustered objects; this will be studied in future work. In addition, measurements from additional detector subsystems, such as timing or tracking information, could be included to extend the one-shot reconstruction to a full particle-flow prediction. The results presented here pave the way for such extensions and represent an initial demonstration that one-shot inference is an effective and efficient means of reconstruction in dense particle-physics collisions.
Data Availability Statement
This manuscript has no associated data or the data will not be deposited. [Authors' comment: The data used in this paper is very large, and it can be reproduced with GEANT4. We are also willing to share the GEANT4 add-on software that was used to produce the dataset with individuals upon request.]
Notes
Defines a graph as (nodes, edges)
This choice is governed by the available GPU memory, not due to restrictions of the training framework.
References
J. Albrecht et al., A roadmap for HEP software and computing R&D for the 2020s. Comput. Softw. Big Sci. 3(1), 7 (2019)
A.M. Sirunyan et al., Electron and photon reconstruction and identification with the CMS experiment at the CERN LHC. JINST 16(05), P05014 (2021)
K. Albertsson et al., Machine learning in high energy physics community white paper. J. Phys. Conf. Ser. 1085(2), 022008 (2018)
CMS Collaboration, The Phase-2 Upgrade of the CMS Endcap Calorimeter. Technical report (2017)
I. Zurbano Fernandez et al., High-Luminosity Large Hadron Collider (HL-LHC): Technical design report. 10/2020 (2020)
S.R. Qasim et al., Learning representations of irregular particle-detector geometry with distance-weighted graph networks. Eur. Phys. J. C 79(7), 608 (2019)
Y. Iiyama et al., Distance-weighted graph neural networks on FPGAs for real-time particle reconstruction in high energy physics. Front. Big Data 3, 598927 (2020)
J. Kieseler, Object condensation: one-stage grid-free multi-object reconstruction in physics detectors, graph and image data. Eur. Phys. J. C 80(9), 886 (2020)
F. Pantaleo et al., Development of a phase-II track trigger based on GPUs for the CMS experiment, in 2015 IEEE nuclear science symposium and medical imaging conference, p. 7581775 (2016)
D. Rohr et al., GPU-accelerated track reconstruction in the ALICE High Level Trigger. J. Phys. Conf. Ser. 898(3), 032030 (2017)
M. Rovere et al., CLUE: a fast parallel clustering algorithm for high granularity calorimeters in high-energy physics. Front. Big Data 3, 591315 (2020)
R. Aaij et al., Allen: a high level trigger on GPUs for LHCb. Comput. Softw. Big Sci. 4(1), 7 (2020)
A. Bocci et al., Bringing heterogeneity to the CMS software framework. EPJ Web Conf. 245, 05009 (2020)
R. Aaij et al., A comparison of CPU and GPU implementations for the LHCb Experiment Run 3 Trigger. Comput. Softw. Big Sci. 6(1), 1 (2022)
D. Rohr, Usage of GPUs in ALICE online and offline processing during LHC Run 3. EPJ Web Conf. 251, 04026 (2021)
LHCb Collaboration, LHCb Upgrade GPU High Level Trigger Technical Design Report. Technical report. CERN, Geneva (2020)
P. Buncic et al., Technical Design Report for the Upgrade of the Online-Offline Computing System. Technical report (2015)
CMS Collaboration, The Phase-2 Upgrade of the CMS Data Acquisition and High Level Trigger. Technical report. CERN, Geneva (2021)
S. Bhattacharya et al., GNN-based end-to-end reconstruction in the CMS Phase 2 high-granularity calorimeter (2022). arXiv:2203.01189
S.R. Qasim et al., Multi-particle reconstruction in the high granularity calorimeter using object condensation and graph neural networks, in EPJ Web of Conferences, vol. 251, EDP Sciences, p. 03072 (2021)
X. Ju et al., Graph neural networks for particle reconstruction in high energy physics detectors, in 33rd annual conference on neural information processing systems (2020)
N. Choma et al., Track seeding and labelling with embedded-space graph neural networks. FERMILAB-CONF-20-387-PPD-QIS-SCD (2020)
X. Ju et al., Performance of a geometric deep learning pipeline for HL-LHC particle tracking. Eur. Phys. J. C 81(10), 876 (2021)
G. DeZoort et al., Charged particle tracking via edgeclassifying interaction networks. Comput. Softw. Big Sci. 5(1), 26 (2021)
L. De Oliveira, B. Nachman, M. Paganini, Electromagnetic showers beyond shower shapes. Nucl. Instrum. Methods A 951, 162879 (2020)
D. Belayneh et al., Calorimetry with deep learning: particle simulation and reconstruction for collider physics. Eur. Phys. J. C 80(7), 688 (2020)
N. Akchurin et al., On the use of neural networks for energy reconstruction in high-granularity calorimeters. J. Instrum. 16(12), P12036 (2021)
M. Andrews et al., End-to-end physics event classification with CMS open data: applying image-based deep learning to detector data for the direct classification of collision events at the LHC. Comput. Softw. Big Sci. 4(1), 6 (2020)
M. Andrews et al., End-to-end jet classification of quarks and gluons with the CMS Open Data. Nucl. Instrum. Methods A 977, 164304 (2020)
M. Andrews et al., Accelerating end-to-end deep learning for particle reconstruction using CMS open data. EPJ Web Conf. 251, 03057 (2021)
S. Thais et al., Graph neural networks in particle physics: implementations, innovations, and challenges (2022). arXiv:2203.12852
J. Pata et al., MLPF: efficient machine-learned particle-flow reconstruction using graph neural networks. Eur. Phys. J. C 81(5), 381 (2021)
A.A. Pol et al., Jet single shot detection. EPJ Web Conf. 251, 04027 (2021)
A.A. Pol et al., Lightweight jet reconstruction and identification as an object detection task (2022)
S. Agostinelli et al., Geant4—a simulation toolkit. Nucl. Instrum. Methods Phys. Res. Sect. A 506(3), 250–303 (2003)
CMS Collaboration, The Phase-2 Upgrade of the CMS Endcap Calorimeter. Technical report. CERN, Geneva (2017)
T. Sjöstrand et al., An introduction to PYTHIA 8.2. Comput. Phys. Commun. 191, 159–177 (2015)
D.P. Kingma, J. Ba, Adam: a method for stochastic optimization (2014). arXiv:1412.6980
J. Kieseler et al., DeepJetCore (2020)
M. Abadi et al., TensorFlow: large-scale machine learning on heterogeneous systems (2015). https://www.tensorflow.org/
F. Chollet et al., Keras (2015). https://keras.io
P.C. Mahalanobis, On the generalized distance in statistics. National Institute of Science of India (1936)
M. Cacciari, G.P. Salam, G. Soyez, The anti-\(k_t\) jet clustering algorithm. J. High Energy Phys. 2008(04), 063 (2008)
Acknowledgements
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 772369). We thank Ian Fisk and the Flatiron Institute of the Simons Foundation for providing access to the GPU cluster used to perform these studies.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Funded by SCOAP\(^3\). SCOAP\(^3\) supports the goals of the International Year of Basic Sciences for Sustainable Development.
Cite this article
Qasim, S.R., Chernyavskaya, N., Kieseler, J. et al. End-to-end multi-particle reconstruction in high occupancy imaging calorimeters with graph neural networks. Eur. Phys. J. C 82, 753 (2022). https://doi.org/10.1140/epjc/s10052-022-10665-7