Machine learning-based approaches have proven to be powerful even in the context of complex hadronic showers, e.g. when assigning hit energy fractions to a known number of showers [33], when discriminating between neutral and charged energy deposits [51], or when separating noise from the real shower deposits [38]. Moreover, these approaches show excellent software compensation capabilities [52,53,54] for single particles. In this section, it is shown that the object condensation method can also be used to train similar deep neural networks to reconstruct an unknown number of particles directly, using inputs from different detector subsystems. The object condensation approach is compared to a baseline PF algorithm inspired by Ref. [21] with respect to the correct reconstruction of individual particles and cumulative quantities, such as the jet momentum.
Since software compensation has already been shown to be achievable with deep neural networks, the focus here is on the correct identification of individual particles. Therefore, the comparison between the methods is based solely on photons and electrons, i.e. electromagnetic showers and the corresponding tracks. This simplification also mirrors the ideal assumptions of the baseline PF algorithm.
Detector and data set
The data set used in this paper is based on a calorimeter and a simplified tracker, built in GEANT4 [55] and shown in Fig. 3. Since this study is based on electromagnetic objects, the calorimeter only comprises an electromagnetic layer with properties similar to the CMS barrel calorimeter [21, 56]: it is made of a grid of \(16 \times 16\) lead tungstate crystals, each covering an area of \(22 \times 22\, \mathrm{mm}^2\) in x and y and with a length of 23.2 cm in z, corresponding to 26 radiation lengths. The front face of the calorimeter is placed at \(\mathrm{z}=0\). The tracker is approximated by one layer of \(300\, \upmu \mathrm{m}\) silicon sensors placed 50 mm in front of the calorimeter with a total size of \(35.2 \times 35.2\, \mathrm{cm^2}\). With \(64\times 64\) sensors, the tracker granularity is 4 times finer than the calorimeter granularity in each direction.
Electrons and photons are generated at \(\mathrm{z}=-10\, \mathrm{cm}\) with momenta between 1 and 200 GeV in the z direction. Their position in x and y is randomly chosen following a uniform distribution and constraining x and y to be between -14 and 14 cm, such that the showers of the particles are fully contained in the calorimeter.
The track momentum \(p_\text {track}\) is determined by smearing the true particle momentum p(t) with an assumed Gaussian track resolution \(\sigma _T\) of:
$$\begin{aligned} \frac{\sigma _T}{p_\text {track}} = 0.04 \left( \frac{p(t)}{100 \, \mathrm{GeV}} \right) ^2+0.01 \text {,} \end{aligned}$$
(12)
and the track position is inferred from the position of the highest-energy hit belonging to each particle in the tracker layer. For the calorimeter, the simulated deposited energy is recorded, resulting in a resolution \(\sigma _C\) of [21, 56]:
$$\begin{aligned} \frac{\sigma _C}{E} = \frac{2.8\%}{\sqrt{E/\text {GeV}}} \oplus \frac{12\%}{E/\text {GeV}} \oplus 0.3\% \text {.} \end{aligned}$$
(13)
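As an illustration of how these two parametrisations enter the study, the following sketch applies the Gaussian track smearing of Eq. (12) and evaluates the expected calorimeter resolution of Eq. (13). The use of numpy and the function names are illustrative; the track resolution is evaluated at the true momentum for simplicity, and the calorimeter resolution is not applied as a smearing, since in the study it emerges from the simulation itself.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def smear_track_momentum(p_true):
    """Track smearing following Eq. (12), evaluated at the true momentum:
    sigma_T / p = 0.04 * (p / 100 GeV)^2 + 0.01."""
    rel_sigma = 0.04 * (p_true / 100.0) ** 2 + 0.01
    return rng.normal(loc=p_true, scale=rel_sigma * p_true)

def calo_relative_resolution(e_true):
    """Expected relative calorimeter resolution of Eq. (13), terms added in quadrature."""
    stochastic = 0.028 / np.sqrt(e_true)   # 2.8% / sqrt(E/GeV)
    noise = 0.12 / e_true                  # 12% / (E/GeV)
    constant = 0.003                       # 0.3%
    return np.sqrt(stochastic**2 + noise**2 + constant**2)

# example: a 10 GeV particle
print(smear_track_momentum(10.0), 10.0 * calo_relative_resolution(10.0))
```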
Since multiple particles are considered in each event, two or more particles might be generated with a distance that is not resolvable given the detector granularity. Here, a resolvable particle is defined as a particle that has the highest energy fraction in at least one of the calorimeter cells or tracker sensors. If a particle is not resolvable, it is removed, which has the same effect as merging the two particles into one. The only difference between the two approaches is that, when removing the overlapping particle, the maximum energy per particle stays within the considered range between 1 and 200 GeV, which provides a better-controlled environment for this study.
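A minimal sketch of this resolvability criterion, assuming the true per-sensor deposits are available as a matrix (an illustrative representation, not part of the original study):

```python
import numpy as np

def resolvable_mask(deposits):
    """deposits: array of shape (n_sensors, n_particles) with the true energy
    left by each particle in each calorimeter cell or tracker sensor.
    A particle is resolvable if it has the highest energy fraction in at
    least one sensor."""
    leading = np.argmax(deposits, axis=1)      # leading particle per sensor
    has_energy = deposits.sum(axis=1) > 0      # ignore empty sensors
    n_particles = deposits.shape[1]
    return np.isin(np.arange(n_particles), leading[has_energy])
```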
Baseline particle flow approach
The baseline PF algorithm used here closely follows Ref. [21], and the energy thresholds are identical. However, given the ideal tracking in this study, there are no fake or wrongly reconstructed tracks, and there are no bremsstrahlung effects in the absence of a magnetic field. Therefore, electron and photon showers in the calorimeter can be treated on the same footing. This simplified PF algorithm consists of four steps: seeding of calorimeter clusters, finding calorimeter clusters, linking of tracks and clusters, and finally creating PF candidates. Each of these steps is detailed in the following, together with small adjustments made with respect to Ref. [21] that improve the performance on the studied data set.
Seeds for calorimeter clusters are built from cells that contain a deposit above 230 MeV. Such a cell is promoted to a seed if all 8 adjacent cells have lower energy. In addition, any cell with a track within the cell area is considered a seed.
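The seeding step could be sketched as follows, using scipy's maximum_filter for the 8-neighbour local-maximum check; ties between neighbouring cells are ignored in this simplification, and all names are illustrative.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def find_seeds(calo_grid, track_cells, threshold=0.230):
    """calo_grid: 16x16 array of deposited energies in GeV.
    track_cells: 16x16 boolean mask of cells with a track pointing at them."""
    # local maximum with respect to the 8 adjacent cells (and itself)
    local_max = calo_grid >= maximum_filter(calo_grid, size=3, mode='constant')
    seeds = (calo_grid > threshold) & local_max
    # any cell crossed by a track is also considered a seed
    return seeds | track_cells
```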
Each seed can lead to a calorimeter cluster. The clusters are determined with the same iterative analytic likelihood maximisation as detailed in Ref. [21]. Only energy deposits above 80 MeV are considered for the clustering. The cluster position and energy are determined simultaneously for all clusters, assuming a Gaussian energy distribution in x and y for each cluster with a width of 15 mm. The iterative procedure is repeated until the maximum change in position from one iteration to the next is below 0.2 mm. The clustered energy does not correspond directly to the true energy, in particular at lower energies. This bias is corrected by deriving correction factors in steps of 1 GeV using 100,000 single-photon events, calibrating the clustering response to unity.
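The iterative procedure can be illustrated with the following simplified, expectation-maximisation-style sketch; the actual algorithm of Ref. [21] maximises an analytic likelihood, so this is only an approximation with illustrative names and no claim to match the reference implementation.

```python
import numpy as np

def cluster_calo(cell_xy, cell_e, seed_xy, width=15.0, e_min=0.08, tol=0.2):
    """cell_xy: (n_cells, 2) cell centres in mm; cell_e: (n_cells,) deposited
    energies in GeV; seed_xy: (n_clusters, 2) seed positions as starting values."""
    keep = cell_e > e_min                      # only deposits above 80 MeV
    xy, e = cell_xy[keep], cell_e[keep]
    mu = seed_xy.astype(float)                 # cluster positions
    amp = np.ones(len(mu))                     # cluster energies
    while True:
        # expected Gaussian contribution of each cluster to each cell
        d2 = np.sum((xy[:, None, :] - mu[None, :, :]) ** 2, axis=-1)
        w = amp[None, :] * np.exp(-0.5 * d2 / width**2)
        frac = w / np.clip(w.sum(axis=1, keepdims=True), 1e-9, None)
        # update cluster energies and positions from the assigned energy fractions
        amp = (frac * e[:, None]).sum(axis=0)
        new_mu = ((frac * e[:, None])[:, :, None] * xy[:, None, :]).sum(axis=0)
        new_mu /= np.clip(amp[:, None], 1e-9, None)
        # stop once the largest position change is below 0.2 mm
        if np.max(np.linalg.norm(new_mu - mu, axis=1)) < tol:
            return amp, new_mu
        mu = new_mu
```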
The linking step is different with respect to Ref. [21]. Since each track in this data set corresponds to a truth particle, and each track leaves a calorimeter deposit, the linking is performed starting from the tracks. Each track is linked to the calorimeter cluster that is closest in the (x,y) plane if the distance is not larger than the calorimeter cell size. This way, more than one track can be linked to one calorimeter cluster. This ambiguity is resolved when building the PF candidates.
The PF candidates are reconstructed from calorimeter clusters linked to tracks. If no track is linked to the cluster, a photon is built. If a track is linked to the cluster and the track momentum and the calibrated cluster energy are compatible within one sigma (\(\sigma _T \oplus \sigma _C\)), the track momentum and cluster energy are combined using a weighted mean, and the particle position is determined from a weighted mean of track and cluster position. In the case where the cluster energy exceeds the track momentum significantly, a candidate is built using the track information only, and the track momentum is subtracted from the cluster energy. The remaining energy produces a photon if there are no more tracks linked to the cluster and its energy exceeds 500 MeV. In case of additional linked tracks, this procedure is repeated until either no cluster energy is left or a final photon is created.
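For a single track linked to a single cluster, the candidate-building logic could be sketched as follows; the inverse-variance weighting of the combination and the omission of the position combination and of additional linked tracks are simplifications made for this illustration.

```python
import numpy as np

def build_candidate(p_track, sigma_t, e_cluster, sigma_c):
    """Combine one track with its linked, calibrated calorimeter cluster."""
    if abs(e_cluster - p_track) < np.hypot(sigma_t, sigma_c):
        # compatible within one sigma: inverse-variance weighted mean
        w_t, w_c = 1.0 / sigma_t**2, 1.0 / sigma_c**2
        return {'electron': (w_t * p_track + w_c * e_cluster) / (w_t + w_c),
                'photon': None}
    # cluster energy significantly above the track momentum: the electron is
    # built from the track and the excess may become a photon above 500 MeV
    excess = e_cluster - p_track
    return {'electron': p_track, 'photon': excess if excess > 0.5 else None}
```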
Neural network model and training
For the object condensation approach, each cell or tracker sensor is assigned to exactly one truth particle or labelled as noise. The sensor is assigned to the truth particle that leaves the largest amount of energy in it. If the energy deposit in a cell or tracker sensor is smaller than 5% of the total true energy deposit of that particle in the subdetector, the sensor hit is labelled as noise. The 200 highest-energy hits are interpreted as vertices in a graph; consequently, a graph neural network is chosen to predict the momentum and position of each particle alongside the object condensation parameters. After one batch normalisation layer, applied directly to the inputs, which are the energy and position information of each vertex, the neural network architecture consists of six subsequent blocks. In each block, the mean of all features is concatenated to the block input, followed by two dense layers, one batch normalisation layer and another dense layer. The dense layers have 64 nodes each and use ELU activation functions. The output of the dense layers is fed through one GravNet [33] layer. This layer is configured to project the input features to four latent space dimensions and 64 features, which are propagated from the ten nearest neighbour vertices in the latent space. After aggregation, 128 output filters are applied. This output is passed on to the next block and simultaneously compressed by one dense layer with 32 nodes and ELU activation before it is added to a list of all block outputs. After six blocks, this list, now with 192 features per vertex, is processed by one dense layer with 64 nodes and ELU activation before the final neural network outputs are predicted.
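A simplified sketch of a GravNet-style layer in TensorFlow/Keras is shown below. It is not the reference implementation of Ref. [33]: input features are projected to a four-dimensional latent space and to 64 propagated features, which are exchanged among the ten nearest neighbours in that latent space with a distance-based weighting before 128 output filters are applied. All class, layer and variable names are illustrative.

```python
import tensorflow as tf

class SimpleGravNet(tf.keras.layers.Layer):
    """Simplified GravNet-style layer: learned latent coordinates,
    k-nearest-neighbour feature exchange, distance-weighted aggregation."""

    def __init__(self, n_latent=4, n_propagate=64, n_filters=128, k=10, **kwargs):
        super().__init__(**kwargs)
        self.coords = tf.keras.layers.Dense(n_latent)
        self.features = tf.keras.layers.Dense(n_propagate)
        self.output_dense = tf.keras.layers.Dense(n_filters, activation='elu')
        self.k = k

    def call(self, x):
        # x: (batch, n_vertices, n_input_features)
        s = self.coords(x)                      # latent space coordinates
        f = self.features(x)                    # features to be propagated
        # pairwise squared distances in the latent space
        d2 = tf.reduce_sum((s[:, :, None, :] - s[:, None, :, :]) ** 2, axis=-1)
        # k nearest neighbours (top_k of the negative distances, includes self)
        neg_d2, idx = tf.math.top_k(-d2, k=self.k)
        weights = tf.exp(neg_d2)[..., None]     # Gaussian-like distance weighting
        neighbours = tf.gather(f, idx, axis=1, batch_dims=1)
        weighted = neighbours * weights
        # aggregate with mean and max, then apply the output filters
        aggregated = tf.concat([tf.reduce_mean(weighted, axis=2),
                                tf.reduce_max(weighted, axis=2)], axis=-1)
        return self.output_dense(tf.concat([x, aggregated], axis=-1))

# example: out = SimpleGravNet()(tf.random.normal((1, 200, 64)))
```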
For training this model, the object condensation loss is used as described in Sect. 2. The minimum charge for clustering is set to \(q_{\text {min}}=0.1\). Instead of predicting the momentum directly, a correction \(c_{E,i}\) with respect to the reconstructed energy \(E_i\) assigned to the vertex is learnt by minimising
$$\begin{aligned} L_{E,i} = \left( \frac{c_{E,i} E_i - E_i(t) }{E_i(t)} \right) ^2 \text {.} \end{aligned}$$
(14)
Here, \(E_i(t)\) corresponds to the true energy assigned to vertex i. For the particle position, an offset with respect to the vertex position is predicted in units of mm and trained using a standard mean-squared error loss \(L_{x,i}\) per vertex i.
To determine the final loss L, the individual terms are weighted as:
$$\begin{aligned} L = L_\beta + L_V + 20 \cdot L_E + 0.01 \cdot L_x \text {,} \end{aligned}$$
(15)
where \({L}_E\) and \({L}_x\) are the \(\beta _i\) weighted sums of the loss terms \({L}_{E,i}\) and \({L}_{x,i}\) following Eq. 8.
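As an illustration, the payload terms could be combined as follows; the \(\beta \)-weighted average is assumed here to be normalised by the sum of the weights, and \(L_\beta \) and \(L_V\) are taken as given from Sect. 2. Function and variable names are illustrative.

```python
import tensorflow as tf

def payload_losses(beta, e_reco, c_e_pred, e_true, xy_pred, xy_true):
    """beta: per-vertex condensation weights; e_reco: reconstructed energies
    assigned to the vertices; c_e_pred: predicted energy corrections;
    xy_pred / xy_true: predicted position offsets and truth positions (mm)."""
    l_e_i = ((c_e_pred * e_reco - e_true) / e_true) ** 2          # Eq. (14)
    l_x_i = tf.reduce_mean((xy_pred - xy_true) ** 2, axis=-1)     # per-vertex MSE
    norm = tf.reduce_sum(beta)
    return (tf.reduce_sum(beta * l_e_i) / norm,
            tf.reduce_sum(beta * l_x_i) / norm)

def total_loss(l_beta, l_v, l_e, l_x):
    return l_beta + l_v + 20.0 * l_e + 0.01 * l_x                 # Eq. (15)
```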
The data set for training contains 1–9 particles per event, out of which 50% are electrons and 50% are photons. In total, 1.5 million events are used for training and 250,000 for validation. The model is trained with TensorFlow, Keras, and the DeepJetCore framework for 20 epochs with a learning rate of \(3\times 10^{-4}\) and for 90 epochs with a learning rate of \(3\times 10^{-5}\) using the Adam optimiser. The performance is evaluated on a statistically independent test sample described in the next section. The condensation thresholds are set to \(t_\beta = 0.1\) and \(t_d=0.8\).
Results
The performance of the baseline PF algorithm and the object condensation method is evaluated with respect to single-particle quantities and cumulative quantities. For the single-particle performance, the reconstructed particles need to be matched to their generated counterparts. For object condensation, this matching is performed by evaluating the truth information associated with the chosen condensation point. While in principle the network could have chosen different points to represent the object properties, the performance suggests that in most cases this matching is successful. For the baseline PF algorithm, electrons can be matched unambiguously using the truth particle associated with the electron track. The matched electrons and the corresponding truth particles are removed before the photons are matched in a second step. A more sophisticated matching of truth photons to reconstructed photons is required, since the direct connection between energy deposits in cells and the clusters is lost due to the simultaneous likelihood maximisation used to construct the electromagnetic clusters in the baseline PF algorithm, which yields only energies and positions. Therefore, a reconstructed photon is matched to one of the remaining truth photons within a distance of 3 calorimeter cells if it satisfies \(| p(t) - p(r)|/p(t) < 0.9\), with p(t) being the true momentum and p(r) the reconstructed momentum. In case more than one reconstructed candidate satisfies these requirements, the one with the smallest distance parameter d is chosen, with d defined as:
$$\begin{aligned} d = \Delta x ^2 + \Delta y^2 + \left[ \frac{22}{0.05} \left( \frac{p(r)}{p(t)} -1 \right) \right] ^2 \text {.} \end{aligned}$$
(16)
Here, \(\Delta x \) and \(\Delta y\) are the differences between the truth and reconstructed positions in x and y, respectively. The factor 22/0.05 scales the momentum compatibility term such that a 5% momentum difference corresponds to a distance of one calorimeter cell. The matching is not strongly affected by small changes in the relative weight of the terms; nevertheless, other values were studied and found to lead to worse results for the baseline PF algorithm.
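For completeness, the distance parameter of Eq. (16) in a few lines, with positions in mm and momenta in GeV; the function name is illustrative.

```python
def match_distance(dx, dy, p_reco, p_true, cell_size=22.0):
    """Eq. (16): a 5% momentum difference weighs as much as a displacement
    of one calorimeter cell (22 mm)."""
    return dx**2 + dy**2 + (cell_size / 0.05 * (p_reco / p_true - 1.0)) ** 2
```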
Individual particle properties are evaluated on a test data set containing 100,000 particles, distributed into events such that, for each particle, the number of additional particles in the same event is uniformly distributed between 0 and 14. Otherwise, the particles are generated in the same way as for the training data set.
The efficiency is defined as the fraction of generated truth particles that are reconstructed and truth matched. The fake rate is, conversely, defined as the fraction of reconstructed particles that have no truth particle assigned to them. Both quantities are shown as a function of the particle momentum in Fig. 4.
Particularly for higher particle densities per event, the object condensation method shows a higher efficiency than the baseline PF algorithm. The fake rate is also several orders of magnitude lower for the condensation approach, which produces only a small fraction of fakes at very low momenta. For the baseline PF algorithm, having some fakes is intentional, since they ensure local energy conservation in case of wrongly linked tracks and calorimeter clusters.
For each reconstructed and truth-matched particle, the energy response is also studied. As shown in Fig. 5, the momentum resolution for individual particles is strongly improved when using object condensation paired with a GravNet-based neural network. While the response is comparable for a small number of particles per event, it decreases rapidly for the baseline PF algorithm at higher particle densities.
One of the known strengths of the baseline PF algorithm is its built-in energy conservation, which typically leads to very good performance for cumulative quantities, such as the momentum of a whole jet. At the same time, the fact that individual PF candidates are built allows removing those charged particles that are not associated with the primary event vertex, thereby reducing the impact of additional interactions per bunch crossing (pileup). The performance of the object condensation approach and the baseline PF algorithm in such environments is studied using a sample of jet proxies. These jet proxies (referred to as jets in the following) contain only electrons and photons, but have jet-like properties as far as the number of particles and the momentum of the jet constituents are concerned. The jets are generated by randomly picking electrons and photons from an exponentially falling momentum spectrum following \(\exp \left( - \ln (300) \cdot p(t)/\text {GeV} \right) \), with the additional constraint of \(1\ \text {GeV}< p(t) < 200\ \text {GeV}\). For each jet, an integer value between 1 and 15 is chosen, which serves as the expectation value of a Poisson distribution from which the number of particles in the jet is drawn. This results in jets with momenta ranging from about 1 GeV up to about 300 GeV. For fixed jet momenta, the constituents follow an exponentially falling momentum spectrum and their number is Poisson distributed.
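A possible way to generate such jet proxies is sketched below; the uniform choice of the Poisson mean, the 50/50 electron-photon mix and the inverse-transform sampling of the truncated exponential spectrum are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def generate_jet_proxy(p_min=1.0, p_max=200.0):
    """Returns the constituent momenta (GeV) and electron flags of one jet proxy."""
    mean_n = rng.integers(1, 16)            # Poisson mean, chosen between 1 and 15
    n = max(1, rng.poisson(mean_n))         # number of constituents
    # inverse-transform sampling of exp(-ln(300) * p / GeV), truncated to [1, 200] GeV
    lam = np.log(300.0)
    u = rng.random(n)
    p = -np.log(np.exp(-lam * p_min)
                - u * (np.exp(-lam * p_min) - np.exp(-lam * p_max))) / lam
    is_electron = rng.random(n) < 0.5       # electrons vs photons (assumed 50/50)
    return p, is_electron
```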
Within this jet sample, particle multiplicities can be as high as 22 per event, while the training sample contains at most nine particles per event. In a realistic environment, it is very likely that some events do not correspond to the configuration that was used for training. Therefore, the ability of a neural network to extrapolate to such regimes is crucial and strongly influenced by the training method. As shown in Fig. 6, the reconstruction efficiency with GravNet and object condensation extends smoothly well beyond nine particles per event, which is similarly true for other predicted quantities.
The apparent increase in efficiency for the baseline PF algorithm at higher particle multiplicities is likely caused by truth matching criteria that are not stringent enough to avoid mismatching in the presence of many close-by particles. However, inclusive quantities such as the total jet momentum are not affected by this truth matching. To simulate the effect of pileup on the jet momentum, a fraction of charged particles is removed from each jet (referred to as the PU fraction in the following). Large PU fractions are realistic in the upcoming runs of the Large Hadron Collider. The same particles are removed when determining the true jet momentum and the reconstructed jet momentum. Since the truth matching of electrons through their track is unambiguous, this procedure does not introduce a bias to the comparison. The true jet momentum \(p_j(t)\) is compared to the reconstructed jet momentum \(p_j(r)\) for well reconstructed jets only. Here, well reconstructed refers to jets fulfilling \(|p_j(r)-p_j(t)|/p_j(t)<0.5\); the remaining jets are labelled as mis-reconstructed. As shown in Fig. 7, the fraction of mis-reconstructed jets increases with larger PU fractions, in particular at low \(p_j(t)\), but remains small for the object condensation approach throughout the spectrum, even at a PU fraction of 0.8. Within the sample of well reconstructed jets, the response mean is comparable for object condensation and the baseline PF algorithm at low PU fractions and high momenta; however, the differences increase in favour of the object condensation approach for larger PU fractions and lower jet momenta.
While this bias can be corrected a posteriori, the most important metric is the width of the jet momentum resolution distribution, determined here as the square root of the variance over all well reconstructed jets. For zero PU fraction, the built-in energy conservation of the baseline PF algorithm provides the best performance for reasonably high jet momenta and outperforms the object condensation approach. However, once the PU fraction is increased, the identification and correct reconstruction of each individual particle becomes increasingly important, and the object condensation approach in combination with the GravNet-based neural network outperforms the baseline PF algorithm significantly.
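The jet-level procedure could be sketched as follows: the same randomly chosen charged constituents are removed from truth and reconstruction, the response is computed per jet, and the resolution is quoted as the square root of the variance of the response over well reconstructed jets. Variable names are illustrative and a per-constituent matching is assumed to be available.

```python
import numpy as np

def jet_response(p_true, p_reco, is_charged, pu_fraction, rng):
    """p_true / p_reco: per-constituent true and reconstructed momenta of one jet;
    is_charged: boolean mask of charged (track-matched) constituents."""
    # remove the same charged constituents from truth and reconstruction
    keep = ~(is_charged & (rng.random(len(is_charged)) < pu_fraction))
    pt, pr = p_true[keep].sum(), p_reco[keep].sum()
    well_reconstructed = abs(pr - pt) / pt < 0.5
    return pr / pt, well_reconstructed

# over a sample of jets, the resolution is the width of the response distribution
# for well reconstructed jets: width = np.std([r for r, ok in per_jet_results if ok])
```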
The performance difference for single particles and at high PU fractions is particularly noteworthy, since the detector configuration and the selection of only electromagnetic objects in principle reflect the idealised assumptions made in the baseline PF algorithm. Therefore, more realistic and complex environments, such as those in a real particle physics experiment, are likely to increase the discrepancies between the methods in favour of machine-learning-based approaches.