Photon Reconstruction in the Belle II Calorimeter Using Graph Neural Networks

We present a study of a fuzzy clustering algorithm for the Belle II electromagnetic calorimeter using Graph Neural Networks. We use a realistic detector simulation including simulated beam backgrounds and focus on the reconstruction of both isolated and overlapping photons. Compared to the currently used reconstruction algorithm, we find significant improvements in the energy resolution for both isolated and overlapping photons, exceeding 30% for photons with energies E < 0.5 GeV and high levels of beam backgrounds. Overall, the GNN reconstruction improves the resolution and reduces the tails of the reconstructed energy distribution and is therefore a promising option for the upcoming high-luminosity running of Belle II.


Introduction
The Belle II experiment is located at the high-intensity, asymmetric electron-positron collider SuperKEKB in Tsukuba, Japan. SuperKEKB collides 4 GeV positron and 7 GeV electron beams at a center-of-mass energy of around 10.58 GeV to search for rare meson decays and new physics phenomena. Many of these decays include photons in the final state that are reconstructed exclusively in the electromagnetic calorimeter. The experimental program of Belle II targets a significantly increased instantaneous luminosity that will ultimately exceed that of the predecessor experiment by a factor of 30. This increase in luminosity also leads to a significant increase in beam-induced backgrounds [1]. These background processes produce both high-energy particle interactions that could be misidentified as physics signals and energy depositions of low-energy particles that degrade the energy resolution of the electromagnetic crystal calorimeter. The electronic signals from the calorimeter are interpreted during a process called reconstruction to determine the properties of the particles that created the signals.
In this paper, we describe a fuzzy clustering algorithm based on Graph Neural Networks (GNNs) to reconstruct photons. The term fuzzy clustering [2] refers to the partial assignment of individual calorimeter crystals to several clustering classes. In our case, these classes are the different, potentially overlapping signal photons and a beam background class.
The paper is organized as follows: Section 2 gives an overview of related work on machine learning for calorimeter reconstruction. Section 3 describes the Belle II electromagnetic calorimeter. The event simulation and details of the beam background simulation are discussed in Section 4. The conventional Belle II reconstruction algorithm and the new GNN algorithm are described in Section 5. We introduce the metrics used to measure the performance of the GNN algorithm in Section 6. The main performance studies and results are discussed in Section 7. We summarize our results in Section 8.

Related work
Machine learning is widely used in high energy physics for the reconstruction of calorimeter signals, for clustering [3,4] and energy regression [5,6] as well as for particle identification [7,8] and fast simulation [9,10,11]. Most of the recent work has been performed in the context of the high-granularity calorimeter (HGCAL) at CMS [12,13]. For Belle II, the use of machine learning with the electromagnetic calorimeter is so far limited to image-based particle identification in the barrel [8,14].
GNNs are now widely recognized as one possible solution for irregular geometries in high energy physics [15,16,17]. GNN architectures that are able to learn a latent space representation of the detector geometry itself [18,19] are the basis of the work presented in this paper.
Previous work has focused on simplified and idealized detector geometries, often approximated as a regular grid of readout cells expressed as 2D or 3D images. Additionally, geometry changes and overlaps between barrel and endcap regions, large variations of cell sizes, and very high, spatially non-uniform noise levels induced by beam background energy depositions are neglected.
For a complete list of works in particle physics that utilize machine learning, we refer to the review [20].

The Belle II Electromagnetic Calorimeter
The Belle II detector consists of several subdetectors arranged around the beam pipe in a cylindrical structure that is described in detail in Refs. [21,22]. We define the z axis of the laboratory frame as the central axis of the solenoid. The positive direction points in the direction of the electron beam. The x axis is horizontal and points away from the accelerator center, while the y axis is vertical and points upwards. The longitudinal direction, the transverse plane with azimuthal angle ϕ, and the polar angle θ are defined with respect to the detector's solenoidal axis.
The Belle II electromagnetic calorimeter (ECL) consists of 8736 Thallium-doped CsI (CsI(Tl)) crystals that are grouped in a forward endcap, covering a polar angle 12.4° < θ < 31.4°, a barrel, covering a polar angle 32.2° < θ < 128.7°, and a backward endcap, covering a polar angle 130.7° < θ < 155.1°. The crystals have a trapezoidal geometry with a nominal cross-sectional area of approximately 6 × 6 cm² and a length of 30 cm, providing 16.1 radiation lengths of material. While crystals in the barrel are similar in cross-section and shape, the crystals in the endcaps vary, with masses between 4.03 kg and 5.94 kg [23]; the endcap crystals also have significantly more passive material in front of them. Each crystal is aligned in the direction of the collision point with a small tilt in polar angle θ to reduce detection inefficiencies from particles passing between two crystals. Crystals in the barrel additionally have a small tilt in azimuthal angle ϕ. The scintillation light produced in the CsI(Tl) crystals is read out by two photodiodes glued to the back of each crystal. After shaping electronics, the waveform is digitized, and the crystal energy $E^{\mathrm{crystal}}_{\mathrm{rec}}$ over baseline and the time $t^{\mathrm{crystal}}_{\mathrm{rec}}$ of the energy deposition since trigger time are reconstructed online using FPGAs [24]. Waveforms of crystals with energy depositions above 50 MeV are stored for offline processing to allow for electromagnetic vs. hadronic shower identification through pulse shape discrimination (PSD) [25]. The information available from PSD is
• the fit type ID of a multi-template fit, indicating which of the possible templates provides the best goodness-of-fit,
• the respective χ² value as an indicator of the goodness-of-fit,
• and the ratio of reconstructed hadronic and photon template energies, referred to as PSD hadronic energy ratio in the following.

Data Set
In this work, we use simulated events to train and evaluate the reconstruction algorithms. The detector geometry and interactions of final-state particles with detector materials are simulated using Geant4 [26] combined with a dedicated detector response simulation. Simulated events are reconstructed and analyzed using the Belle II Analysis Software Framework (basf2) [27,28]. We simulate isolated photons with energy 0.1 < $E_{\mathrm{gen}}$ < 1.5 GeV and direction 17° < $\theta_{\mathrm{gen}}$ < 150° and 0° < $\phi_{\mathrm{gen}}$ < 360°, drawn randomly from independent uniform distributions in E, θ, and ϕ. The generation vertex of the photons is x = 0, y = 0, and z = 0. For events with two overlapping photons, we first draw one photon randomly from independent uniform distributions as outlined above. We then simulate a second photon with an angular separation 2.9° < Δα < 9.7°, drawn randomly from uniform distributions in Δα and in E. This angular separation covers approximately the distance needed to create two overlapping clusters. These two cases are typical calorimeter signatures in Belle II that describe the majority of photons. We note that the reconstruction of hadrons is a more difficult task not yet covered by our algorithm.
As part of the simulation, we overlay simulated beam background events corresponding to different collision conditions onto our signal particles [1,29]. The simulated beam backgrounds correspond to an instantaneous luminosity of $\mathcal{L}_{\mathrm{beam}} = 1.06 \times 10^{34}\,\mathrm{cm^{-2}\,s^{-1}}$ (called low beam background) and $\mathcal{L}_{\mathrm{beam}} = 8 \times 10^{35}\,\mathrm{cm^{-2}\,s^{-1}}$ (called high beam background). These two values approximately correspond to the conditions in 2021 and to the expected conditions slightly above the design luminosity, respectively. The spatial distribution of beam backgrounds is asymmetric: they are much higher in the backward endcap than in the forward endcap, and slightly higher in the barrel than in the forward endcap. Additional electronics noise per crystal of about 0.35 MeV is included in our simulation as well.
The supervised training and the performance evaluation both use labeled information that relies on matching reconstructed information with the simulated truth information. For each of the four configurations, isolated and overlapping photons with low and high beam backgrounds, we use 1.8 million events for training and 200 000 events for validation. The performance evaluation is carried out on a large number of statistically independent samples simulated with various energies and in different detector regions.
We then study the performance of the GNN clustering algorithm in all four scenarios and compare it to the baseline basf2 reconstruction. Both reconstruction algorithms are described in detail in Sec. 5.
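The photon sampling described above can be illustrated with a minimal sketch. It is not the basf2 particle gun: the function names are hypothetical, and the uniform rotation angle psi of the second photon around the direction of the first is an assumption, since the text only specifies the distributions of the opening angle and the energies.

```python
import math
import random

def sample_photon(rng):
    """Draw E (GeV), theta and phi (degrees) from independent uniform distributions."""
    return rng.uniform(0.1, 1.5), rng.uniform(17.0, 150.0), rng.uniform(0.0, 360.0)

def unit_vector(theta_deg, phi_deg):
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    return (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))

def second_direction(theta_deg, phi_deg, delta_alpha_deg, psi_deg):
    """Unit vector at opening angle delta_alpha from the direction (theta, phi).

    psi rotates the second photon around the first one (assumed uniform).
    """
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    a, s = math.radians(delta_alpha_deg), math.radians(psi_deg)
    d = unit_vector(theta_deg, phi_deg)
    # Two unit vectors orthogonal to d and to each other.
    e_theta = (math.cos(t) * math.cos(p), math.cos(t) * math.sin(p), -math.sin(t))
    e_phi = (-math.sin(p), math.cos(p), 0.0)
    return tuple(
        math.cos(a) * d[i]
        + math.sin(a) * (math.cos(s) * e_theta[i] + math.sin(s) * e_phi[i])
        for i in range(3)
    )

def opening_angle_deg(u, v):
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(u, v))))
    return math.degrees(math.acos(dot))

def sample_overlapping_pair(rng):
    """First photon as above; second photon with 2.9 deg < delta_alpha < 9.7 deg."""
    e1, theta1, phi1 = sample_photon(rng)
    e2 = rng.uniform(0.1, 1.5)
    delta_alpha = rng.uniform(2.9, 9.7)
    psi = rng.uniform(0.0, 360.0)  # assumption, see lead-in
    return (e1, unit_vector(theta1, phi1)), (e2, second_direction(theta1, phi1, delta_alpha, psi))
```

The sketch does not check whether the second photon stays inside the generated polar-angle range; how such cases are handled is not specified in the text.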

Isolated Photon
To study isolated photons, we use the simulated events with a generated isolated photon only. For each event, we select a region of interest (ROI): we first determine the azimuthal angles of the fourth neighbors on either side of the local maximum (LM), and the polar angles of the fourth neighbors in either direction of the LM. We then include all crystals in that angular range. In the barrel this defines a regular 9 × 9 array of crystals centered around an LM, while in the endcaps this array is not necessarily regular and can contain a few crystals more or less. The LM is a crystal with at least 10 MeV of reconstructed crystal energy and an energy higher than that of all eight direct neighbors. The LM must be the only LM in the ROI, and the matched truth particle must be a simulated photon responsible for at least 20% of the reconstructed crystal energy. Precisely, for the LM we require the ratio $E^{\gamma_1,\,\mathrm{crystal\,LM}}_{\mathrm{dep}} / E^{\mathrm{crystal\,LM}}_{\mathrm{rec}} \geq 0.2$. Here, $E^{\gamma_1,\,\mathrm{crystal\,LM}}_{\mathrm{dep}}$ denotes the truth energy deposition of photon 1 in the LM, and $E^{\mathrm{crystal\,LM}}_{\mathrm{rec}}$ the reconstructed crystal energy in the LM. The crystals contained in the ROI are considered for the clustering by the GNN algorithm and significantly extend the 5 × 5 area considered by the baseline algorithm (Sec. 5). Furthermore, the ROI represents the area of the local coordinate system later used as an input feature, with the LM as the origin. Figure 1 (top) shows a typical isolated photon event with high beam background.
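The LM definition and the ROI selection can be sketched on an idealized regular grid of crystals. This is a simplified illustration only: the function names are hypothetical, the grid ignores the wrap-around in ϕ and the irregular endcap geometry, and crystals at the grid border are simply compared to the neighbors that exist.

```python
def find_local_maxima(energy, threshold=0.010):
    """Return (row, col) of all local maxima on a regular grid.

    energy: 2D list of reconstructed crystal energies in GeV.
    A local maximum has at least `threshold` (10 MeV) and more energy
    than each of its direct neighbors (eight away from the grid border).
    """
    n_rows, n_cols = len(energy), len(energy[0])
    maxima = []
    for r in range(n_rows):
        for c in range(n_cols):
            e = energy[r][c]
            if e < threshold:
                continue
            neighbors = [
                energy[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(e > n for n in neighbors):
                maxima.append((r, c))
    return maxima

def roi_crystals(lm, n_rows, n_cols, half_width=4):
    """Crystals up to the fourth neighbor on each side of the LM (9 x 9 in the barrel)."""
    r0, c0 = lm
    return [
        (r, c)
        for r in range(max(0, r0 - half_width), min(n_rows, r0 + half_width + 1))
        for c in range(max(0, c0 - half_width), min(n_cols, c0 + half_width + 1))
    ]
```

An isolated-photon candidate in this simplified picture is an event for which `find_local_maxima` returns exactly one entry inside the ROI; the truth-matching requirement on the LM is not part of the sketch.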

Overlapping Photons
Two different photons that deposit some of their energy in identical crystals are referred to as overlapping photons. To study overlapping photons, we use the simulated events with two overlapping photons only. We select events that have exactly [...] LM2. We refer to criteria a)-c) as LM separation criteria since they ensure that the particles form two separate LMs. Additionally, events must meet the overlap criterion: d) each of the two photons must deposit at least 10 MeV of energy in shared crystals within a 5 × 5 area around its respective LM. Figure 2 shows the fraction of events accepted by these selections as a function of the simulated opening angle. In the scope of this paper, we additionally require LMs to originate exclusively from simulated particles, without additional LMs in the ROI, e.g. from beam background, that is: e) the two LMs must be the only ones in the ROI and they must be truth-matched to the simulated photons. Finally, we remove rare cases of small truth energy depositions and large backgrounds by requiring: f) the crystal with the largest truth energy deposition of a photon must be within a 5 × 5 area around its corresponding LM. We then create an ROI centered at the midpoint between the two LMs, calculated using the shortest distance between the two LMs projected onto the surface of a sphere. The crystal closest to the [...]

Reconstruction Algorithms
Interactions of energetic photons in the Belle II ECL typically deposit energy in up to 5 × 5 crystals. The task of the clustering reconstruction algorithms is to select a set of crystals that contains all the energy of the incoming photon, but no energy from other particles or from beam background. Low beam background results in approximately 17 % of all crystals in the ECL having significant reconstructed energy $E^{\mathrm{crystal}}_{\mathrm{rec}} \geq 1$ MeV; for high beam backgrounds this number is expected to increase to about 40 %. This increase in the number of crystals to consider in the clustering adds to the complexity of the reconstruction.

Baseline
The baseline algorithm is designed to provide maximum efficiency for cluster finding, to contain all crystals from the incoming particle for particle identification, and to select an optimal subset of the cluster crystals that provides the best energy resolution [21]. The clustering is performed in three steps. In the first step, all crystals are grouped into connected sets of crystals, so-called connected regions, starting with LMs as defined previously. In an iterative procedure, all direct neighbors with energies above 0.5 MeV are added to this LM, and the process is continued if any neighbor itself has an energy above 10 MeV. Overlapping connected regions are merged into one.
In the second step, each connected region is split into clusters, one per LM. If there is only one LM in the connected region, up to 21 crystals in a 5 × 5 area excluding corners, centered at the LM, are grouped into a cluster. If there is more than one LM in a connected region, the energy in each crystal of the connected region is assigned a distance-dependent weight and can be shared between different clusters. The distance is calculated from the cluster centroid to each crystal center, where the cluster centroid is updated iteratively using logarithmic energy weights. This process is repeated until all cluster centroids in a connected region are stable within 1 mm.
In the third step, an optimal subset, consisting of the n highest-energy crystals among all crystals with non-zero weight and chosen to minimize the energy resolution, is used to predict the cluster energy $E^{\mathrm{basf2}}_{\mathrm{rec}}$. The number n depends on the measured noise in the event and on the energy of the LM itself. The noise level is estimated by counting the number of crystals in the event that contain more than 5 MeV and have times t more than 125 ns from the trigger time. $E^{\mathrm{basf2}}_{\mathrm{rec}}$ is also corrected within basf2 for possible bias using simulated events. This bias includes leakage (energy not deposited in the crystals included in the energy sum) and beam backgrounds (energy included in the sum that is not from the signal photon). $E^{\mathrm{basf2}}_{\mathrm{rec}}$ is the estimator for the generated energy of a particle.
The basf2 clustering algorithm also returns a cluster energy $E^{\mathrm{basf2}}_{\mathrm{rec,\,raw}}$ that is not corrected for energy bias. $E^{\mathrm{basf2}}_{\mathrm{rec,\,raw}}$ is the estimator for the deposited energy of a particle.
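The first clustering step can be illustrated with a short breadth-first sketch on an idealized regular grid. It is not the basf2 implementation: the function name is hypothetical, and the reading that the growth continues only from neighbors above 10 MeV is our interpretation of the description above.

```python
from collections import deque

def grow_connected_region(energy, seed, add_threshold=0.0005, grow_threshold=0.010):
    """Grow a connected region from a local maximum.

    energy: 2D list of crystal energies in GeV; seed: (row, col) of the LM.
    Direct neighbors above 0.5 MeV are added; the search continues only
    from added crystals above 10 MeV (interpretation of the text).
    """
    n_rows, n_cols = len(energy), len(energy[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for rr in range(max(0, r - 1), min(n_rows, r + 2)):
            for cc in range(max(0, c - 1), min(n_cols, c + 2)):
                if (rr, cc) in region or energy[rr][cc] <= add_threshold:
                    continue
                region.add((rr, cc))
                if energy[rr][cc] > grow_threshold:
                    queue.append((rr, cc))
    return region

def merge_overlapping(regions):
    """Merge connected regions that share at least one crystal."""
    merged = []
    for region in regions:
        region = set(region)
        keep = []
        for other in merged:
            if region & other:
                region |= other
            else:
                keep.append(other)
        merged = keep + [region]
    return merged
```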
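The noise estimate used in the third step amounts to a simple count, sketched below. The function name and the input layout are hypothetical, and treating "more than 125 ns from the trigger time" as a symmetric window around zero is an assumption.

```python
def count_out_of_time_crystals(crystals, energy_threshold=0.005, time_window=125.0):
    """Count crystals above 5 MeV whose time is more than 125 ns from the trigger.

    crystals: iterable of (energy in GeV, time in ns relative to the trigger).
    The symmetric window |t| > time_window is an assumption.
    """
    return sum(
        1
        for energy, time in crystals
        if energy > energy_threshold and abs(time) > time_window
    )
```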

Graph Neural Network Architecture
GNN architectures have proven to be powerful network types for dealing with both irregular geometries and varying input sizes. In this work, all crystals of an ROI with an energy deposition above 1 MeV are interpreted as nodes in a graph, which leads to variable input sizes and is thus a good use case for GNNs. The GNN is implemented in PyTorch Geometric [30].
The input features consist of crystal properties and crystal measurements. The crystal properties are the global coordinates θ and ϕ of each crystal, the local coordinates θ′ and ϕ′ with respect to the ROI center, the crystal mass, and the LM(s) (in one-hot encoding). The crystal measurements are the crystal energy $E^{\mathrm{crystal}}_{\mathrm{rec}}$ in GeV, the time $t^{\mathrm{crystal}}_{\mathrm{rec}}$ in µs, the PSD fit type, the PSD χ², and the PSD hadronic energy ratio. Pre-processing scales the inputs before further processing with the GNN: all features are min-max normalized to the interval [0, 1], with the exception of $t^{\mathrm{crystal}}_{\mathrm{rec}}$ and the PSD hadronic energy ratio, which are both normalized to the interval [−1, 1]. The global coordinates and the crystal masses are normalized based on the range of coordinates and masses of all crystals in the detector instead of only the ones in the ROI. Additionally, we average each input feature over all nodes in the ROI and concatenate the averaged input features as additional inputs, thus enabling a global exchange of information.
As displayed in Fig. 3, our model is built out of four so-called GravNet [19] blocks, whose concatenated outputs are passed through three dense output layers with a final softmax activation function. Each GravNet block features three dense layers at the beginning of the block, the first two of which use ELU [31] activation functions and the last one a tanh activation function. The dense layers feed into a GravNet layer, and the overall GravNet block is concluded by a batch normalization layer [32]. The GravNet layer is responsible for the graph building and subsequent message passing between the nodes of the graph. It first translates the input features into two learned representation spaces: one representing spatial information, S, while the other, denoted $F_{\mathrm{LR}}$, contains the transformed features used for message passing. In the second step, each node is [...]
The implementation in the present work follows the concept of fuzzy clustering, which refers to the partial assignment of individual crystals to several clustering classes. Consequently, the GNN predicts weights $w^{X}_{i}$ that indicate the proportion of the reconstructed energy $E^{\mathrm{crystal}_i}_{\mathrm{rec}}$ in a crystal i that belongs to a clustering class X. For models used with isolated photons, X ∈ {γ₁, background}; for models with overlapping photons, X ∈ {γ₁, γ₂, background}. As a loss function, we then use the Mean Squared Error (MSE) between the true and predicted weights summed over all classes and crystals. The training is stopped when there has been no improvement in the optimization objective for 15 epochs. For low beam background models that objective is the MSE loss on the validation data set, whereas the high beam background models employ the more high-level $\mathrm{FWHM}_{\mathrm{dep}}$ (Sec. 6) on the validation data set.
Hyperparameters have been chosen through a hyperparameter optimization using Optuna [33]. The optimization is done with respect to $\mathrm{FWHM}_{\mathrm{dep}}$ (Sec. 6) instead of the loss function. We optimize the two models trained for high beam backgrounds and use the respective hyperparameters also for the corresponding low beam background models. The final hyperparameters for both the isolated photon models and the overlapping photon models are shown in Table 1.
The learning rate, the number of dense layers in each GravNet block, and all dimensions of the output layers have been manually optimized by testing a reasonable range of values. The learning rate is set to $5 \times 10^{-3}$ and is subject to a decay factor of 0.25 after every five epochs of stagnating validation loss. We did not observe significant over-training and, as a consequence, do not use dropout layers or other regularization methods but rely on the large data set.
The GNN algorithm yields the weights $w^{X}_{i}$ per crystal for all crystals in the ROI with an energy deposition above 1 MeV. In order to reconstruct the total cluster energy $E^{\mathrm{GNN}}_{\mathrm{rec}}$ associated with a certain particle, we then sum, over all crystals, the class-specific weights multiplied by the reconstructed energies per crystal, $E^{\mathrm{GNN}}_{\mathrm{rec}} = \sum_i w^{X}_{i} E^{\mathrm{crystal}_i}_{\mathrm{rec}}$. Figure 4 shows how the GNN and the basf2 algorithms behave in clustering a typical case of overlapping photons.
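The pre-processing can be sketched in a few lines of plain Python. The function names are hypothetical, and the feature ranges passed to the scaling function are placeholders supplied by the caller, not values from this work.

```python
def min_max_scale(value, lo, hi, target=(0.0, 1.0)):
    """Map value from [lo, hi] linearly onto the target interval."""
    a, b = target
    return a + (b - a) * (value - lo) / (hi - lo)

def add_global_averages(node_features):
    """Append the ROI-wide mean of every feature to each node's feature list.

    node_features: list of equally long feature lists, one per crystal (node).
    """
    n_nodes = len(node_features)
    n_features = len(node_features[0])
    means = [
        sum(node[j] for node in node_features) / n_nodes
        for j in range(n_features)
    ]
    return [list(node) + means for node in node_features]
```

For the time and the PSD hadronic energy ratio, the same function would be called with `target=(-1.0, 1.0)`. For the global coordinates and the crystal mass, `lo` and `hi` would be taken from all crystals of the detector rather than from the ROI.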
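The fuzzy-clustering output and its loss can be sketched without any deep-learning framework. This is an illustration only: the function names are hypothetical, and averaging the squared differences over all crystal-class entries is one possible normalization, which the text does not specify.

```python
import math

def softmax(logits):
    """Turn per-class scores of one crystal into weights that sum to one."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuzzy_mse_loss(predicted, target):
    """Mean squared error between predicted and true weights.

    predicted, target: lists with one entry per crystal; each entry is a
    list of class weights, e.g. [w_gamma1, w_background].
    The normalization by the number of entries is an assumption.
    """
    total, count = 0.0, 0
    for pred_row, true_row in zip(predicted, target):
        for p, t in zip(pred_row, true_row):
            total += (p - t) ** 2
            count += 1
    return total / count
```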
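The interplay of learning-rate decay and early stopping can be illustrated with a framework-independent sketch. The function name is hypothetical, and using a single monitored quantity for both the decay and the stopping criterion is a simplification; in this work the stopping objective of the high beam background models is the FWHM on the deposited energy rather than the loss.

```python
def training_schedule(val_losses, lr=5e-3, decay=0.25, lr_patience=5, stop_patience=15):
    """Return the learning rate used after each epoch until training stops.

    val_losses: monitored validation quantity per epoch (smaller is better).
    The learning rate is multiplied by `decay` after every `lr_patience`
    epochs without improvement; training stops after `stop_patience`
    epochs without improvement.
    """
    best = float("inf")
    epochs_since_best = 0     # drives early stopping
    epochs_since_lr_step = 0  # drives the learning-rate decay
    lr_history = []
    for loss in val_losses:
        if loss < best:
            best = loss
            epochs_since_best = 0
            epochs_since_lr_step = 0
        else:
            epochs_since_best += 1
            epochs_since_lr_step += 1
            if epochs_since_lr_step == lr_patience:
                lr *= decay
                epochs_since_lr_step = 0
        lr_history.append(lr)
        if epochs_since_best >= stop_patience:
            break
    return lr_history
```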
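The energy sum over the weighted crystals is straightforward to write down. In the following minimal sketch the function name and the dictionary layout of the weights are hypothetical; the numbers in the usage note are illustrative and not taken from the simulation.

```python
def cluster_energy(weights, crystal_energies, cls):
    """Sum of w_i^X * E_rec^(crystal i) over all crystals for clustering class X.

    weights: one dict per crystal, mapping class name to predicted weight.
    crystal_energies: reconstructed crystal energies in GeV, same order.
    """
    return sum(w[cls] * e for w, e in zip(weights, crystal_energies))
```

For example, two crystals with energies 0.2 GeV and 0.04 GeV and photon weights 0.9 and 0.5 give a photon cluster energy of about 0.2 GeV; the remaining 0.04 GeV is attributed to the background class.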

Metrics
For performance evaluation, the reconstructed energy of a particle is compared with two different truth targets: the total deposited truth energy $E_{\mathrm{dep}}$ per photon in the ROI, and the generated truth energy $E_{\mathrm{gen}}$ per photon. This results in two variants of relative reconstruction errors. The reconstruction error on the deposited energy gives access to the energy resolution ignoring leakage and other detector effects. It is a direct evaluation of the clustering performance of an algorithm.
On the other hand, the reconstruction error on the generated energy factors in all detector and physics effects and quantifies how much of the improvements to the underlying clustering carry over to downstream physics object reconstruction.
Evaluating both algorithms on a large number of simulated photons yields peaking distributions in both reconstruction errors $\eta_{\mathrm{dep}}$ and $\eta_{\mathrm{gen}}$. Both distributions are potentially biased because of energy leakage and the presence of beam backgrounds (see Sec. 5.1). We perform a binned fit using a double-sided Crystal Ball function [34,35] as probability density function (pdf) with the kafe2 [36] framework. We shift each reconstruction error distribution independently by a multiplicative factor to correct the difference between the fitted peak position and zero (Fig. 5). Since $\eta_{\mathrm{dep}}$ and $\eta_{\mathrm{gen}}$ are asymmetric distributions, we repeat this procedure until the difference between the fitted peak position and zero is less than 0.002. This procedure usually converges within two or three iterations.
We then determine the full width at half maximum (FWHM) of the final shifted distributions in $\eta_{\mathrm{dep}}$ and $\eta_{\mathrm{gen}}$, yielding $\mathrm{FWHM}_{\mathrm{dep}}$ and $\mathrm{FWHM}_{\mathrm{gen}}$, respectively. The uncertainty on the FWHM is calculated from the uncertainties of the fit parameters. In addition to the FWHM, we determine the tails of the reconstruction error distribution. The left and right tails $T_{L,R}$ are calculated as the 95th percentile when ranking the unbinned events on the respective side of the peak position, as given by the fit parameters, in ascending order ($T_R$) and descending order ($T_L$), respectively. Propagating the uncertainty on the peak position as given by the fit yields the uncertainty on $T_{L,R}$.
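Both quantities can be illustrated with a short numerical sketch. It does not reproduce the double-sided Crystal Ball fit with kafe2: the FWHM is read off a curve sampled on a grid, the percentile uses the nearest-rank convention, and the tails are expressed as distances from the peak. All of these choices, as well as the function names, are assumptions.

```python
import math

def fwhm(xs, ys):
    """Full width at half maximum of a single-peaked curve sampled at xs."""
    half = max(ys) / 2.0
    above = [i for i, y in enumerate(ys) if y >= half]
    lo, hi = above[0], above[-1]

    def crossing(i_below, i_above):
        # Linear interpolation between a point below and a point above half maximum.
        return xs[i_below] + (half - ys[i_below]) * (xs[i_above] - xs[i_below]) / (
            ys[i_above] - ys[i_below]
        )

    left = crossing(lo - 1, lo) if lo > 0 else xs[0]
    right = crossing(hi + 1, hi) if hi < len(xs) - 1 else xs[-1]
    return right - left

def tail_lengths(eta, peak, fraction=0.95):
    """95th-percentile distance from the peak on the left and on the right side."""
    left = sorted(peak - x for x in eta if x < peak)
    right = sorted(x - peak for x in eta if x > peak)

    def nearest_rank(values):
        if not values:
            return 0.0
        k = max(1, math.ceil(fraction * len(values)))
        return values[k - 1]

    return nearest_rank(left), nearest_rank(right)
```

Applied to a Gaussian curve of unit width, `fwhm` returns approximately 2.355, the conversion factor between FWHM and standard deviation.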

Results
The first sections of the results focus on detailed studies of isolated clusters. Section 7.4 then introduces overlapping clusters and their effects on the performance. Figure 6 shows examples of the distributions of both reconstruction errors $\eta_{\mathrm{dep}}$ and $\eta_{\mathrm{gen}}$, as well as the fit results for events with low beam background. Figure 7 shows the equivalent distributions for events with high beam background.
The $\eta_{\mathrm{gen}}$ distributions are wider because the reconstruction error includes the effects of leakage, which result in missing energy with respect to the generated photon energy. This only affects the left-side tails.
In the following subsections, we compare the performance of the GNN and the basf2 reconstruction algorithms for different detector regions for low and high beam backgrounds by evaluating the energy resolution $\mathrm{FWHM}_{\mathrm{gen}}/2.355$ and the tail parameters. We then analyze the GNN in more detail by testing the input variable dependencies and the robustness against differences in beam background levels between training and evaluation.

Energy resolution and energy tails
The three detector regions barrel, forward endcap, and backward endcap described in Sec. 3 differ in crystal geometry, levels of background, and amount of passive material before and in between crystals. The following section studies the variations in the energy reconstruction performance that arise as a direct result of these differences.
In order to access the energy dependence of the resolution and tail parameters, we simulate test data sets of photons at various fixed energies. The FWHM for each simulated data set is then determined according to Sec. 6. Plotting the resolutions $\mathrm{FWHM}_{\mathrm{gen}}/2.355$ over the generated photon energies $E_{\mathrm{gen}}$ reveals a characteristic relationship that is parameterized by the function $a/E_{\mathrm{gen}} \oplus b/\sqrt{E_{\mathrm{gen}}} \oplus c$, where ⊕ indicates addition in quadrature.
Both the GNN and the baseline algorithm perform differently with regard to the energy resolution in the three detector parts, as can be seen in Fig. 8a for low beam background and in Fig. 8b for high beam background. Table 2 reports the parameters of the fitted parameterization of the resolution. We attribute these differences to the large spread of both shape and size of crystals in the endcaps, the asymmetric distribution of beam backgrounds, and the different amount of passive material in front of the different detector regions.
Overall, the energy resolution of the GNN algorithm is significantly better than that of the baseline algorithm for all photon energies. The GNN energy resolution is better by more than 30 % for photon energies below 500 MeV, which is the energy range of more than 90 % of all photons in B-meson decay chains. The higher the beam background, the larger the difference between the GNN and the baseline algorithm. The difference between the two algorithms decreases with energy because the relative contribution of beam backgrounds to the photon energy resolution decreases.
The shape of the left-side tails is dominated by passive material and is hence expected to be different in the different detector regions. The left-side tails are almost independent of beam backgrounds, as can be seen by comparing Fig. 9a for low beam background and Fig. 9c for high beam background. The GNN and the baseline algorithm both show the smallest tail length for the barrel region, with decreasing tail lengths for increasing energy. As expected, the left-side tails are largest in the backward endcap due to the highest ratio of passive to active material. The right-side tails originate mostly from beam background being wrongly added to photon clusters. The GNN produces shorter tails than the baseline algorithm for all energies and for both low and high beam backgrounds, with the performance difference increasing for lower energies and higher beam backgrounds.
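The parameterization and the conversion from FWHM to resolution can be written as a short sketch. The function names are hypothetical, and the parameter values in the usage note are chosen only to make the quadrature sum easy to follow; they are not fit results.

```python
import math

# FWHM of a Gaussian in units of its standard deviation, 2*sqrt(2*ln 2) ~ 2.355.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))

def resolution_from_fwhm(fwhm):
    """Energy resolution defined as FWHM / 2.355."""
    return fwhm / FWHM_PER_SIGMA

def resolution_model(e_gen, a, b, c):
    """a/E (+) b/sqrt(E) (+) c, with (+) denoting addition in quadrature."""
    return math.sqrt((a / e_gen) ** 2 + (b / math.sqrt(e_gen)) ** 2 + c ** 2)
```

With a = 3, b = 4, c = 0 and $E_{\mathrm{gen}}$ = 1, the quadrature sum is $\sqrt{3^2 + 4^2} = 5$. The first two terms fall with energy, so the constant term c dominates at high energies.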

Beam Background Robustness
The beam background levels change continuously during detector operations. Ideally, reconstruction algorithms at Belle II are insensitive to such changes. The basf2 baseline algorithm achieves robustness against increasing beam backgrounds by adaptively including fewer crystals in the energy sum calculation. Since our GNN is trained with a large number of events with event-by-event fluctuations of beam backgrounds, we expect robustness against varying beam backgrounds if the GNN generalizes well enough. We test the robustness of our GNN by comparing GNNs trained and tested on the same beam backgrounds against GNNs trained and tested on the two different beam backgrounds (Fig. 10, parameterization in Tab. 3). While the GNNs trained on the same beam backgrounds achieve a better resolution than the ones trained on different beam backgrounds, the GNN still outperforms the baseline algorithm even for networks trained on the different beam backgrounds. This demonstrates a promising generalization with respect to different levels of beam backgrounds.

Input Parameter Dependency
As discussed in Sec. 3, multiple input features are available for the GNN, while the basf2 algorithm uses crystal position and energy only. This section presents a study of the influence of the input features on the FWHM. For that, the architecture described in Sec. 5.2 is trained on isolated photon events with low or high beam backgrounds using different combinations of input features. The 200 000 events from the respective validation data set, as described in Sec. 4, are used for inference. The data set covers an energy range of 0.1 < $E_{\mathrm{gen}}$ < 1.5 GeV and the full detector range 17° < $\theta_{\mathrm{gen}}$ < 150° and 0° < $\phi_{\mathrm{gen}}$ < 360°, each uniformly distributed. $\mathrm{FWHM}_{\mathrm{gen}}$ and $\mathrm{FWHM}_{\mathrm{dep}}$ are calculated as described in Sec. 6. All GNNs use the global crystal coordinates, the LM position, and the crystal mass as input features. A comparison of the FWHM for the different additional input features is shown in Tab. 4. The results show that even for the minimal set of input variables, the FWHM of the GNN is smaller than that of basf2 for both the deposited and the generated energy in both beam background scenarios. Adding local coordinates leads to small improvements, and using time information brings a significant improvement in the GNN performance. PSD information has almost no effect on the FWHM. Since the main purpose of the PSD information is to differentiate electromagnetic and hadronic interactions per crystal, this is expected. In anticipation of future extensions of the GNN to hadronic interactions, the PSD information is kept throughout this work.
Table 2: Fit results ($a/E_{\mathrm{gen}} \oplus b/\sqrt{E_{\mathrm{gen}}} \oplus c$) of the fits shown in Fig. 8.
Fig. 9: 95 % left- and right-tail lengths $T_L$ and $T_R$ of $\eta_{\mathrm{gen}}$ for the GNN and basf2 as a function of the simulated photon energy $E_{\mathrm{gen}}$ for both endcaps and the barrel for (a and b) low and (c and d) high beam background. Each color is associated with one detector region.

Overlapping Photons
When discussing overlapping photon events, it is important to note that the FWHM of the photon energy distribution depends not only on the photon's own properties but also on the properties of the second photon present. To account for that, the evaluation is split into energy bins of [0.1, 0.2], [0.2, 0.5], [0.5, 1.0], and [1.0, 1.5] GeV for each of the two photons. We report the FWHM of the first photon for different simulated energies of the second photon for low beam backgrounds (see Tab. 5) and high beam backgrounds (Tab. 6).
The GNN provides a better FWHM for all combinations, but the improvement is most significant for low-energy photons. For low beam backgrounds, the GNN improves the FWHM by up to 20 % for photons with simulated energies in the range 0.1 < $E_{\mathrm{gen}}$ < 0.2 GeV. For high beam backgrounds, the GNN improves the FWHM by more than 35 % for photons with simulated energies in the range 0.1 < $E_{\mathrm{gen}}$ < 0.2 GeV.
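The binning of photon pairs and a relative improvement in percent can be sketched as follows. The function names are hypothetical. The treatment of events exactly on a bin edge and the definition of the improvement as the relative reduction of the FWHM with respect to the baseline are assumptions, since neither is spelled out in the text.

```python
import bisect

BIN_EDGES = [0.1, 0.2, 0.5, 1.0, 1.5]  # GeV, bin edges quoted in the text

def energy_bin(e_gen):
    """Index of the energy bin containing e_gen, or None outside [0.1, 1.5] GeV.

    Lower edges are inclusive; the upper edge is inclusive only for the
    last bin (assumption).
    """
    if e_gen < BIN_EDGES[0] or e_gen > BIN_EDGES[-1]:
        return None
    return min(bisect.bisect_right(BIN_EDGES, e_gen) - 1, len(BIN_EDGES) - 2)

def pair_bin(e_first, e_second):
    """Bin indices of the first and the second photon of an overlapping pair."""
    return energy_bin(e_first), energy_bin(e_second)

def relative_improvement(fwhm_baseline, fwhm_gnn):
    """Reduction of the FWHM relative to the baseline, in percent (assumed definition)."""
    return 100.0 * (fwhm_baseline - fwhm_gnn) / fwhm_baseline
```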
The result shows that the significant performance improvement observed for isolated photons can also be achieved for the more complicated overlapping photon signatures.

Conclusion and Outlook
In this work, we have presented a complete study of a GNN-based fuzzy clustering algorithm for the Belle II electromagnetic calorimeter. We used a realistic full detector simulation and simulated beam backgrounds for low and high luminosity conditions of Belle II. The GNN algorithm has been compared to the currently used basf2 baseline algorithm. We find an improvement of the energy resolution of more than 30 % for high beam backgrounds, as well as improved performance in reducing the right-side tails of the reconstruction errors that are caused by beam background. Such significant improvements in photon reconstruction performance directly improve the physics reach of Belle II for almost all final states with photons, and also for analyses that use missing energy information [21]. We also trained different GNNs to separate energy depositions of overlapping photon clusters. The improvement of the energy resolution is up to 30 % for the low energy photon in asymmetric photon pairs. Any improvement in overlapping photon reconstruction has direct implications for the reconstruction of boosted π⁰ mesons or axion-like particles with couplings to photons [37]. While the basf2 algorithm strictly reconstructs one cluster for each LM, the GNN algorithm only uses the LMs to center the ROI. The GNN algorithm can therefore in principle also be used to reconstruct overlapping photons that only produced one LM (Fig. 11). The extension of the GNN algorithm to such overlapping signatures as well as to charged particles and neutral hadrons will be the focus of follow-up work. Future work is also going to address robustness against varying beam backgrounds explicitly, for example by introducing features that are directly sensitive to beam-background levels. This is the first application of a GNN-based clustering algorithm at Belle II for a realistic detector geometry and realistic and high beam backgrounds. This is also the first time that an algorithm has been shown to improve the performance of the photon reconstruction by explicitly including timing information on clustering level at Belle II.

Table 4: Comparison of the performances of GNN models with different additional input features, and the performance of the basf2 baseline. Shown are $\mathrm{FWHM}_{\mathrm{dep}}$ and $\mathrm{FWHM}_{\mathrm{gen}}$ (see Sec. 6) for 200 000 events in the validation data sets (see Sec. 4) with low and high beam background. The data sets cover an energy range of 0.1 < $E_{\mathrm{gen}}$ < 1.5 GeV and the full detector range 17° < $\theta_{\mathrm{gen}}$ < 150° and 0° < $\phi_{\mathrm{gen}}$ < 360°, each uniformly distributed. The uncertainties of the FWHM in each column are correlated since they use the same simulated events. The input features are described in detail in Sec. [...]

Table 5: [...] for low beam background for the full detector (barrel and endcaps combined). The uncertainties of the FWHM for the two algorithms are correlated for each energy interval since they use the same simulated events. The improvement over the basf2 baseline algorithm is stated in percent for each energy interval.

Table 6: [...] in dependence of the second photon energy $E^{(2)}_{\gamma}$ for high beam background for the full detector (barrel and endcaps combined). The uncertainties of the FWHM for the two algorithms are correlated for each energy interval since they use the same simulated events. The improvement over the basf2 baseline algorithm is stated in percent for each energy interval.
(a) Truth assignment, colors indicate the fraction belonging to each of the photons and beam background.

Fig. 1: Typical event displays showing (left) simulated truth assignments and the input variables (center) time and (right) PSD hadronic energy ratio for (top) isolated and (bottom) overlapping photons for two example events with high beam background. The marker centers indicate the crystal centers. The marker area is proportional to the truth energy deposition in the left plots and to the reconstructed crystal energy in the other plots.

Fig. 2: Fraction of selected overlapping photon events in the barrel as a function of the generated opening angle. The orange markers correspond to events fulfilling LM separation criteria a)-c); the blue markers correspond to events that additionally pass the overlap criterion d) (see text for details).

Fig. 4: Comparison of (a) truth energy fractions, (b) reconstructed energy fractions by the GNN, and (c) reconstructed energy fractions by basf2 for an example event with high beam background. Colors indicate the fractions belonging to each photon or background. The marker centers indicate the crystal centers; the marker area is proportional to the truth or reconstructed (GNN, basf2) energy deposition, respectively.
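To make the notion of per-crystal energy fractions concrete, the following minimal Python sketch sums fractional crystal energies into per-class energies. It is an illustration under assumptions, not the basf2 or GNN implementation: the function name `cluster_energies`, the array layout, and the requirement that the fractions of each crystal sum to one are all hypothetical choices made here.

```python
import numpy as np

def cluster_energies(crystal_energies, fractions):
    """Sum fractional crystal energies into per-class energies.

    crystal_energies: shape (n_crystals,), reconstructed energy per crystal.
    fractions: shape (n_crystals, n_classes), fuzzy assignment of each crystal
               to the classes (e.g. photon 1, photon 2, beam background).
    Assumption: each row of `fractions` sums to one.
    """
    crystal_energies = np.asarray(crystal_energies, dtype=float)
    fractions = np.asarray(fractions, dtype=float)
    if not np.allclose(fractions.sum(axis=1), 1.0):
        raise ValueError("fractions of each crystal must sum to one")
    # Weighted sum over crystals for every class.
    return fractions.T @ crystal_energies

# Toy example: the second crystal is shared equally between two classes.
toy_energies = [0.10, 0.20, 0.05]
toy_fractions = [[1.0, 0.0, 0.0],
                 [0.5, 0.5, 0.0],
                 [0.0, 0.0, 1.0]]
toy_result = cluster_energies(toy_energies, toy_fractions)
```

In this toy layout a hard clustering is the special case in which every row of `fractions` contains a single one.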

Fig. 5: Example distribution of the relative reconstruction error $\eta_{\mathrm{gen}}$ of the generated energy and illustration of the bias correction, the FWHM, and the tail ranges.
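As a rough illustration of how an FWHM can be read off such a distribution, the sketch below estimates it from a histogram by interpolating the half-maximum crossings on both sides of the peak. This is a simplified stand-in and not the procedure of the paper (which is defined in Sec. 6): the function name, the binning, the value range, and the Gaussian toy data are all assumptions.

```python
import numpy as np

def fwhm_from_histogram(values, bins=400, value_range=(-1.0, 1.0)):
    """Estimate the FWHM and peak position of a single-peaked distribution."""
    counts, edges = np.histogram(values, bins=bins, range=value_range)
    centers = 0.5 * (edges[:-1] + edges[1:])
    peak = int(np.argmax(counts))
    half = 0.5 * counts[peak]

    # Walk outwards from the peak until the counts drop below half maximum.
    left = peak
    while left > 0 and counts[left] >= half:
        left -= 1
    right = peak
    while right < len(counts) - 1 and counts[right] >= half:
        right += 1

    def crossing(i_below, i_above):
        # Linear interpolation between a bin below and a bin above half maximum.
        x0, x1 = centers[i_below], centers[i_above]
        y0, y1 = counts[i_below], counts[i_above]
        if y1 == y0:
            return x0
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    x_left = crossing(left, left + 1)
    x_right = crossing(right, right - 1)
    return x_right - x_left, centers[peak]

# Toy data: Gaussian relative errors with an assumed width of 5 %.
rng = np.random.default_rng(0)
toy_eta = rng.normal(0.0, 0.05, size=200_000)
toy_fwhm, toy_peak = fwhm_from_histogram(toy_eta)
```

The returned peak position can serve as a simple bias estimate. For distributions with low statistics a fit-based FWHM would be more robust than this bin-based estimate.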
Relative reconstruction error $\eta_{\mathrm{dep}}$ of the deposited energy. Relative reconstruction error $\eta_{\mathrm{gen}}$ of the generated energy.

Fig. 6: Distributions of relative reconstruction errors (a) $\eta_{\mathrm{dep}}$ and (b) $\eta_{\mathrm{gen}}$ for isolated clusters for low beam backgrounds. The first bin contains all underflow entries; the last bin contains all overflow entries.

Fig. 7: Distributions of relative reconstruction errors (a) $\eta_{\mathrm{dep}}$ and (b) $\eta_{\mathrm{gen}}$ for isolated clusters for high beam backgrounds. The first bin contains all underflow entries; the last bin contains all overflow entries.
(a) Low beam background. (b) High beam background.

Fig. 8: Resolution $\mathrm{FWHM}_{\mathrm{gen}}/2.355$ of the GNN and basf2 as a function of the simulated photon energy $E_{\mathrm{gen}}$ for both endcaps and the barrel for (a) low and (b) high beam background. Each color is associated with one detector region; the light color indicates basf2, the dark color the GNN. The bands indicate the uncertainty of the fits, see text for details. The fit parameters are summarized in Tab. 2.
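The factor 2.355 in the resolution measure is the standard conversion between the full width at half maximum and the standard deviation of a Gaussian. As a short reminder (textbook material, not a derivation taken from this paper), for a Gaussian of width $\sigma$ centered at zero the half-maximum point $x_{1/2}$ satisfies:

```latex
\begin{align}
  \exp\!\left(-\frac{x_{1/2}^{2}}{2\sigma^{2}}\right) &= \frac{1}{2}
  \quad\Rightarrow\quad
  x_{1/2} = \sigma\sqrt{2\ln 2}, \\
  \mathrm{FWHM} &= 2\,x_{1/2} = 2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma .
\end{align}
```

For a Gaussian distribution, FWHM/2.355 therefore equals $\sigma$. For non-Gaussian distributions it remains a measure of the core width that is largely insensitive to the tails.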

Fig. 10: Resolution $\mathrm{FWHM}_{\mathrm{gen}}/2.355$ as a function of the simulated photon energy $E_{\mathrm{gen}}$ for the GNNs trained with low beam background (LBB GNN) and high beam background (HBB GNN) in the barrel. Each color is associated with the evaluation on either beam background; the dark color indicates the model trained with the beam background identical to the evaluation, and the light color indicates the model trained with the respective other beam background. The bands indicate the uncertainty of the fits, see text for details. The fit parameters are summarized in Tab. 3. The resolution of the basf2 algorithm is shown for comparison.

Fig. 11: Comparison of (a) truth energy fractions, (b) reconstructed energy fractions by the GNN, and (c) reconstructed energy fractions by basf2 for an example event with only one local maximum. Colors indicate the fractions belonging to each photon or background. The marker centers indicate the crystal centers; the marker area is proportional to the reconstructed energy in each crystal.

An illustration of the GNN architecture. Each pair of gray, square brackets represents one GravNet block consisting of dense layers, a GravNet layer, and a batch norm layer. The input features describe the feature vector of one node. The global exchange denotes appending the average of each input feature over all nodes in the ROI.

connected to its k nearest neighbors defined by the Euclidean distances in S, thus creating an undirected, connected graph. For each node, the input features of connected nodes are then weighted by a Gaussian potential depending on the distance in S and aggregated by summation. The resulting features are concatenated with the GravNet input features and, after batch normalization, passed to the next GravNet block and to the dense output layers.
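The global exchange and the neighbor aggregation described above can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions, not the trained model: the coordinates in S and the node features are passed in directly (in the network they are produced by trainable dense layers), the Gaussian weight is taken as exp(-d²) with unit scale, and the function names are hypothetical.

```python
import numpy as np

def global_exchange(features):
    """Append the average of each feature over all nodes to every node."""
    mean = features.mean(axis=0, keepdims=True)
    return np.concatenate([features, np.repeat(mean, len(features), axis=0)], axis=1)

def gravnet_aggregate(features, coords, k=4):
    """Distance-weighted sum over the k nearest neighbors in the space S.

    features: shape (n_nodes, n_features), per-node features to propagate.
    coords:   shape (n_nodes, n_dims), per-node coordinates in S.
    Requires k < n_nodes. Returns shape (n_nodes, 2 * n_features).
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)             # squared Euclidean distances
    np.fill_diagonal(dist2, np.inf)                # a node is not its own neighbor
    neighbors = np.argsort(dist2, axis=1)[:, :k]   # indices of the k nearest nodes
    neighbor_d2 = np.take_along_axis(dist2, neighbors, axis=1)
    weights = np.exp(-neighbor_d2)                 # assumed Gaussian potential
    aggregated = np.sum(weights[:, :, None] * features[neighbors], axis=1)
    # Concatenate the aggregated neighbor information with the input features.
    return np.concatenate([features, aggregated], axis=1)

# Toy ROI: five nodes on a line in a one-dimensional space S.
toy_features = np.ones((5, 2))
toy_coords = np.arange(5, dtype=float).reshape(-1, 1)
toy_output = gravnet_aggregate(global_exchange(toy_features), toy_coords, k=2)
```

Because the neighbors are selected in the learned space S rather than in detector coordinates, the graph connectivity can differ from the geometric crystal adjacency.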

Table 1: Optimized hyperparameters of the isolated photon and overlapping photon GravNet models. The hyperparameters are the result of an optimization of the $\mathrm{FWHM}_{\mathrm{dep}}$ on the respective high background validation data set.

Table 5: $\mathrm{FWHM}_{\mathrm{gen}} \times 10^{2}$ of one photon with photon energy E

Table 6: $\mathrm{FWHM}_{\mathrm{gen}} \times 10^{2}$ of one photon with photon energy E