Abstract
HERMES Pathfinder is an in-orbit demonstration consisting of a constellation of six 3U nano-satellites hosting simple but innovative detectors for the monitoring of cosmic high-energy transients. The main objective of HERMES Pathfinder is to prove that accurate positions of high-energy cosmic transients can be obtained using miniaturized hardware. The transient position is obtained by studying the delay in the time of arrival of the signal at different detectors hosted by nano-satellites in low-Earth orbits. In this context, we need to develop novel tools to fully exploit the future scientific data output of HERMES Pathfinder. In this paper, we introduce a new framework to assess the background count rate of a spaceborne, high-energy detector, a key step towards the identification of faint astrophysical transients. We employ a neural network to estimate the background lightcurves on different timescales. Subsequently, we employ a fast change-point and anomaly detection technique called Poisson-FOCuS to identify observation segments where statistically significant excesses in the observed count rate relative to the background estimate exist. We test the new software on archival data from the NASA Fermi Gamma-ray Burst Monitor (GBM), which has a collecting area and background level of the same order of magnitude as those of HERMES Pathfinder. The neural network performance is discussed and analyzed over periods of both high and low solar activity. We were able to confirm events in the Fermi-GBM catalog, both solar flares and gamma-ray bursts, and found events, not present in the Fermi-GBM database, that could be attributed to solar flares, terrestrial gamma-ray flashes, gamma-ray bursts and galactic X-ray flashes. Seven of these are selected and further analyzed, providing an estimate of localisation and a tentative classification.
1 Introduction
Gamma-Ray Bursts (GRBs) originate in extraordinarily energetic explosions taking place in distant galaxies. They appear as irregular pulses of X and \(\gamma \)-ray radiation in the detectors of today’s workhorse satellites such as SWIFT [1], INTEGRAL [2], Fermi [3], and Agile [4]. The typical distribution of GRB durations is bimodal: ‘long bursts’, lasting longer than 2 s, are associated with black hole formation in collapsars, while ‘short bursts’, lasting less than 2 s, are associated with mergers of binary neutron stars [5,6,7,8,9].
Present instrumentation dedicated to GRBs and cosmic transients has been launched during the 2010s. There is no guarantee that it will continue to operate beyond the mid-2020s. For this reason, several proposals have already been submitted to NASA and ESA to select the successors of these instruments (such as THESEUS [10], SVOM [11], e-ASTROGAM [12], AMEGO-X [13] and several CubeSat missions [14, 15]). The High Energy Rapid Modular Ensemble of Satellites (HERMES) concept is to develop a constellation of nano-satellites to study high-energy transients [16, 17], thus providing a fast-track and affordable solution bridging the gap between current X-ray monitors and the next generation. A technological and scientific pathfinder (HERMES-TP, funded by ASI, and HERMES-SP, funded by the European Commission; HERMES Pathfinder hereafter) is in preparation to prove the concept, that is, the capability to detect and localize GRBs with miniaturized instrumentation hosted by nano-satellites. The first six HERMES Pathfinder spacecraft are expected to be launched in low-Earth, near-equatorial orbit during 2024. A seventh payload unit identical to those hosted by HERMES Pathfinder will be hosted by SpIRIT [18], an Australia-Italy nano-satellite mission planned for launch in 2023 and developed by a consortium led by the University of Melbourne. SpIRIT will be the only satellite among the HERMES Pathfinder constellation to be launched into a polar orbit, improving the localization capability of the whole constellation [19]. The HERMES Pathfinder and SpIRIT payload is a small yet innovative “siswich” detector providing broad-band energy coverage (few keV - 1 MeV) and very good temporal resolution (a few hundred ns) [20,21,22,23].
GRBs manifest as transient increases in the count rates of detectors. These phenomena appear unexpectedly and cannot be explained in terms of background or any other known sources. Any automated procedure for detecting GRBs is generally concerned with searching the time series of the observations for statistically significant excesses in photon count rates, relative to a reference background estimate in the absence of \(\gamma \)/X-ray GRB related events. The on-orbit physical background observed by GRB monitor experiments is determined by factors inherent to the highly dynamical near-Earth radiation environment, to the spacecraft geographic position and attitude, as well as the spacecraft geometry, and the detector’s pointing, design and response. Given the difficulty intrinsic to a real-time modelling of the expected scientific background, algorithms dedicated to the ‘online’ search of GRBs often resort to extrapolating the background from recent observations. For example, the trigger algorithms running on-board NASA Fermi-GBM assess a background estimate from an average of the photon count rates observed over the previous 17 s excluding the most recent 4 s of observations [3]; similar moving average approaches were used by Compton-BATSE [24] and BeppoSAX-GRBM [25].
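The moving-average idea can be sketched in a few lines of Python. This is only an illustration, not the Fermi-GBM flight software: the function name `trailing_background` is hypothetical, and we assume that the averaging window is 17 s long and ends 4 s before the bin under test (the wording above also admits other readings).

```python
def trailing_background(rates, dt, window=17.0, exclude=4.0):
    """Trailing-mean background estimate for each bin of a count-rate series.

    Assumption: the averaging window is `window` seconds long and ends
    `exclude` seconds before the bin being tested, so that a transient
    that has just started does not contaminate its own background.
    """
    n_win = round(window / dt)   # bins used in the average
    n_exc = round(exclude / dt)  # most recent bins left out
    estimates = []
    for t in range(len(rates)):
        stop = t - n_exc         # exclusive end of the averaging window
        start = stop - n_win
        if start < 0:
            estimates.append(None)  # not enough history yet
        else:
            chunk = rates[start:stop]
            estimates.append(sum(chunk) / len(chunk))
    return estimates


# Flat background of 100 counts/s followed by a 5 s long bright excess.
rates = [100.0] * 30 + [500.0] * 5
background = trailing_background(rates, dt=1.0)
```

In this toy series the estimate stays at the pre-burst level for the whole 5 s excess, because the excluded bins and the window length keep the burst out of the average. The same property makes the approach unsuitable for transients much longer than the window.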
In ‘offline’ analysis, archival data are searched for GRB events that the online and on-board algorithms may have missed. Examples of this approach can be found in Kommers et al. (1999) [26], which uses the BATSE catalog, or in Kocevski et al. (2018) [27] and Hui et al. (2017) [28], which search for faint, short GRBs at times compatible with known gravitational wave events. In Biltzinger et al. (2020) [29], for example, a background estimate is built from detailed models of the components expected for GBM, such as the detector response, the cosmic \(\gamma \)-ray background, the solar activity, the geomagnetic environment, the Earth albedo and the visibility of X and \(\gamma \) point sources. The background description so achieved has been shown to reproduce the observations of Fermi-GBM very well and could potentially allow for the identification of otherwise hard-to-detect GRBs such as long, weak events with slow rise times. However, having been specifically tailored to the observations of Fermi-GBM, this technique is not immediately applicable to other experiments. In Sadeh (2019) [30] a Recurrent Neural Network (RNN [31]) is used to predict the background and, on top of it, classify or detect anomalies in the observations of a count rate detector. To recognize a GRB event, this RNN is trained on existing catalogues of burst observations. We believe such an approach could inherit the detection biases of standard strategies for GRB detection, ultimately leading to missing events which already eluded previous searches.
In Section 2 we introduce our approach to estimate the scientific background of a gamma-ray burst monitor experiment using a Neural Network (NN). In particular, we employ a Feed Forward Neural Network (FFNN) [32, 33] to estimate the count rate expected from background sources over the 12 NaI detectors of Fermi-GBM, in different energy bands and at regular time intervals. Our model is designed to learn the dynamics of the background over a timescale of months, enabling the detection of long and possibly faint GRBs.
In the literature, there are studies focusing on faint events, such as the Low-Luminosity GRBs (LLGRBs) [34], as well as events with durations of hundreds to thousands of seconds, the so-called ultra-long GRBs [34,35,36,37]. Currently, there is no consensus on a clear distinction between long and ultra-long GRBs, although the latter may have different progenitors, such as blue supergiants with a low metallicity (GRB 111209A [38, 39]) or magnetars [40, 41]. For GRB 101225A, also known as the Christmas burst, it has been proposed that the emission might originate from the tidal shredding of an asteroid by a neutron star, or from a burst in coincidence with a supernova inside a dense envelope. For GRB 110328A it has been proposed that the emission might originate from a tidal disruption event caused by a star falling into a supermassive black hole [34].
Estimating the burst duration using classical measures like T90 is challenging because the duration of the burst depends on the observing band, and the prompt phase can spread across thousands of seconds and therefore include gaps in the signal. Such gaps are due, for example, to passages of the satellite through the South Atlantic Anomaly or around the poles, where the particle background is too high to allow normal operation of X-ray and gamma-ray instruments, or to reorientation of the satellite for the download of data over a ground station. These factors make the estimation of burst duration more complex and require careful consideration in the analysis [38], in particular in the estimate of the background. As an example, we can refer to the estimated durations of three ultra-long GRBs discussed in Levan et al. (2013) [34]: GRB 101225A, with an estimated prompt emission duration exceeding 7000 s, GRB 111209A, about 10000 s, and GRB 121027A, about 6000 s. Appendix A provides a background estimation around the ultra-long GRB 091024, with a prompt duration of about 1020 s [42]. Moreover, by employing a robust loss function in the training phase, we are able to deal with outliers in count rate observations, such as transients due to astronomical events or brief periods of high or low detector activity, see Section 4.2. The choice of applying our framework to archival Fermi data was motivated by the facts that (1) the HERMES Pathfinder spacecraft are expected to be launched in a low-inclination orbit with altitude \(500-550\) km, an orbit where the background and its variations are expected to be smaller than those of Fermi-GBM [3]; and (2) the Fermi-GBM and HERMES Pathfinder detectors both rely on scintillators and have similar collecting areas [43,44,45], resulting in background count rates of the same order of magnitude.
To estimate the background observed by Fermi-GBM, we leverage a large ensemble of information, including features intrinsic to the satellite and its orbital setting, such as the satellite attitude and geographic location in time, the Sun visibility and so on. These features are expected to be independent of events such as GRBs. This idea is consistent with Fitzpatrick et al. (2012) [46], which describes a method that estimates the background at the period of interest by using count rates from adjacent days when the satellite has a similar geographical footprint. To retrieve this information we use the Fermi-GBM Data Tools [47] software package, an Application Programming Interface (API) that allows GBM data to be downloaded, analysed and visualised. Being completely data-driven, we believe our approach to be in principle applicable to any GRB monitor experiment for which a similar dataset is available.
The background estimates produced by the NN are compared with the observations by means of an efficient change-point detection technique called Poisson-FOCuS [48, 49], aiming at the automatic identification of statistically significant astrophysical transients. We tested the combination of the NN background estimates and the Poisson-FOCuS trigger on real Fermi-GBM data. We were able to confirm known events, but we also found events with no counterpart in the Fermi-GBM trigger catalog [50], yet with features resembling astronomical transients such as GRBs, solar flares and other galactic high-energy sources.
The paper is organised as follows. In Section 2 we present the background estimation in a supervised Machine Learning setting, the architecture of the NN and the Poisson-FOCuS change-point detection technique. In Section 3 we describe the data used and the pre-processing steps to build the dataset. In Section 4 we report the performance of the NN estimator and the result of the application of the trigger algorithm. A comparison between the background estimated in a period of solar maximum and in a period of solar minimum is described in Section 4.2. In Section 5 we discuss the results of our search for undiscovered astrophysical transients. We identify 110 events with no counterpart in the GBM trigger catalog over a period of about 9 months. We report on a subset of seven events, providing lightcurves, localization and classification. Finally, in Section 6 we draw our conclusions and discuss future prospects.
2 Methodology
The background assessment problem is expressed as a supervised Machine Learning estimator, with the variables inherent to the satellite and its orbital position as inputs and the count rate observed by each detector in three different energy bands as outputs. The background estimates so obtained are compared against the actual observations using Poisson-FOCuS. The significance of the excess in the count rate observations relative to the background model is quantified in units of standard deviations and recorded as a time series. Finally, these records are searched for intervals in the observation where the excess significance exceeds a threshold over one or more detector-energy band combinations.
2.1 Background estimation
We define X as the input variables, see \(col\_sat\_pos\) and \(col\_det\_pos\) in Section 3, and Y as the output variables, see \(col\_range\) in Section 3. We suppose that a function f(X) exists which predicts Y given X, that is, the solution that minimizes L(f(x), Y) (\(\text {argmin}_f L(f(x), Y)\)), where L is the loss function that quantifies the error in the predictions. The model’s goal is to estimate a quantity F(x) such that \(f(x) \approx F(x)\) [51]. Here we are dealing with a multi-output regression: \(F: X \in \mathbb {R}^{k} \longrightarrow Y \in \mathbb {R}^{m}\), where k is the number of input features of the model and m the number of outputs.
The model employed is a feed forward neural network with 3 hidden dense layers. Each hidden layer is followed by a batch normalization layer [52] and a dropout layer [53]. The input layer has dimension \(k=60\). Each of the first two hidden layers is composed of 2048 neurons, while the third hidden layer hosts 1024 neurons. The last (output) layer has \(m=36\) neurons. Each of the output neurons is associated with a particular detector-energy combination. The probability parameter for the dropout layers is 0.02. The optimizer used is Nadam [54] with learning rate \(\eta \) varying according to (1), \(\beta _1 = 0.9\), \(\beta _2 = 0.99\) and \(\epsilon = 10^{-7}\).
We run the fitting for 64 epochs with a batch size of 2048.
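To give a sense of the size of this model, the short Python sketch below counts the weights and biases of the dense layers from the layer widths stated above. The helper name `dense_parameters` is ours, and the count deliberately leaves out the batch normalization parameters, whose number depends on the implementation.

```python
# Layer widths as stated in the text: input, three hidden layers, output.
layer_sizes = [60, 2048, 2048, 1024, 36]


def dense_parameters(sizes):
    """Weights plus biases of a chain of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Per-layer breakdown (weights + biases) and overall total.
per_layer = [n_in * n_out + n_out
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = dense_parameters(layer_sizes)
print(per_layer, total)
```

Almost all of the roughly 6.5 million dense parameters sit in the two connections between hidden layers, while the input and output layers contribute comparatively little.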
Other neural network architectures were considered during the design process. For instance, the background estimation could be approached by utilizing sequential count rates to predict future ones, i.e. employing an RNN. This approach has been discussed in the literature, such as in the work of Sadeh (2019) [30]. When training an RNN to predict background count rates, it is crucial to exclude periods that contain astrophysical transients from the training dataset. This prevents the RNN from learning the count rate dynamics in a way that would make it difficult to distinguish astrophysical transients from the actual background. To filter transients such as GRBs from the training dataset, the presence of these events should be known in advance. This implies relying on existing catalogs of transient astrophysical phenomena. We believe such an approach could result in the model inheriting the detection biases of standard strategies for GRB detection. This scenario would prove detrimental to the present work, as our goal is precisely to detect transients that may have evaded previous searches. On the other hand, our approach differs in that we utilize input features related to the satellite/detector, which should be independent of events like GRBs, to estimate the expected count rates for each detector. This "mapping" from the satellite configuration to the expected count rates is currently accomplished through the previously described FFNN, but could in principle be extended by incorporating an RNN that considers the previous satellite configurations.
In a pre-processing step, the input training dataset is standardised and filtered. Data filtering takes place in two steps in which the following data subsets are removed:
- data collected while Fermi is transiting through the high radiation environment of the South Atlantic Anomaly (SAA);
- data acquired at times in which an event of the Fermi-GBM trigger catalog occurred.
This latter choice isn’t strictly necessary, yet it is useful to better understand the neural network performance over known events. The splitting procedure divides the dataset into 75% for training and 25% for testing; 30% of the training set is further kept as a validation set. The resulting split is approximately 52% for training, 23% for validation and 25% for testing. The instances inside these sets are not sequential but rather taken at random.
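The arithmetic of the split can be checked with a minimal Python sketch. The function `split_indices` and the fixed seed are illustrative assumptions, not the code used for the paper; the sketch only shows how a 75/25 split followed by a 30% validation hold-out yields shares of 52.5%, 22.5% and 25%.

```python
import random


def split_indices(n_samples, test_frac=0.25, val_frac_of_train=0.30, seed=0):
    """Randomly split sample indices into train, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # instances are taken at random
    n_test = round(n_samples * test_frac)
    test, rest = indices[:n_test], indices[n_test:]
    n_val = round(len(rest) * val_frac_of_train)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


train, val, test = split_indices(10_000)
shares = [len(part) / 10_000 for part in (train, val, test)]
```

For 10,000 samples this gives 5250, 2250 and 2500 instances, which round to the quoted 52%, 23% and 25%.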
The purpose of our framework is to evaluate the effectiveness of our model on a known dataset, hence the choice of a loss function L which is robust against outliers is critical. The Mean Square Error loss function (MSE) is:
\[ \text {MSE} = \frac{1}{n} \sum _{i=1}^{n} (y_i - z_i)^2 \qquad (2) \]
where n is the total number of samples in the training set, i refers to the specific sample, \(y_i\) is the target value (the observed count rate) and \(z_i\) is the estimated value (the estimated count rate). MSE is very sensitive to large discrepancies between the prediction and the target value, thus it is a bad choice when outliers are present in the training dataset. We remark that the filtering of catalog events is not enough to guarantee the optimization of the background estimator when using MSE. Indeed, anomalous events which are not present in the GBM catalog may be over-fitted when minimizing MSE; these events are the actual targets of our search.
The Mean Absolute Error (MAE) loss function is less sensitive to large residuals:
\[ \text {MAE} = \frac{1}{n} \sum _{i=1}^{n} |y_i - z_i| \qquad (3) \]
where the terms are the same as in (2).
When anomalous events are included in the training dataset, the use of MAE instead of MSE can lead to a neural network less prone to overfitting, as discussed in Appendix F.1.
In the setting of multi-output regression, the overall loss \(\mathcal {L}\) is defined as the average of the MAE over the NN outputs:
\[ \mathcal {L}(Y, Z) = \frac{1}{m} \sum _{j=1}^{m} \text {MAE}(Y_j, Z_j) = \frac{1}{m\,n} \sum _{j=1}^{m} \sum _{i=1}^{n} |y_{ij} - z_{ij}| \qquad (4) \]
where j indexes a specific detector/energy range, m is the total number of detector/energy range combinations, \(X \in \mathbb {R}^{n,k}\) is the input feature matrix of the NN (samples times features), \(Z=\{F(X_i), \, i=1:n\} \in \mathbb {R}^{n,m}\) are the Neural Network outputs (estimated count rates for each detector/energy range), and \(Y \in \mathbb {R}^{n,m}\) are the observed count rates for each detector/energy range.
For evaluation purposes, the Median Absolute Error (MeAE) is employed because of its robustness against outliers:
\[ \text {MeAE} = \text {median}\left( |y_1 - z_1|, \dots , |y_n - z_n| \right) \qquad (5) \]
where the terms are the same as in (2).
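The different sensitivity of these metrics to outliers can be seen on a toy example. The Python sketch below uses invented numbers (four background-like samples and one transient-like outlier) and hypothetical helper names; it follows the definitions given above and is not taken from the paper's code.

```python
from statistics import median


def mse(y, z):
    """Mean square error between targets y and estimates z."""
    return sum((a - b) ** 2 for a, b in zip(y, z)) / len(y)


def mae(y, z):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, z)) / len(y)


def meae(y, z):
    """Median absolute error."""
    return median([abs(a - b) for a, b in zip(y, z)])


def multi_output_mae(Y, Z):
    """Average of the per-output MAE; rows are samples, columns are outputs."""
    m = len(Y[0])
    columns_y = [[row[j] for row in Y] for j in range(m)]
    columns_z = [[row[j] for row in Z] for j in range(m)]
    return sum(mae(cy, cz) for cy, cz in zip(columns_y, columns_z)) / m


# Invented count rates: the last sample mimics an unfiltered transient.
observed = [100.0, 102.0, 98.0, 101.0, 400.0]
estimated = [100.0] * 5
scores = (mse(observed, estimated), mae(observed, estimated),
          meae(observed, estimated))
```

The single outlier dominates the MSE (about 18000), inflates the MAE to 61 counts/s, and leaves the MeAE at 2 counts/s, close to the typical residual of the remaining samples.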
Fig. 1 A diagram representing the pipeline’s transient search component. Poisson-FOCuS takes two tables as input. The first input table contains the NN’s background count rate prediction, while the second reports the actual observations. The container outputs a table with the same dimensions as the inputs, with values representing statistical significance in units of standard deviations. All tables share the same dimensions and organization: columns are used to represent different combinations of detectors and energy ranges, while rows are used to represent different times. The output table is searched for time intervals in which the statistical significance exceeds the threshold value over the energy range \(50-300\) keV (r1). Then, intervals close in time and exceeding the threshold are clustered together. Finally, clustered over-threshold intervals are reported in a list.
2.2 Trigger algorithm
An efficient change-point and anomaly detection algorithm called Poisson-FOCuS (Functional Online CUSUM) [48, 49] is employed to find anomalous transients in Fermi-GBM CSPEC data—photon counts with bin-length 4.096 s, see the discussion of Section 3—relative to the NN estimates of the background.
The Poisson-FOCuS algorithm is executed sequentially over the time series of the observed count rate data and the background estimates, separately for each combination of detector and energy range. For a given detector-energy range combination with label i and a given time step t, Poisson-FOCuS outputs an estimate of the maximum significance in the observed count rate excess relative to the background, \(m_t^{(i)}\). This value is computed over an optimal time interval ending at t and starting at a past time-step \(t - d\). Crucially, the interval length d is not predetermined but rather assessed and optimized by the algorithm itself, conditionally on the observations. The significance values \(m_t^{(i)}\) are recorded, in units of standard deviations, in a table with dimensions \(M \times N\), where M equals the length of the input time series and N equals the number of detector-energy range combinations. From this table, candidate transients are extracted in two steps. The first step is to identify table slices spanning consecutive rows (time intervals) where the trigger condition is verified (e.g., the times when the significance of a detector-range combination exceeds a pre-set threshold), see Fig. 1. The second step is to cluster together segments whose start and end times are closer than a pre-defined value. The user controls the search’s output through three parameters. For the trigger condition to be verified, the significance values are required to exceed a threshold parameter T over a minimum number of detectors and energy ranges. Additionally, the user can limit the choice of the best interval to those whose length does not exceed a value \(d_{\text {max}}\) or whose average intensity, given as a multiplicative factor of the observed counts in relation to the integral of the background values, is greater than a minimum \(\mu _{\text {min}}\).
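The quantity \(m_t^{(i)}\) can be illustrated with a brute-force Python sketch that scans every interval ending at t. This is a stand-in for explanation only and not Poisson-FOCuS itself, which is designed to reach the optimal interval without testing all of them. We assume that the significance of an excess is the square root of twice the Poisson log-likelihood ratio, and the function names are hypothetical; the \(\mu _{\text {min}}\) constraint is not implemented.

```python
import math


def significance(counts, background):
    """Significance (in sigma) of `counts` in excess of `background`.

    Assumption: Poisson log-likelihood ratio against the background-only
    hypothesis, converted to standard deviations via sqrt(2 * LLR).
    """
    if counts <= background:
        return 0.0
    llr = counts * math.log(counts / background) - (counts - background)
    return math.sqrt(2.0 * llr)


def max_significance_series(counts, background, d_max=None):
    """For each time step t, maximise the significance over all intervals
    ending at t (optionally no longer than d_max bins). O(n^2) brute force."""
    series = []
    for t in range(len(counts)):
        best, x, b = 0.0, 0.0, 0.0
        for d, s in enumerate(range(t, -1, -1), start=1):
            if d_max is not None and d > d_max:
                break
            x += counts[s]       # observed counts in the interval
            b += background[s]   # expected background counts (must be > 0)
            best = max(best, significance(x, b))
        series.append(best)
    return series


# Counts per bin: flat background with a three-bin excess.
counts = [10] * 5 + [30] * 3 + [10] * 5
background = [10.0] * 13
sig = max_significance_series(counts, background)
```

In this example the significance is zero before the excess, grows while the excess accumulates, and peaks at its last bin, where the best interval is exactly the three bins of the excess.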
3 Data
The Fermi-GBM daily CSPEC data products were used for both the testing and the training of the neural network and for searching astrophysical transient events with Poisson-FOCuS. These data are photon count rates (in units of counts/s) over a duration of 4.096 s, binned over 128 logarithmically spaced energy channels spanning from \(\approx 8\) keV to \(\approx 900\) keV [3, 55]. The time resolution provided by CSPEC data is high enough to investigate long and ultra-long GRBs, yet it is too low to reliably identify short GRBs and other transients with characteristic duration shorter than a few seconds. This is unfortunate, yet justified for our use-case. Indeed, the variability of the background over time intervals of duration comparable to the duration of short GRBs is negligible, hence our method provides little benefit relative to simpler approaches such as moving average or exponential smoothing. On the other hand, an accurate description of the background becomes essential when searching for long, faint events, in particular events whose duration is comparable to that of the Fermi orbit, such as ultra-long GRBs.
We consider data from all of the Fermi-GBM’s twelve NaI detectors. Each detector is identified according to the standard GBM nomenclature (ten detectors labelled with integers ranging from 0 to 9, two detectors identified by the letters a and b). In our analysis we disregard the Fermi-GBM bismuth-germanate detectors. These instruments are in fact sensitive to energies much greater than the energies typically involved in GRB prompt emission and are mainly used for the detection and observation of phenomena different from GRBs, such as Terrestrial Gamma-Ray Flashes (TGFs) [50]. To build the target variables Y, the input CSPEC data from each detector are binned anew, this time over three coarser energy ranges (28-50 keV, 50-300 keV and 300-500 keV, see Table 1). The resulting dataset is arranged in a table with 36 columns, one for each of the 36 detector-energy combinations.
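The rebinning step can be sketched as follows. This Python example is only a schematic illustration: the channel edges are passed in by hand rather than read from a CSPEC file, channels are assigned to a range according to their geometric-mean energy (an assumption of ours), and the label r2 for the 300-500 keV range is assumed by analogy with r0 and r1.

```python
import math

# Coarse energy ranges in keV. r1 is named in the text; r2 is an assumed label.
RANGES = {"r0": (28.0, 50.0), "r1": (50.0, 300.0), "r2": (300.0, 500.0)}


def rebin(channel_rates, edges, ranges=RANGES):
    """Sum per-channel count rates into coarse energy ranges.

    `edges` holds len(channel_rates) + 1 channel boundaries in keV.
    Assumption: a channel belongs to a range when the geometric mean
    of its boundaries falls inside that range.
    """
    out = {name: 0.0 for name in ranges}
    for k, rate in enumerate(channel_rates):
        centre = math.sqrt(edges[k] * edges[k + 1])
        for name, (low, high) in ranges.items():
            if low <= centre < high:
                out[name] += rate
    return out


# Five made-up channels instead of the 128 CSPEC channels.
edges = [10.0, 30.0, 40.0, 60.0, 400.0, 600.0]
rates = [1.0, 2.0, 3.0, 4.0, 5.0]
coarse = rebin(rates, edges)
```

Channels whose centre lies outside all three ranges, such as the first one in this example, are simply dropped. Repeating the operation for the twelve NaI detectors would give the 36 target columns.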
Besides the CSPEC data product, the neural network is trained using information on the satellite geographical location and the detectors’ pointing direction, as well as a number of auxiliary features such as the Earth occultation status and the visibility of the Sun for each detector at a given time. This information is gathered from the Fermi-GBM POSHIST data products. Details of the orbital and detector features used in the training of the NN are given in Tables 2 and 3.
Sources of background for high-energy count detectors in low-Earth, near-equatorial orbits have been discussed thoroughly in the literature (see [29, 56] for discussions relevant to the present context). Our choice of the NN input features was designed to provide a sufficient description of the different background components along Fermi’s orbit. For example, the instantaneous rate of primary and secondary cosmic ray particles, a major component of the background, will change depending on the spacecraft geographical latitude and altitude, as well as the McIlwain parameter. Furthermore, the intensity of the cosmic photon background is influenced by the spacecraft’s attitude and the position of the Earth in a detector’s field of view. Leveraging information such as the pointing direction of different detectors and the spacecraft attitude can also aid the neural network in predicting the impact of point sources on the instantaneous background rate. Likewise, considering the presence of the Earth in the field of view of a detector can potentially improve the neural network’s ability to resolve the impact of components such as the albedo gamma-ray and neutron background. The rate of change of individual components is influenced by the spacecraft’s velocity in the Earth’s inertial system and the angular velocity of the spacecraft itself. Special flags were utilized to indicate the presence of the Sun in a detector’s field of view, as well as transits through the high-radiation environment of the SAA. The resulting input dataset X includes a total of 60 different features, sampled with a step length of 4.096 s.
4 Results
In this section we present the results of the background estimator and the trigger algorithm application. The open source code implementation is available on github.com/rcrupi/DeepGRB.
Fig. 2 Fermi-GBM NaI-4 detector photon count rates (crosses) in the energy range 50 - 300 keV (r1) versus the respective prediction from the Neural Network (red solid lines). The lower panels show the residuals and relative change percentage between the two quantities, with a black solid line denoting the reference of null residual. Data span 1400 s and one SAA crossing. Anomalously low count rates were observed immediately before and after the instrument’s switch-off during the SAA crossing. These values manifest as outliers above the bisector of Fig. 4.
Fig. 3 The background estimation for the n4 detector, in the energy range r1, during 21 May 2019. The Fermi-GBM count rate observations are represented over time as a black line, whereas the neural network estimation is plotted as a red solid line. The middle panel shows the residuals between the two quantities, with a black solid line denoting the reference of null residual. The lower panel shows the residuals as relative change percentage.
4.1 Background estimator performance
To show the effectiveness of this approach, an NN is trained over 7 months of data from January to July 2019. An excerpt of the resulting background estimation is presented in Figs. 2 and 3 for one detector-range combination; Fig. 3 in particular covers a day without any events in the Fermi-GBM trigger catalog. The MAE values are reported in Table 4. The energy range bins are the same as those used in Section 3 and are defined in Table 1.
One qualitative approach for assessing the quality of a background estimator is to estimate the background during an event and then see whether the residuals emerge clearly and whether the estimated dynamics are coherent before, during, and after the event. In Appendix A, the NN background prediction over a dataset comprising GRB 091024 is similar to the background of an established physical Fermi-GBM background model [29], except for detector n6 in range r0, where the NN shows a higher residual before the event begins. In Fig. 4 we plot the NN predictions against the corresponding observed values; in particular, we filtered out the data points within 150 s before and after the SAA, specifically if the satellite remains in the SAA for at least 500 s.
Transient, bright events such as GRBs result in a temporary increase of the observed count rates (see Fig. 5) and, taking place at random times and directions, are not predictable from features intrinsic to the Fermi spacecraft motion and attitude, which are the actual inputs of the NN. Hence, these events are found below the bisector (see the horizontal lines in Fig. 4), where the observed count rates exceed the NN prediction. For data points close to the bisector, the NN’s predictions closely match the observed count rates. This group constitutes the majority of the dataset and can be attributed to background sources. In contrast, a few outliers are positioned above the bisector, where the NN’s predictions exceed the observed count rates. An illustrative example of this phenomenon is provided in Fig. 2: these outliers are often encountered in close proximity to SAA transits, during which the Fermi-GBM instruments are switched off.
Fig. 4 Fermi-GBM photon count rates from the NaI-8 detector in the energy range 50 - 300 keV (r1) versus the respective prediction from the NN over the same combination of detector and energy range. Data span from 1 January 2019 to 1 July 2019. The three white lines represent the contours at 1\(\sigma \), 2\(\sigma \) and 3\(\sigma \).
Fig. 5 Observed and estimated background count rates for detector n8, energy range r1, around GRB 190507970, with the residual difference and the residual as relative change percentage. This event is visible as the seven horizontal data points on the bottom right of the ellipse in Fig. 4.
The three time periods chosen for the application of the trigger algorithm span 1 November 2010 to 19 February 2011, 1 January 2014 to 28 February 2014, and 1 March 2019 to 9 July 2019. For the sake of brevity, we will refer to these epochs as the ‘2010’, the ‘2014’ and the ‘2019’ periods. These periods are chosen to test the framework under a variety of conditions, including different levels of solar activity and potential detector degradation.
A separate NN is trained and tested for each of these periods to account for variations in background count rates over long time scales (years), which may be caused by factors such as the solar activity and detector degradation. We report the performance metrics in Table 5.
Additional details on the neural network’s performance during times of both high and low solar activity are provided in the following section, whereas Appendix F is dedicated to the choice of the NN’s hyperparameters.
4.2 Solar minima and maxima
HERMES Pathfinder will be launched in 2024, close to the next solar maximum, forecast for 2025 [57, 58]. Comparing periods of high and low solar activity is therefore of interest: it shows what background to expect and how the NN background estimation performs in each regime. The detector most sensitive to solar activity is the Sun-facing n5 [3]. We consider the background in range r0, the most sensitive to solar flares, binned over one Fermi orbital period (about 96 min) and over 16 orbital periods, for the year of the last solar maximum, 2014, and the year of the most recent minimum, 2020. Figures 12, 13, 14, 15, and 16 are obtained from the years 2014 and 2020, with a separate NN trained for each year. With one-orbit binning, the 2020 lightcurve (Fig. 12, around 240 counts/s) and the 2014 lightcurve (Fig. 14) are not directly comparable because of the high values of the latter; zooming in on the estimated background (Fig. 13), however, shows count rates around 225 counts/s. The same reasoning applies to the 16-orbit binning for 2020 (Fig. 15) and 2014 (Fig. 16). Table 6 reports the performance of the background estimation for the years 2014 and 2020.
The solar activity is known to follow a cycle of about 11 years [59]. Over periods of a few months we can assume the solar activity to be constant. References for the prediction of the solar cycle can be found in Hathaway et al. (1994) [60], Upton et al. (2018) [61], and Bhowmik et al. (2018) [59].
4.3 Transients detection
With reference to the technique described in Section 2.2, the following detection parameters were used to obtain the results discussed in this section. The trigger condition was met whenever at least one detector observed a count rate high enough for the significance to exceed a threshold \(T = 3\sigma \) in the energy range between 50 keV and 300 keV. This choice was made to ensure comparability with the approach used by the online search algorithm of Fermi-GBM and other major GRB monitoring experiments [26, 50], as well as to filter out softer events such as solar flares. Subsequent segments satisfying the trigger condition were clustered together if closer than 600 seconds, a duration large enough to capture most long GRBs and equal to the duration of the Fermi-GBM time-tagged event lightcurves [62]. The Poisson-FOCuS algorithm was executed with the parameters \(d_{\text {max}}\) and \(\mu _{\text {min}}\) set to the values 120.4 s and 1.2 s, respectively. The choice of these parameters was driven by a trade-off between the need to find most astrophysical transients in our dataset, both known and potentially unknown, and the need to minimize the rate of false detections.
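To make the trigger logic concrete, the following is a minimal sketch, not the Poisson-FOCuS implementation itself. It scans every window up to a maximum length by brute force and evaluates the Poisson log-likelihood ratio that Poisson-FOCuS maximizes, without the functional pruning that makes the real algorithm fast; it then merges triggered segments that are close in time. The function names, the `max_bins` argument standing in for \(d_{\text {max}}\), the use of \(\sqrt{2\,\text{LLR}}\) as the significance in units of \(\sigma \), and the reading of "closer than 600 seconds" as the gap between the end of one segment and the start of the next are all assumptions made for illustration.

```python
import math


def scan_excess(counts, background, threshold_sigma=3.0, max_bins=29, mu_min=1.2):
    """Return (start, end, sigma) for each bin index `end` at which some window
    ending there exceeds the threshold. Brute force, O(n * max_bins)."""
    triggers = []
    for end in range(len(counts)):
        x = b = 0.0
        best_sigma, best_start = 0.0, None
        # Grow the window backwards from `end`, at most `max_bins` bins long.
        for start in range(end, max(end - max_bins, -1), -1):
            x += counts[start]
            b += background[start]
            if b <= 0.0:
                continue
            # Intensity multiplier maximizing the likelihood, constrained to mu >= mu_min.
            mu = max(x / b, mu_min)
            llr = x * math.log(mu) - b * (mu - 1.0)
            sigma = math.sqrt(2.0 * llr) if llr > 0.0 else 0.0
            if sigma > best_sigma:
                best_sigma, best_start = sigma, start
        if best_sigma > threshold_sigma:
            triggers.append((best_start, end, best_sigma))
    return triggers


def cluster_segments(segments, max_gap=600.0):
    """Merge (start_time, end_time) segments separated by less than `max_gap` seconds."""
    clusters = []
    for start, end in sorted(segments):
        if clusters and start - clusters[-1][1] < max_gap:
            # Close enough to the previous cluster: extend it.
            clusters[-1][1] = max(clusters[-1][1], end)
        else:
            clusters.append([start, end])
    return [tuple(c) for c in clusters]
```

Here `counts` and `background` are per-bin observed and expected counts for one detector and energy range. A production search would rely on the pruned curve representation of Poisson-FOCuS rather than on this quadratic scan.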
A filter was applied to exclude data points within 150 seconds before and after an SAA transit, whenever the satellite remains in the SAA for at least 500 s. The purpose of filtering out data in proximity to the SAA is to reduce false detections. This is necessary due to various factors, including the dynamic nature of the SAA environment, even on short time-scales [63], the spacecraft’s apparent direct motion (Fermi enters the SAA at different geographic locations during each orbit), and the presence of a discontinuity in the observed data resulting from the instrument switch-off during the SAA transit. These factors make estimating a reliable background count rate near the SAA challenging, often resulting in an underestimation of the background rate and, in turn, false detections by the trigger algorithm. Through empirical analysis, we have determined that a filter duration of 150 seconds is the minimum required to ensure accurate estimation of background count rates. However, this precaution has the unfortunate consequence of preventing the detection of transients that occur during these filtered periods.
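The filter described above can be sketched as a simple mask over the bin times. This is an illustrative sketch only: the function name, the representation of SAA transits as `(entry_time, exit_time)` pairs, and the inclusive treatment of the margin boundaries are assumptions, not details taken from the analysis pipeline.

```python
def saa_margin_mask(times, saa_transits, margin=150.0, min_transit=500.0):
    """Return a list of booleans, True where the data point is kept.

    times        -- bin times in seconds (e.g. MET)
    saa_transits -- iterable of (entry_time, exit_time) pairs in the same units
    margin       -- seconds discarded before entry and after exit
    min_transit  -- only transits at least this long trigger the filter
    """
    keep = [True] * len(times)
    for entry, exit_time in saa_transits:
        if exit_time - entry < min_transit:
            continue  # short transit: no margin applied
        lo, hi = entry - margin, exit_time + margin
        for i, t in enumerate(times):
            if lo <= t <= hi:
                keep[i] = False
    return keep
```

For example, with a transit from 300 s to 850 s, every point between 150 s and 1000 s is discarded, while a 100 s transit leaves the data untouched.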
The transient search was performed over the three distinct time periods defined in the previous section. In the period between March 2019 and July 2019, a total of 100 events were identified. Of these, 74 events match the trigger time of events already in the Fermi-GBM Trigger Catalog [50], one event is due to artifacts in the dataset, and the nature of the remaining 25 events is uncertain. These results, along with those from the remaining test periods, are summarized in Table 7. Over the same period, the Fermi-GBM Burst Catalog [50] reports 96 known GRBs. Of these bursts, 15 have no counterpart in our dataset because of the clipping of data 150 s before and after an SAA transit. Of the remaining 81 bursts (65 detected and 16 undetected), 68 have a \(T_{90}\) duration longer than the bin length of our dataset (4.096 s). We correctly identified 60 of these bursts (\(88 \%\)). Finally, we detected 5 out of 13 (\(38\%\)) GRBs with a \(T_{90}\) duration shorter than the bin length of our dataset. These results are summarized in Fig. 6 and Table 8, the latter also reporting results from the other periods.
Fig. 6 GRB detection performance. Each dot represents a gamma-ray burst of the Fermi-GBM Burst Catalog discovered between March 1st and July 1st 2019, plotted over the space spanned by the GRB’s duration \(T_{90}\) and flux, the latter computed as the ratio between the catalog’s GRB fluence in the 10-1000 keV band and \(T_{90}\). Events in the shaded grey region have a \(T_{90}\) duration shorter than the bin length of the dataset tested with the present framework (4.096 s, CSPEC data). Colors identify the detection status within our search; events not identified by our method are shown in red. Missing events (no data) are due to the clipping of data 150 s before and after an SAA transit or to portions of data that could not be preprocessed
To measure the significance of the events in Table 8 the Standard Score z is computed:
\[ z = \frac{x - \mu }{\sigma } \]
where x is an observed value, \(\mu \) is the mean and \(\sigma \) the standard deviation of the distribution \(\mathcal {X}\). Since we are dealing with count rates that follow the Poisson distribution, for sufficiently high count rates we can consider \(\mu \approx \sigma ^2\). The Standard Score can then be approximated as:
\[ S \approx \frac{N - B}{\sqrt{B}} \]
where N is the observed count rate integrated over the interval between the event’s start time and end time (footnote 2) and over all triggered detectors, and B is the total count rate from the background estimated by the NN, integrated over the same interval. The Standard Score is determined independently for each energy range, giving \(S_{r0}\), \(S_{r1}\) and \(S_{r2}\). The overall consistency for the event is defined as:
5 Discussion
According to Table 4, the test set and train set MAE values agree to within \(1 \%\), indicating no over-fitting and strong generalization across energy ranges and detectors. Table 5 shows that the neural network trained on data from the 2014 period has the highest (worst) MAE, which can be attributed to the presence of strong solar activity. This is understandable since, during an activity maximum, the background particle count rate is more unpredictable due to the influence of the Sun on the local radiation environment (see Fig. 7b). Nonetheless, the MeAE shows performance similar to that of the other two periods, thanks to its robustness against outliers. On the other hand, the 2019 period has the lowest MAE, most likely due to low solar activity and low background variability.
Similar conclusions can be derived from the analysis presented in Section 4.2. The performance of the two neural networks trained on the complete data from 2014 and 2020 is nearly identical in terms of MeAE, see Table 6. This suggests a comparable central tendency of the residuals in both periods.
It is important to note that the performance results are presented for both metrics, as they provide complementary information about the algorithm’s performance. During the 2014 period, which had a solar maximum, the MAE was significantly larger than the MeAE due to the inclusion of 71 transient events not found in the Fermi-GBM trigger catalog (see Table 7). These events, some of which are likely of solar origin, affected the MAE more than the MeAE. However, it is important to note that a low MeAE does not guarantee a perfect background estimation, as indicated by a high number of false detections in 2014 (Table 7). Factors such as the inclusion of luminous solar transients and the reduced training dataset length can contribute to background estimation issues during this period.
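The different sensitivity of the two metrics to bright transients can be illustrated with a toy example. The numbers below are invented for illustration and are not taken from the dataset: eight bins follow a background prediction of 225 counts/s closely, while two bins contain a bright transient.

```python
import statistics


def mean_absolute_error(observed, predicted):
    """Mean of the absolute residuals (MAE)."""
    return statistics.fmean(abs(o - p) for o, p in zip(observed, predicted))


def median_absolute_error(observed, predicted):
    """Median of the absolute residuals (MeAE)."""
    return statistics.median(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical count rates: the last two bins contain a bright transient.
predicted = [225.0] * 10
observed = [224.0, 226.0, 223.0, 227.0, 225.0, 226.0, 224.0, 225.0, 600.0, 650.0]

mae = mean_absolute_error(observed, predicted)     # 80.8 counts/s
meae = median_absolute_error(observed, predicted)  # 1.0 counts/s
```

Two outlying bins are enough to raise the MAE to about 80 counts/s, while the MeAE stays at 1 count/s. Reporting both metrics therefore separates the typical residual from the effect of rare, large excursions.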
In Fig. 4, most of the data points are distributed along the plot bisector \(y = x\), indicating that the neural network prediction is most often in agreement with the actual observations. Above the bisector, the expected count rates are higher than those actually observed. Spot checks show that outliers in this region correspond to anomalously low values in the observed count rates. Most of these outliers are encountered in immediate proximity to SAA transits (for example, see Fig. 2), when the Fermi-GBM instruments are switched on and off.
In periods of high solar activity, Fermi-GBM data include a large number of soft transient events of solar origin; for this reason, the soft (25 - 50 keV) trigger conditions have been disabled on multiple occasions (e.g. see Table 4 in [64] for 2014). Likewise, we required that at least one detector be over threshold in the energy band between 50 and 300 keV for the trigger condition to be satisfied. Still, Table 7 shows a higher number of total events for the 2014 period. The majority of these events are most likely associated with solar flares; indeed, 50 of the 81 events in the GBM trigger catalog for this period are solar flares, and the majority of the events we find with no counterpart in the Fermi-GBM trigger catalog are triggered on Sun-facing detectors (n0, n1, n2, n3, n4, n5). False detections may be caused by artifacts in the background estimation. These are generally easy to identify: most of the time they take the form of sudden steps in the background estimate, occurring simultaneously over all detector/range combinations. One such event is shown in Fig. 7a. This behaviour is less frequent in the other two periods analyzed, indicating that a noisy background degrades performance (see the MAE) and therefore leads to more false positives. This issue should be investigated in future work, for instance by integrating explainability techniques into the NN or by implementing different architectures, such as RNNs.
Fig. 7 Photon count rates from each triggered detector are plotted with step lines, across three energy bands spanning \(28-50\) keV, \(50-300\) keV and \(300-500\) keV (Table 1), with a resolution of 4.096 s. The neural network’s prediction of background count rates is represented by solid lines. Different detectors are identified using different colors. A red shaded area delimits Poisson-FOCuS’s best guess of the transient duration. Times are expressed in units of seconds according to Fermi’s standard mission elapsed time (MET). (a) Example of a false detection in which all the detectors are triggered by an inaccuracy in the neural network estimate. (b) Example of a solar flare in the Fermi-GBM catalog detected by our approach. The event start and end MET, as reported in the Fermi-GBM trigger catalog, are represented by a grey shaded area
To further investigate the detected and undetected GRBs, we plot the flux (total fluence divided by \(T_{90}\)) versus \(T_{90}\) for our triggered events in Fig. 6. The red points are events reported in the Fermi-GBM catalog but undetected by our method. The Fermi-GBM events with a duration shorter than our time binning (4.096 s) often go undetected in our analysis because the binning is too coarse. We also miss a few longer events with low count rates. Reducing the time binning by using data with higher time resolution, such as CTIME or TTE, could help to capture shorter and fainter events. Despite the unfavorable time binning of 4.096 s, we recovered \(\ge 75\%\) of the GRBs with \(T_{90}\) greater than 4.096 s, see Table 8.
We also detect many events not present in the Fermi-GBM catalog, and we use the methodology outlined in Kommers et al. (1999) [26] to characterize these transients. More specifically, we classify events as:
- Solar flare (SF) when the majority of the count rates are in the low-energy range and the Sun is in the field of view of the triggered detectors.
- Terrestrial Gamma-ray Flash (TGF) when most of the count rates are in the high-energy range and the event’s source reaches the detector from the Earth’s horizon.
- Gamma-ray burst (GRB) when most of the count rates are in the \(50-300\) keV energy range, and the source direction is not occulted by the Earth and is distant from both the Sun and the galactic plane.
- Galactic X-ray flash (GF) when the source direction is compatible with that of the galactic plane.
- Uncertain (UNC) in all other cases.
To determine the source direction, we employ a simple method based on the evaluation of the pointing and the relative photon count rate of the detectors. Further details can be found in Appendix B.
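The classification rules listed above can be written as a small decision function. The sketch below is illustrative only: the `Event` fields, the 50% threshold used to interpret "majority" and "most", and the order in which the rules are tested are assumptions, since the text does not specify how overlapping criteria are resolved. The geometric flags are taken as given inputs; how they are derived from the localization is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical summary of a triggered event (field names are illustrative)."""
    excess_r0: float            # background-subtracted counts, 28-50 keV
    excess_r1: float            # background-subtracted counts, 50-300 keV
    excess_r2: float            # background-subtracted counts, 300-500 keV
    sun_in_fov: bool            # Sun in the field of view of the triggered detectors
    from_earth_horizon: bool    # source direction compatible with the Earth's horizon
    earth_occulted: bool        # source direction occulted by the Earth
    near_sun: bool              # source direction close to the Sun
    near_galactic_plane: bool   # source direction compatible with the galactic plane


def classify(ev: Event) -> str:
    """Return one of 'SF', 'TGF', 'GRB', 'GF', 'UNC'."""
    total = ev.excess_r0 + ev.excess_r1 + ev.excess_r2
    if total <= 0:
        return "UNC"
    f0, f1, f2 = ev.excess_r0 / total, ev.excess_r1 / total, ev.excess_r2 / total
    if f0 > 0.5 and ev.sun_in_fov:
        return "SF"
    if f2 > 0.5 and ev.from_earth_horizon:
        return "TGF"
    if f1 > 0.5 and not (ev.earth_occulted or ev.near_sun or ev.near_galactic_plane):
        return "GRB"
    if ev.near_galactic_plane:
        return "GF"
    return "UNC"
```

In practice several events fall between categories, which is why some of the events discussed below carry more than one tentative label.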
Two classes of transient events are discussed further in this section: events already classified as GRBs in the Fermi-GBM trigger catalog, and events not present in the Fermi-GBM catalog but classified by us as candidate GRBs. In Table 9 we report six more events that have no catalog counterpart, suggesting for each one or more of the previously mentioned categories. All these events are hand-picked from the unknown events in Table 10.
5.1 GRB 190320A
At 01:14:16 UTC on March 20, 2019, the long GRB 190320052 triggered the Fermi-GBM on-board trigger algorithm on detectors n6 and n9. The estimated \(T_{90}\) duration is 43 s, with the highest emission component in the 50-300 keV band. In our analysis, the detectors n6, n7, n8, n9 and na all exceeded a \(3.0 \sigma \) significance threshold during the event (Fig. 8), with a resulting consistency greater than 10 in energy range r1 and of 5.74 in r2. The background estimate is comparable to a second-order polynomial fit in the soft energy range and to a first-order polynomial fit in the 50-300 keV energy range.
Fig. 8 The Fermi-GBM catalog GRB190320, as detected by our method. Photon count rates from each triggered detector are plotted with step lines, across three energy bands spanning \(28-50\) keV, \(50-300\) keV and \(300-500\) keV (Table 1), with a resolution of 4.096 s. The neural network’s prediction of background count rates is represented by solid lines. Different detectors are identified using different colors. The GRB start and end MET, as reported in the Fermi-GBM burst catalog, are represented by a grey shaded area. A red shaded area delimits Poisson-FOCuS’s best guess of the transient duration. Times are expressed in units of seconds according to Fermi’s standard mission elapsed time (MET)
5.2 Event 190420939
Figure 9 shows an event not present in the GBM trigger catalog, similar to GRB190320052 but with a higher low-energy count rate. The event triggered detectors n6, n7, n8, na and nb in the low energy band, with a consistency greater than 10. Two detectors provided a trigger in the 50-300 keV energy band, with a consistency of 8.4.
Fig. 9 The 190420939 transient event, with no direct counterpart in the Fermi-GBM trigger catalog. The event was classified as a candidate gamma-ray burst, according to the discussion presented in Section 5. For the corresponding localization see Fig. 10. Photon count rates from each triggered detector are plotted with step lines, across three energy bands spanning \(28-50\) keV, \(50-300\) keV and \(300-500\) keV (Table 1), with a resolution of 4.096 s. The neural network’s prediction of background count rates is represented by solid lines. Different detectors are identified using different colors. A red shaded area delimits Poisson-FOCuS’s best guess of the transient duration. Times are expressed in units of seconds according to Fermi’s standard mission elapsed time (MET)
The localization estimate in Fig. 10 shows that the event is far from the galactic plane, the Sun, and the Earth’s horizon. Taken together, this information suggests that the event could be a long, soft GRB. The localization algorithm used is described in detail in Appendix B.
5.3 Interesting events
We list in Table 9 a selection of interesting events, including the one already discussed, which are not present in the GBM catalog and which deserve further analysis. Appendix D presents the plots associated with these events. Events 1 and 2 are classified as Solar Flares because their location is close to the Sun and the majority of the detectors are triggered in the energy range r0. Because event 3 is far from the Sun yet close to the galactic plane and the Earth’s horizon, it might be a Galactic X-ray flash or a Terrestrial Gamma-ray Flash. Events 4 and 6 are categorized as GRBs for the same reasons as event 5; however, given the proximity to the galactic plane, event 6 might be a Galactic X-ray flash. Finally, in event 7, nine detectors with roughly equal intensities are triggered, suggesting that this event is likely due to local particles. This is further supported by the satellite’s position at high geomagnetic latitude (Fig. 23), which is highly correlated with the occurrence of charged particle events [50]; as a result, the event is classified as uncertain.
It is worth noting that GRB 190404B, reported in a GCN Circular by the Monitor of All-sky X-ray Image (MAXI) (footnote 3), has location \((\text {RA} = 221^{\circ }, \text {Dec} = -22^{\circ })\), which is similar to that of event 4, and trigger time 2019/04/04 13:14:34.00 UTC, which is six minutes after event 4.
The complete catalog of unknown and known events for the three time periods analyzed can be found in the Appendix, in Tables 10 and 11, respectively. The events are reported with the trigger time, the duration, the triggered detectors, the Standard Score for each energy range, and a significance classification. Unknown events were assigned tentative transient classes using the methodology described in this section.
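As an illustration of how a per-range Standard Score of the kind reported in these tables can be computed, the sketch below assumes the usual Poisson approximation in which the variance equals the mean, so that the score reduces to \((N - B)/\sqrt{B}\), with N the integrated observed counts and B the integrated background estimate. The function names and the example numbers are hypothetical.

```python
import math


def standard_score(observed, background):
    """Approximate Standard Score (N - B) / sqrt(B) for Poisson-distributed counts."""
    if background <= 0:
        raise ValueError("background estimate must be positive")
    return (observed - background) / math.sqrt(background)


def scores_by_range(observed_by_range, background_by_range):
    """Compute the score independently for each energy range label (e.g. 'r0', 'r1', 'r2')."""
    return {
        label: standard_score(observed_by_range[label], background_by_range[label])
        for label in observed_by_range
    }


# Hypothetical integrated counts over an event, summed over the triggered detectors.
observed = {"r0": 1000.0, "r1": 1200.0, "r2": 410.0}
background = {"r0": 1000.0, "r1": 900.0, "r2": 400.0}
scores = scores_by_range(observed, background)
# scores == {"r0": 0.0, "r1": 10.0, "r2": 0.5}
```

In this toy case only the 50-300 keV range shows a significant excess, which is the pattern expected for a GRB candidate under the classification scheme above.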
6 Conclusion
A novel method for the detection of high-energy transient events is presented, combining the accurate background estimation of a NN with an efficient trigger algorithm. The method has been designed to be applied to HERMES Pathfinder data, but it can be extended to analyze data from other space-based, high-energy missions, and we have presented here an application using Fermi-GBM data. The first step is to estimate the background count rate with a NN, using satellite data of the kind that could be used to build a physical background model. The accuracy of the background estimate is measured using the Mean Absolute Error and the Median Absolute Error. An experiment is carried out to assess the robustness of the background estimator during periods of solar maximum (2014) and solar minimum (2020), demonstrating that the background estimation is stable enough to have comparable performance in both periods. Because HERMES Pathfinder will be deployed near the next solar maximum, a scenario of expected count rates is provided in Section 4. The background is then used by Poisson-FOCuS, an evolution of the CUSUM algorithm, to efficiently detect the transient events. This method is tested using three periods of Fermi-GBM data with a time binning of about 4 s. We provide statistics on known and unknown transients in the GBM catalog. We show that with our method we are able to recover known events longer than 4 s and to select events not included in the Fermi-GBM catalog. Seven of the unknown events are discussed in detail. In the 9 months of data analyzed, we did not detect any ultra-long GRB. However, we did identify candidate long GRBs, some of which exhibited softer spectra compared to typical GRBs. Based on these findings, the next logical step is to apply the framework to the complete 15 years of Fermi data. In future work, our focus will be on improving the prediction capabilities of the neural network.
We plan to explore the use of Recurrent Neural Networks (RNNs) and expand the training dataset to achieve a smoother signal, particularly in regions affected by data clipping, such as the South Atlantic Anomaly (SAA). By reducing the time binning, we aim to enhance the detection of shorter events with higher precision. Additionally, we will integrate explainability methods into our framework to enable users to understand and interpret specific predictions made by the neural network. This will provide insights into the underlying factors and features that contribute to the predictions. It will also facilitate the debugging process, enabling us to identify and address any issues or biases within the neural network.
Notes
To avoid noise count rates and calculate the significance around the event’s peak, only count rates greater than a quantile-based threshold were included in the integral.
References
Gehrels, N., Chincarini, G., Giommi, P.E., Mason, K., Nousek, J., Wells, A., White, N., Barthelmy, S., Burrows, D., Cominsky, L., et al.: The swift gamma-ray burst mission. Astrophys. J. 611(2), 1005 (2004)
Winkler, C., Di Cocco, G., Gehrels, N., Giménez, A., Grebenev, S., Hermsen, W., Mas-Hesse, J., Lebrun, F., Lund, N., Palumbo, G., et al.: The integral mission. Astron. Astrophys. 411(1), 1–6 (2003)
Meegan, C., Lichti, G., Bhat, P., Bissaldi, E., Briggs, M.S., Connaughton, V., Diehl, R., Fishman, G., Greiner, J., Hoover, A.S., et al.: The fermi gamma-ray burst monitor. Astrophys. J. 702(1), 791 (2009)
Tavani, M., Barbiellini, G., Argan, A., Boffelli, F., Bulgarelli, A., Caraveo, P., Cattaneo, P., Chen, A., Cocco, V., Costa, E., et al.: The agile mission. Astron. Astrophys. 502(3), 995–1013 (2009)
Woosley, S.E.: Gamma-ray bursts from stellar mass accretion disks around black holes. Astrophys. J. 405, 273–277 (1993)
Woosley, S., Bloom, J.: The supernova-gamma-ray burst connection. Annu. Rev. Astron. Astrophys. 44, 507–556 (2006)
Berger, E.: Short-duration gamma-ray bursts. Annu. Rev. Astron. Astrophys. 52, 43–105 (2014)
Granot, J., Guetta, D., Gill, R.: Lessons from the short grb 170817a: the first gravitational-wave detection of a binary neutron star merger. Astrophys. J. Lett. 850(2), 24 (2017)
Pian, E.: Mergers of binary neutron star systems: a multimessenger revolution. Front. Astron. Space Sci. 7, 609460 (2021)
Stratta, G., Ciolfi, R., Amati, L., Bozzo, E., Ghirlanda, G., Maiorano, E., Nicastro, L., Rossi, A., Vinciguerra, S., Frontera, F., et al.: Theseus: A key space mission concept for multi-messenger astrophysics. Adv. Space Res. 62(3), 662–682 (2018)
Bernardini, M.G., Cordier, B., Wei, J.: The svom mission. Galaxies 9(4), 113 (2021)
e-ASTROGAM Collaboration, De Angelis, A., Tatischeff, V., Tavani, M., Oberlack, U., Grenier, I., Hanlon, L., Walter, R., Argan, A., von Ballmoos, P., et al.: The e-astrogam mission: Exploring the extreme universe with gamma rays in the mev-gev range. Exp. Astron. 44, 25–82 (2017)
Caputo, R., Perkins, J., Racusin, J., Ajello, M., Kierans, C., Fleischhack, H., Negro, M., Zhang, H., Venters, T., Cannady, N., et al.: Amego-x mission overview. AAS/High Energy Astrophys. Div. 54(3), 404–03 (2022)
Bloser, P.F., Murphy, D., Fiore, F., Perkins, J.: Cubesats for gamma-ray astronomy [book chapter]. Technical report, Los Alamos National Lab.(LANL), Los Alamos, NM (United States) (2022)
Fiore, F., Werner, N., Behar, E.: Distributed architectures and constellations for \(\gamma \)-ray burst science. Galaxies 9(4), 120 (2021)
Fiore, F., Burderi, L., Lavagna, M., Bertacin, R., Evangelista, Y., Campana, R., Fuschino, F., Lunghi, P., Monge, A., Negri, B., et al.: The hermes-technologic and scientific pathfinder. In: Space Telescopes and Instrumentation 2020: Ultraviolet to Gamma Ray, vol. 11444, pp. 214-228 (2020). SPIE
Fiore, F., Werner, N., Behar, E.: Distributed Architectures and Constellations for \(\gamma \)-ray Burst Science. Galaxies 9(4), 120 (2021). https://doi.org/10.3390/galaxies9040120. arXiv:2112.08982
Auchettl, K., Trenti, M., Thomas, M., Fiore, F.: The spirit mission: Multiwavelength detection and follow-up of cosmic explosions with an Australian space telescope. AAS/High Energy Astrophys. Div. 54(3), 305–02 (2022)
Thomas, M., Trenti, M., Sanna, A., Campana, R., Ghirlanda, G., Řípa, J., Burderi, L., Fiore, F., Evangelista, Y., Amati, L., et al.: Localisation of gamma-ray bursts from the combined spirit+ hermes-tp/sp nano-satellite constellation. Publ. Astron. Soc. Aust. 40, 008 (2023)
Fuschino, F., Campana, R., Labanti, C., Evangelista, Y., Feroci, M., Burderi, L., Fiore, F., Ambrosino, F., Baldazzi, G., Bellutti, P., et al.: Hermes: An ultra-wide band x and gamma-ray transient monitor on board a nano-satellite constellation. Nucl. Instrum. Meth. Phys. Res. Sect. A: Accel Spectrom. Detectors Assoc. Equip. 936, 199–203 (2019)
Evangelista, Y., Fiore, F., Fuschino, F., Campana, R., Ceraudo, F., Demenev, E., Guzman, A., Labanti, C., La Rosa, G., Fiorini, M., et al.: The scientific payload on-board the hermes-tp and hermes-sp cubesat missions. In: Space Telescopes and Instrumentation 2020: Ultraviolet to Gamma Ray, vol. 11444, p. 114441 (2020). International Society for Optics and Photonics
Fiore, F., Guzman, A., Campana, R., Evangelista, Y.: HERMES-Pathfinder. 2210-13842 (2022). https://doi.org/10.48550/arXiv.2210.13842. arXiv:2210.13842
Evangelista, Y., Fiore, F., Campana, R., Ceraudo, F., Della Casa, G., Demenev, E., Dilillo, G., Fiorini, M., Grassi, M., Guzman, A., Hedderman, P., Marchesini, E.J., Morgante, G., Mele, F., Nogara, P., Nuti, A., Piazzolla, R., Pliego Caballero, S., Rashevskaya, I., Russo, F., Sottile, G., Labanti, C., Baroni, G., Bellutti, P., Bertuccio, G., Cao, J., Chen, T., Dedolli, I., Feroci, M., Fuschino, F., Gandola, M., Gao, N., Ficorella, F., Malcovati, P., Picciotto, A., Rachevski, A., Santangelo, A., Tenzer, C., Vacchi, A., Wang, L., Xu, Y., Zampa, G., Zampa, N., Zorzi, N.: Design, integration, and test of the scientific payloads on-board the HERMES constellation and the SpIRIT mission. In: den Herder, J.-W.A., Nikzad, S., Nakazawa, K. (eds.) Space Telescopes and Instrumentation 2022: Ultraviolet to Gamma Ray. Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, vol. 12181, p. 121811 (2022). https://doi.org/10.1117/12.2628978
Paciesas, W.S., Meegan, C.A., Pendleton, G.N., Briggs, M.S., Kouveliotou, C., Koshut, T.M., Lestrade, J.P., McCollough, M.L., Brainerd, J.J., Hakkila, J., et al.: The fourth batse gamma-ray burst catalog (revised). Astrophys. J. Suppl. Ser. 122(2), 465 (1999)
Feroci, M., Frontera, F., Costa, E., Dal Fiume, D., Amati, L., Bruca, L., Cinti, M.N., Coletta, A., Collina, P., Guidorzi, C., et al.: In-flight performances of the bepposax gamma-ray burst monitor. In: EUV, X-Ray, and Gamma-Ray Instrumentation for Astronomy VIII, vol. 3114, pp. 186-197 (1997). SPIE
Kommers, J.M.: Faint gamma-ray bursts and other high-energy transients detected with batse. PhD thesis, Massachusetts Institute of Technology (1999)
Kocevski, D., Burns, E., Goldstein, A., Dal Canton, T., Briggs, M., Blackburn, L., Veres, P., Hui, C., Hamburg, R., Roberts, O., et al.: Analysis of sub-threshold short gamma-ray bursts in fermi gbm data. Astrophys. J. 862(2), 152 (2018)
Hui, C., Briggs, M., Veres, P., Hamburg, R.: Finding untriggered gamma-ray transients in the fermi gbm data. In: Proceedings of the 7th International Fermi Symposium, p. 129 (2017)
Biltzinger, B., Kunzweiler, F., Greiner, J., Toelge, K., Burgess, J.M.: A physical background model for the fermi gamma-ray burst monitor. Astron. Astrophys. 640, 8 (2020)
Sadeh, I.: Deep learning detection of transients. (2019). arXiv:1902.03620
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
Bishop, C.M.: Neural networks and their applications. Rev. Sci. Instrum. 65(6), 1803–1832 (1994)
Bebis, G., Georgiopoulos, M.: Feed-forward neural networks. IEEE Potentials 13(4), 27–31 (1994)
Levan, A., Tanvir, N.R., Starling, R., Wiersema, K., Page, K., Perley, D., Schulze, S., Wynn, G.A., Chornock, R., Hjorth, J., et al.: A new population of ultra-long duration gamma-ray bursts. Astrophys. J. 781(1), 13 (2013)
Gendre, B., Joyce, Q., Orange, N., Stratta, G., Atteia, J., Boër, M.: Can we quickly flag ultra-long gamma-ray bursts? Mon. Not. R. Astron. Soc. 486(2), 2471–2476 (2019)
Dagoneau, N., Schanne, S., Atteia, J.-L., Götz, D., Cordier, B.: Ultra-long gamma-ray bursts detection with svom/eclairs. Experimental Astronomy 50(1), 91–123 (2020)
Boer, M., Gendre, B., Stratta, G.: Are ultra-long gamma-ray bursts different? Astrophys. J. 800(1), 16 (2015)
Gendre, B., Stratta, G., Atteia, J., Basa, S., Boër, M., Coward, D., Cutini, S., d’Elia, V., Howell, E., Klotz, A., et al.: The ultra-long gamma-ray burst 111209a: the collapse of a blue supergiant? Astrophys. J. 766(1), 30 (2013)
Stratta, G., Gendre, B., Atteia, J., Boër, M., Coward, D., De Pasquale, M., Howell, E., Klotz, A., Oates, S., Piro, L.: The ultra-long grb 111209a. ii. prompt to afterglow and afterglow properties. Astrophys. J. 779(1), 66 (2013)
Zou, L., Zhou, Z.-M., Xie, L., Zhang, L.-L., Lü, H.-J., Zhong, S.-Q., Wang, Z.-J., Liang, E.-W.: Magnetar as central engine of gamma-ray bursts: Central engine-jet connection, wind-jet energy partition, and origin of some ultra-long bursts. Astrophys. J. 877(2), 153 (2019)
Gompertz, B., Fruchter, A.: Magnetars in ultra-long gamma-ray bursts and grb 111209a. Astrophys. J. 839(1), 49 (2017)
Gruber, D., Krühler, T., Foley, S., Nardini, M., Burlon, D., Rau, A., Bissaldi, E., Von Kienlin, A., McBreen, S., Greiner, J., et al.: Fermi/gbm observations of the ultra-long grb 091024-a burst with an optical flash. Astron. Astrophys. 528, 15 (2011)
Bissaldi, E., von Kienlin, A., Lichti, G., Steinle, H., Bhat, P.N., Briggs, M.S., Fishman, G.J., Hoover, A.S., Kippen, R.M., Krumrey, M., et al.: Ground-based calibration and characterization of the fermi gamma-ray burst monitor detectors. Exp. Astron. 24(1–3), 47–88 (2009)
Campana, R., Fuschino, F., Evangelista, Y., Dilillo, G., Fiore, F.: The hermes-tp/sp background and response simulations. In: Space Telescopes and Instrumentation 2020: Ultraviolet to Gamma Ray, vol. 11444, pp. 817-824 (2020). SPIE
Dilillo, G., Zampa, N., Campana, R., Fuschino, F., Pauletta, G., Rashevskaya, I., Ambrosino, F., Baruzzo, M., Cauz, D., Cirrincione, D., et al.: Space applications of gagg: Ce scintillators: a study of afterglow emission by proton irradiation. Nuclear Instrum. Meth. Phys. Res. Sect. B: Beam Interact. Mater. Atoms 513, 33–43 (2022)
Fitzpatrick, G., McBreen, S., Connaughton, V., Briggs, M.: Background estimation in a wide-field background-limited instrument such as fermi gbm. In: Space Telescopes and Instrumentation 2012: Ultraviolet to Gamma Ray, vol. 8443, pp. 965-973 (2012). SPIE
Goldstein, A., Cleveland, W.H., Kocevski, D.: Fermi GBM Data Tools: v1.1.0 (2021). https://fermi.gsfc.nasa.gov/ssc/data/analysis/gbm
Ward, K., Dilillo, G., Eckley, I., Fearnhead, P.: Poisson-focus: An efficient online method for detecting count bursts with application to gamma ray burst detection. (2022). arXiv:2208.01494
Romano, G., Eckley, I.A., Fearnhead, P., Rigaill, G.: Fast online changepoint detection via functional pruning cusum statistics. J. Mach. Learn. Res. 24, 1–36 (2023)
Von Kienlin, A., Meegan, C., Paciesas, W., Bhat, P., Bissaldi, E., Briggs, M., Burns, E., Cleveland, W., Gibby, M., Giles, M., et al.: The fourth fermi-gbm gamma-ray burst catalog: A decade of data. Astrophys. J. 893(1), 46 (2020)
Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, New York (2009)
Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448-456 (2015). PMLR
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(56), 1929–1958 (2014)
Ruder, S.: An overview of gradient descent optimization algorithms. CoRR (2016). arXiv:1609.04747
Fermi Science Support Center (FSSC): Fermi Gamma-Ray Space Telescope Project, Science Data Products File Format Document (FFD) GLAST-GS-DOC-0001. (2019). https://fermi.gsfc.nasa.gov/ssc/library/support/Science_DP_FFD_RevA.pdf
Campana, R., Feroci, M., Del Monte, E., Mineo, T., Lund, N., Fraser, G.W.: Background simulations for the large area detector onboard loft. Exp. Astron. 36, 451–477 (2013)
Space Weather Prediction Center, National Oceanic and Atmospheric Administration: Solar Cycle 25 Forecast Update. https://www.swpc.noaa.gov/news/solar-cycle-25-forecast-update. Accessed 09 Dec. 2019
Biesecker, D.A., Upton, L.: Solar cycle 25 consensus prediction update. In: AGU Fall Meeting Abstracts, vol. 2019, pp. 13-03 (2019)
Bhowmik, P., Nandy, D.: Prediction of the strength and timing of sunspot cycle 25 reveal decadal-scale space environmental conditions. Nat. Commun. 9(1), 1–10 (2018)
Hathaway, D.H., Wilson, R.M., Reichmann, E.J.: The shape of the sunspot cycle. Sol. Phys. 151(1), 177–190 (1994)
Upton, L.A., Hathaway, D.H.: An updated solar cycle 25 prediction with AFT: The modern minimum. Geophys. Res. Lett. 45(16), 8091–8095 (2018)
Von Kienlin, A., Meegan, C.A., Paciesas, W.S., Bhat, P., Bissaldi, E., Briggs, M.S., Burgess, J.M., Byrne, D., Chaplin, V., Cleveland, W., et al.: The second Fermi GBM gamma-ray burst catalog: the first four years. Astrophys. J. Suppl. Ser. 211(1), 13 (2014)
Zou, H., Li, C., Zong, Q., Parks, G.K., Pu, Z., Chen, H., Xie, L., Zhang, X.: Short-term variations of the inner radiation belt in the South Atlantic Anomaly. J. Geophys. Res. Space Phys. 120(6), 4475–4486 (2015)
Bhat, P.N., Meegan, C.A., Von Kienlin, A., Paciesas, W.S., Briggs, M.S., Burgess, J.M., Burns, E., Chaplin, V., Cleveland, W.H., Collazzi, A.C., et al.: The third Fermi GBM gamma-ray burst catalog: the first six years. Astrophys. J. Suppl. Ser. 223(2), 28 (2016)
Goldstein, A., Fletcher, C., Veres, P., Briggs, M.S., Cleveland, W.H., Gibby, M.H., Hui, C.M., Bissaldi, E., Burns, E., Hamburg, R., et al.: Evaluation of automated Fermi GBM localizations of gamma-ray bursts. Astrophys. J. 895(1), 40 (2020)
Acknowledgements
We thank Daniela Cirrincione, Giovanni Della Casa, Simone Monzani and Nicola Zampa for their support and constructive feedback during the HERMES-Udine meetings. We thank the anonymous reviewers for their insightful criticism, which has helped us improve the quality and clarity of our paper. Special thanks to Daniele Regoli for the useful recommendations on the introduction and method sections.
Funding
Open access funding provided by Istituto Nazionale di Astrofisica within the CRUI-CARE Agreement. This research acknowledges support from the European Union Horizon 2018 and 2020 Research and Innovation Framework Programme under grant agreements HERMES-Scientific Pathfinder n. 821896 and AHEAD2020 n. 871158, and from ASI INAF Accordo Attuativo n. 2018-10-HH.1.2020 HERMES-Technologic Pathfinder Attivita’ scientifiche and Accordo Attuativo INAF-ASI 2022-XX-HH.0 "HERMES Pathfinder - Operazioni e sfruttamento scientifico". We also acknowledge the support of the INAF RSN-5 mini-grant 1.05.12.04.05, "Development of novel algorithms for detecting high-energy transient events in astronomical time series".
Author information
Contributions
R.C. and G.D. conceived the main ideas of the paper with the help of F.F. and A.V.; R.C., K.W. and G.D. developed the methods; R.C. and G.D. carried out the experiments; R.C. and G.D. wrote the manuscript with the help of F.F. and A.V.; E.B. checked and reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A: Background estimate for GRB 091024
To demonstrate the potential of the background estimator in the presence of a long event, we estimate the background over a period containing the ultra-long GRB 091024 [42], for which a similar evaluation is provided in [29].
Figure 11 shows detectors n0, n6 and n8 in the three energy bands specified in Table 1. The data and the background estimate of a Neural Network trained and tested over a three-month period, from September 1 to November 30, 2009, are shown in black and red, respectively. The dataset consists of 1.63 million samples, and the hyperparameters are the same as those used in Section 2.1 except for the learning rate.
The event emerges clearly from the residuals of all the detectors in ranges r1 and r2; in detector n0, range r2, a peak is still visible that probably belongs to the end of the event. In detector n6, range r0, in the first part of the time series before the peaks of the event, the background estimate lies below the observed data. This could be due to a training period that is too short, non-optimal parameter settings of the NN, a different event such as local particles or, more interestingly, the first part of the GRB, where the photon count rates were too low to be detected given the background variability.
Observed and background estimate count rates around the event GRB 091024. From left to right the plots refer respectively to range r0, r1, r2, from top to bottom the plots refer respectively to detectors n0, n6, n8. This figure can be compared with the background estimation around the GBM trigger time of GRB 091024 in Biltzinger et al. (2020) [29]
Appendix B: Localization
The standard reference for the localization of events found by GBM is [65]. In this work the localization relies on simple geometric reasoning; in the future we hope to use a more sophisticated localization algorithm. The loss function is minimised with a particle swarm optimiser (Footnote 4).
Consider two directions in equatorial coordinates, \(\psi _d = (ra_d, dec_d)\) and \(\psi _s = (ra_s, dec_s)\), respectively the pointing of a detector and the location of the event source. The incident intensity is modeled as the cosine of the angle between them, \(cos(\psi _d, \psi _s)\).
If the angle of incidence is greater than \(\pi /2\), the incident intensity is set to 0; the resulting expression is (B1).
The loss to optimise is given in (B2), where i indexes one of the D detectors (12 in our case). The energy range chosen is the one with the largest residuals among all detector and energy-range combinations; the count rates at the timestamp of the maximum residual are then passed to the loss (B2), which is minimised,
where \(\psi _s\) and \(counts_s\) are the unknown variables.
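The geometric reasoning above can be illustrated with a short sketch. The equations (B1) and (B2) are not reproduced in this text, so several choices below are assumptions rather than the authors' implementation: the cosine is computed with the standard spherical law of cosines, the loss is taken to be a sum of squared differences between the observed count rates and \(counts_s\) times the clipped cosine response, and a coarse grid search stands in for the particle swarm optimiser. The detector pointings and count rates are hypothetical and synthetic.

```python
import math

def cos_separation(ra_d, dec_d, ra_s, dec_s):
    """Cosine of the angle between two equatorial directions (radians)."""
    return (math.sin(dec_d) * math.sin(dec_s)
            + math.cos(dec_d) * math.cos(dec_s) * math.cos(ra_d - ra_s))

def incident_intensity(ra_d, dec_d, ra_s, dec_s):
    """Cosine response, set to 0 when the incidence angle exceeds pi/2."""
    return max(0.0, cos_separation(ra_d, dec_d, ra_s, dec_s))

def loss(pointings, counts, ra_s, dec_s, counts_s):
    """ASSUMED squared-error loss over the detectors (stand-in for B2)."""
    return sum((c - counts_s * incident_intensity(ra, dec, ra_s, dec_s)) ** 2
               for (ra, dec), c in zip(pointings, counts))

def best_amplitude(pointings, counts, ra_s, dec_s):
    """Least-squares counts_s for a fixed source direction."""
    resp = [incident_intensity(ra, dec, ra_s, dec_s) for ra, dec in pointings]
    denom = sum(r * r for r in resp)
    if denom == 0.0:
        return 0.0
    return max(0.0, sum(r * c for r, c in zip(resp, counts)) / denom)

def grid_search(pointings, counts, step_deg=5):
    """Coarse sky grid search; a simple stand-in for particle swarm."""
    best = None
    for ra_deg in range(0, 360, step_deg):
        for dec_deg in range(-90, 91, step_deg):
            ra, dec = math.radians(ra_deg), math.radians(dec_deg)
            amp = best_amplitude(pointings, counts, ra, dec)
            value = loss(pointings, counts, ra, dec, amp)
            if best is None or value < best[0]:
                best = (value, ra_deg, dec_deg, amp)
    return best

# Hypothetical detector pointings in degrees; NOT the real GBM geometry.
pointings_deg = [(30 * k, (-45, 0, 45)[k % 3]) for k in range(12)]
pointings = [(math.radians(ra), math.radians(dec)) for ra, dec in pointings_deg]

# Synthetic, noise-free count rates for a source at (ra, dec) = (40, 20) deg.
true_ra, true_dec, true_amp = math.radians(40), math.radians(20), 100.0
counts = [true_amp * incident_intensity(ra, dec, true_ra, true_dec)
          for ra, dec in pointings]

estimate = grid_search(pointings, counts)
print(estimate[1:])  # recovered (ra_deg, dec_deg, counts_s)
```

With real data the count rates are noisy and the minimum is broader, which is one reason a global optimiser such as particle swarm is a natural choice over a fixed grid.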
Appendix C: Solar minima and maxima figures
The background estimation in year 2014 for detector n5 (Sun-facing) in the energy range r0. The count rates are averaged over a time bin corresponding to one orbital period (96 min). A zoom-in is applied to exclude the outliers shown in Fig. 14.
Appendix D: Interesting events
Appendix E: Catalog table
Appendix F: On the hyperparameters choice of the NN
The hyperparameters and settings in Section 2.1 were chosen by trial and error, looking for neural networks that performed well over the three periods mentioned in Section 4.1. Because testing all possible combinations of hyperparameters is too time-consuming, only a limited number of combinations were tried, selected on the basis of practical experience and intuition.
For simplicity, here we report the final configuration: the first two layers had 2048 neurons each, the third layer had 1024 neurons, dropout was set to 0.02, and the learning rate (denoted as \(\eta \)) was varied according to (1), with \(\beta _1 = 0.9\), \(\beta _2 = 0.99\), and \(\epsilon = 10^{-7}\). The models were trained for 64 epochs with a batch size of 2048, early stopping was applied after 32 epochs, the loss function used was MAE, and known events were removed from the training set.
The choice of a larger number of neurons (2048) for the first layers was made to ensure minimal residual in terms of MAE. Doubling the number of neurons did not lead to a significant increase in performance.
Initially, a constant learning rate of 0.0008 was chosen. However, it was observed that especially in the first epochs, a higher learning rate was needed to reduce the loss function and avoid getting stuck in a local minimum. In the later steps, a learning rate of 0.0004 was used to reach convergence and achieve a stable loss function. The values of \(\beta _1 = 0.9\), \(\beta _2 = 0.99\), and \(\epsilon = 10^{-7}\) are similar to the default values for the optimizer of the Neural Networks.
Regarding the number of epochs, it was decided to stop training at 64 epochs because, along with a batch size of 2048, the neural network tended to achieve good and stable performance within this range.
Dropout, introduced by Srivastava et al. (2014) [53], is a technique that randomly deactivates neurons according to a user-defined probability. The dropout parameter serves as a regularizer, preventing overfitting and ensuring that the loss function does not become stuck at a high MAE. The dropout probabilities tested were 0.2, 0.02, and 0.0002. In all cases, the neural network converges to a similar MAE. However, with a dropout of 0.02, the model shows stable and good performance on the validation set as early as the 4th epoch.
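As an illustration of what the dropout probability controls, the following minimal sketch applies dropout to the output of one 2048-neuron layer. It uses the "inverted" variant common in modern frameworks, where surviving activations are rescaled during training; this is an assumption for illustration and not necessarily the variant used in this work. The function name and the constant layer output are hypothetical.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate`
    and rescale the survivors by 1 / (1 - rate), so that the expected
    value of each activation is unchanged. At inference it is a no-op."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)           # fixed seed for reproducibility
layer_output = [1.0] * 2048      # hypothetical activations of one layer
dropped = dropout(layer_output, rate=0.02, rng=rng)

n_zero = sum(1 for a in dropped if a == 0.0)
print(n_zero, "of", len(dropped), "neurons deactivated")
```

With a rate of 0.02, roughly 2% of the 2048 neurons (about 40) are deactivated on each training step, which is a light regularization compared with the 0.2 setting.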
F.1 MAE vs MSE
The choice of the appropriate loss function is indeed crucial in many applications of data science. In this case, we would like to emphasize the properties of MAE and MSE and provide a mathematical justification for their use.
It can be shown that a regressor trained with MSE approximates the conditional mean \(E(Y \mid X = x)\), while one trained with MAE approximates the conditional median \(\textit{median}(Y \mid X = x)\); see Equations 2.13 and 2.18 in Hastie et al. (2009) [51]. Therefore, when employing MAE, the estimator’s output behaves similarly to the median, making it robust against outliers. Conversely, when using MSE, the output behaves more like the mean, which is not robust against outliers.
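The mean and median behaviour can be checked numerically in the simplest setting, a constant predictor. The sketch below is an illustration only: the count rates are hypothetical (a flat background near 100 plus two flare-like samples), and the brute-force search over candidate constants replaces the gradient-based training of the actual network.

```python
import statistics

def mae(values, c):
    """Mean absolute error of the constant prediction c."""
    return sum(abs(v - c) for v in values) / len(values)

def mse(values, c):
    """Mean squared error of the constant prediction c."""
    return sum((v - c) ** 2 for v in values) / len(values)

def argmin_constant(values, loss, candidates):
    """Constant prediction that minimises `loss` over a candidate grid."""
    return min(candidates, key=lambda c: loss(values, c))

# Hypothetical count rates: background near 100 plus two flare samples.
rates = [98, 101, 99, 100, 102, 100, 97, 103, 100, 900, 1200]
candidates = [x / 10 for x in range(0, 13001)]  # 0.0 to 1300.0 in steps of 0.1

best_mae = argmin_constant(rates, mae, candidates)
best_mse = argmin_constant(rates, mse, candidates)

print("MAE minimiser:", best_mae, "median:", statistics.median(rates))
print("MSE minimiser:", best_mse, "mean:", statistics.mean(rates))
```

The MAE minimiser stays at the background level set by the median, while the MSE minimiser is pulled towards the two flare samples, far above any background value.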
In the case where the training set contains significant event count rates, such as solar flares similar to Fig. 7b, the estimator should treat these events as outliers since they do not belong to the common background dynamics. Even after normalizing the count rates by subtracting the mean and dividing by the standard deviation, the scaled dataset still retains the same outliers, preserving the proportion among the count rates.
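The statement about normalization can be verified with a small sketch. Standardization is an affine map, so it preserves both the ordering of the samples and the ratios between their differences; an outlier before scaling is still an outlier after it. The count rates below are hypothetical, and the use of the population standard deviation is an assumption made for illustration.

```python
import statistics

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical count rates: background near 100 plus two flare samples.
rates = [98, 101, 99, 100, 102, 100, 97, 103, 100, 900, 1200]
z = standardize(rates)

# Ratio of distances from the first background sample, before and after.
ratio_raw = (rates[10] - rates[0]) / (rates[9] - rates[0])
ratio_scaled = (z[10] - z[0]) / (z[9] - z[0])

print("scaled background median:", statistics.median(z))
print("scaled flare samples:", z[9], z[10])
print("distance ratios:", ratio_raw, ratio_scaled)
```

The flare samples remain far from the bulk of the scaled values, which is why the robustness of the loss function, and not the scaling, is what protects the background estimate.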
An empirical example showcasing the robustness of the method can be found in Section 4.2. Figure 24 presents two training phases including events from the Fermi-GBM catalog for (a) MAE and (b) MSE. Only the run with MAE (a) exhibits good convergence, and Fig. 25a shows an example of prediction using MAE. In (b), the training loss converges, but the validation loss is higher and noisier than the training loss, indicating poor generalization (overfitting). Figure 25b displays MSE predictions over a short period, where the approximation of the background dynamics shows artifacts and inconsistencies with respect to MAE. It is worth noting that this effect persisted even when other hyperparameters were adjusted, leading us to choose MAE as the preferred loss function.
Based on these plots, it becomes apparent why 64 epochs were sufficient for training the neural network, as the desired convergence and performance were achieved.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Crupi, R., Dilillo, G., Bissaldi, E. et al. Searching for long faint astronomical high energy transients: a data driven approach. Exp Astron 56, 421–476 (2023). https://doi.org/10.1007/s10686-023-09915-7
DOI: https://doi.org/10.1007/s10686-023-09915-7