Generalized Lomb-Scargle Analysis of 22 years of Super-Kamiokande solar ⁸B neutrino data

We apply the generalized Lomb-Scargle periodogram to 22 years of solar ⁸B neutrino flux data detected by Super-Kamiokande. The primary motivation of this work was to check whether the sinusoidal modulation at a frequency of 9.43/year (with a period of 38 days), which we had found to be marginally significant with the first five years of Super-K data, persists in the accumulated data. We use four different metrics to calculate the significance of any peaks in the Lomb-Scargle periodogram, which could be indicative of periodicities. We do not find any evidence for periodicity at the aforementioned frequency or any other frequency with the updated data. Therefore, the observed peak at 9.43/year with the first five years of Super-Kamiokande data was only a statistical fluctuation, and its significance is negligible with the updated data.


I. INTRODUCTION
For more than two decades, Sturrock and collaborators have argued that the ⁸B solar neutrino flux seen in Super-Kamiokande-I from 1996-2001 [1] contains a sinusoidal modulation at a frequency of 9.43/year, which corresponds to a period of around 38 days (see Ref. [2] and references therein). They asserted that this peak is due to the synodic rotation of the solar core, for which the sidereal rotation rate is around 10.43/year [3,4]. The first independent analysis of the Super-K data reported p-values for this peak ranging from 0.007 to 0.019, depending on the analysis technique used [5]. Another work found p-values from 0.018 to 0.12, depending on whether one incorporates the asymmetric errors or not [6]. However, these results were at odds with the analysis done by the Super-K collaboration, which did not find any evidence for periodicity at any frequency [1].
In order to investigate this issue, we applied the generalized Lomb-Scargle periodogram [7][8][9] to Super-K-I data (binned in intervals of 5 days) [10]. We then calculated the p-value using the method proposed in [8], as well as using the Bayesian information criterion (BIC) [11]. We were able to confirm the peak (previously observed in works by Sturrock and collaborators) at 9.43/year with a p-value of 0.015 (significance of 2.2σ) and a BIC value close to 5, pointing to marginal significance according to qualitative "strength of evidence" rules [11]. This peak was also confirmed in other works [12,13]. The amplitude of this modulation at a period of 38 days is equal to 6.9%. Most recently, it has been shown that the significance of this peak is due to the fortuitous alignment of six data points [14]. If these data points are excluded, the amplitude is reduced to 5.3%, thereby reducing the significance of the peak. Therefore, Ref. [14] has asserted that the aforementioned peak is not real and is only a statistical fluctuation. Nevertheless, the only way to resolve this imbroglio is to redo the same procedure with additional data.
Another reason for possible sinusoidal modulations in the solar neutrino flux could be time variations in the solar magnetic field, if the neutrino has a non-zero magnetic moment, because of Resonant Spin Flavor precession [15][16][17]. Finally, a periodic variation in the solar core temperature could also induce a periodicity in the solar neutrino fluxes [18]. Therefore, observing such a periodic modulation could enable us to gain insights into a variety of physical processes in the solar interior and also some of the fundamental properties of neutrinos.
Twenty years after the first search for periodicities, the Super-K collaboration has recently carried out another search for periodicities with data spanning around 22 years [17]. This data provides a "smoking gun" test of whether the tantalizing hints for a periodic signal at 38 days seen in the first five years of data were only a fluctuation or signatures of a real sinusoidal signal. However, no evidence for periodicity was found in the Super-K analysis. This data has also been made publicly available.
In this work, we again apply the generalized Lomb-Scargle periodogram (as in our previous work [10]) to the aforementioned Super-K data [17]. We evaluate the statistical significance using four independent methods, similar to our recent works on searching for periodicities in nuclear β-decay rates [19][20][21]. This manuscript is structured as follows. We provide a brief account of the generalized Lomb-Scargle periodogram in Sect. II. We recap the latest Super-K search for periodicity using more than 20 years of data in Sect. III. Our analysis and results can be found in Sect. IV. We conclude in Sect. V.

II. GENERALIZED LOMB-SCARGLE PERIODOGRAM
The Lomb-Scargle (L-S) [7,8,22,23,24] periodogram is a widely used robust technique to look for periodicities in unevenly sampled data. The main goal of the L-S periodogram is to determine the frequency (f) of a periodic sinusoidal signal in a time-series (y(t)) as follows:

$$y(t) = a\cos(2\pi f t) + b\sin(2\pi f t),$$

where a and b are the amplitudes of the two sinusoidal components. The L-S periodogram calculates the power as a function of frequency, from which one can assess the statistical significance for any frequency.
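To make the notion of L-S power concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the power at each trial frequency is the fractional reduction in χ² obtained when a sinusoid plus a constant offset is fitted by weighted least squares, relative to a constant-only fit. The function name `floating_mean_power`, the synthetic time series, and all numerical values are hypothetical.

```python
import numpy as np

def floating_mean_power(t, y, dy, freqs):
    """Fractional chi-square reduction of a sinusoid-plus-offset fit at each frequency."""
    w = 1.0 / dy**2
    sw = np.sqrt(w)
    # Reference model: a constant equal to the weighted mean of the data.
    chi2_ref = np.sum(w * (y - np.sum(w * y) / np.sum(w)) ** 2)
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        arg = 2.0 * np.pi * f * t
        # Design matrix for y(t) = a cos(2 pi f t) + b sin(2 pi f t) + c.
        A = np.column_stack([np.cos(arg), np.sin(arg), np.ones_like(t)])
        coeffs, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        chi2 = np.sum(w * (y - A @ coeffs) ** 2)
        power[i] = (chi2_ref - chi2) / chi2_ref
    return power

# Hypothetical unevenly sampled series with an injected 9.43/year modulation.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 5.0, 300))          # observation times in years
dy = np.full(t.size, 0.3)                         # assumed measurement errors
y = 2.4 + 0.3 * np.sin(2 * np.pi * 9.43 * t) + rng.normal(0.0, dy)

freqs = np.arange(0.05, 20.0, 0.01)               # trial frequencies per year
power = floating_mean_power(t, y, dy, freqs)
best_frequency = freqs[np.argmax(power)]
print(f"highest power near {best_frequency:.2f}/year")
```

With this definition the power lies between zero and one, and the injected modulation shows up as the highest peak. Library implementations evaluate the same quantity far more efficiently than this explicit loop.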
For this analysis, we use the generalized (or floating-mean) L-S periodogram [9,25]. The main difference with respect to the ordinary L-S periodogram is that an arbitrary offset gets added to the mean values. More details on the differences between the two implementations can be found in [22,23]. The generalized L-S periodogram has been shown to be more sensitive than the normal one for detecting peaks when the data sampling overestimates the mean [9,22,26]. To determine the significance of any peak in the L-S periodogram, we must calculate its false alarm probability or p-value. A large number of metrics have been constructed to estimate the p-value of peaks in the L-S periodogram [8,22,27,28]. We now briefly describe these metrics, which we label according to the names of the corresponding method options in the Python implementation used to calculate these p-values:

• Baluev
This method uses extreme value statistics for stochastic processes to compute an upper bound on the p-value in the case of no aliasing. The analytical expression for the p-value using this method can be found in [22,29].

• Bootstrap
This method makes use of non-parametric bootstrap resampling [22]. It applies L-S periodograms to synthetic data constructed at the same observation times as the real data. The bootstrap is the most robust estimate of the p-value, as it makes minimal assumptions about the periodogram distribution, and the use of the observed times also fully accounts for survey window and dead-time effects [22]. However, the bootstrap method does not correctly account for correlated noise in the observations [24]. One also needs a large number of bootstrap resamples to compute the p-value with very good accuracy. We use 1000 bootstrap samples, which can provide p-values with an accuracy of about 1%. We also note that for our analysis, we are using binned data, which consists of observations averaged over a five-day period, and hence we are not fully incorporating the detector livetime and dead-time effects.

• Davies
This method is similar to Baluev, but is not accurate at large p-values, where it shows values greater than 1 [30]. Nevertheless, for completeness we also calculate the p-value using this method.

• Naive
This method is based on the assertion that well-separated areas in the periodogram are independent of each other. The total number of independent frequencies is dependent on the sampling rate and observation duration [22].
We note that all the aforementioned p-values are global p-values that evaluate the significance of peaks in the periodogram after accounting for the "look elsewhere effect": the trials factors associated with the fact that many frequencies are being searched at once [31]. However, for two specific frequencies (9.43/year and 1/year), we also calculate the local p-value using the expression in Table 1 of [29]. This local p-value has been implemented using the single option in the L-S module provided in astropy. Once the p-value is known, one can evaluate the significance or Z-score [32]. The smaller the p-value, the more significant the peak. A rule of thumb for any peak to be deemed interesting is that the p-value should be less than 0.05. However, for a peak to be statistically significant, its Z-score must be greater than 5σ [33], which corresponds to a p-value < 10⁻⁷.
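The conversion from a p-value to a Z-score can be sketched as follows, assuming the one-sided Gaussian tail convention. The helper name `z_score` is hypothetical. Under this assumption, a p-value of 0.015 corresponds to about 2.2σ and a p-value of 0.001 to about 3.1σ.

```python
from statistics import NormalDist

def z_score(p_value):
    """One-sided Gaussian significance (in units of sigma) for a given p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)

for p in (0.05, 0.015, 0.001):
    print(f"p = {p:g} -> {z_score(p):.2f} sigma")
```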

III. RECAP OF SK23
We briefly describe the latest search for periodicity carried out by the Super-K collaboration using more than 20 years of ⁸B solar neutrino data from 1996-2018 [17]. Super-K is a 50 kiloton water Cherenkov detector located in the Kamioka mine in Japan, which detects neutrinos with energies from the MeV range [34] to over a TeV [35] and has been taking data since 1996. Super-K has produced physics and astrophysics results on a wide range of topics, from neutrino oscillations [36] to dark matter [37]. The total livetime used for the analysis of solar neutrino data until 2018 is equal to 5,804 days, combining four different phases of data taking spanning about 22 years. Super-K detects about 20 solar neutrino interactions per day. The total dataset was divided into 1343 time bins, where the average width of each bin is around five days. A search for periodicities was done using the maximum likelihood method (by incorporating the energy information in addition to the fluxes) as well as the L-S periodogram following the prescription in [38]. The L-S analysis was done by scanning 100,000 frequencies from 10⁻⁶/day to 0.2/day. No significant periodicities apart from the annual modulation due to the revolution of the Earth around the Sun were found. As a supplement to this paper, the data for 5-day interval fluxes covering the above observing period have been made publicly available. The publicly released dataset consists of the mean time, the start and end times of the observed data, and the measured solar neutrino flux along with its upper and lower flux errors. However, this data does not include corrections due to the varying Earth-Sun distance. Unlike the Super-K-I dataset, these correction factors due to the eccentricity of the Earth's orbit around the Sun were not provided along with the dataset at the time of writing. So we obtained these correction factors by calculating the Earth-Sun distance (D) in 1-day intervals and averaging it over each 5-day bin. The distance to the Sun was calculated using the astropy [39] module. We then scaled the raw flux and the errors by D² after normalizing the distance by 1 AU. A plot of both the uncorrected and corrected fluxes as a function of time since the start of Super-K can be found in Fig. 1.

IV. ANALYSIS AND RESULTS
We now apply a generalized L-S periodogram to this dataset. We used the L-S implementation in the astropy.timeseries module [39] in Python. We provided the mid-point of each time bin, the measured flux, and the flux errors as inputs to the L-S periodogram. The flux errors were obtained from an average of the asymmetric error bars. As a sanity check, we also redid the analyses by considering the larger of the two errors for each data point, but the results are comparable to those obtained by considering the average of the errors and are not reported here. The recommended frequency resolution and the maximum frequency up to which the generalized L-S method can robustly detect sinusoidal modulations are given by the reciprocal of five times the total duration of the dataset and five times the mean Nyquist equivalent frequency, respectively [22]. For the Super-K dataset, this default frequency resolution is equal to 0.0091/year, which is used for our work. Based on the above recommendation, the maximum frequency up to which the L-S periodogram would be sensitive to any potential peaks is equal to 152.8/year. However, for brevity, we only show the results for frequencies up to 20/year (similar to our previous work), since previous searches had shown a marginally significant result only at 9.43/year [10] and the frequencies associated with solar rotation are between 8-14/year. We also checked if the updated data exhibited any statistically significant peaks for frequencies greater than 20/year and up to 70/year (which roughly corresponds to the reciprocal of the width of each time bin). Since we did not find any peaks with large statistical significance above a frequency of 20/year, we do not show any results for the same. We followed the same normalization convention for the L-S periodogram as in our previous works [19][20][21], where the L-S power can take values between zero and one. In the appendix, we also show the results of the same analyses using only the first five years of Super-K data until July 2001.
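The frequency grid heuristic quoted above amounts to simple arithmetic, sketched below. The function name `default_frequency_grid` is hypothetical, and the baseline is assumed to be exactly 22 years for illustration, so the resulting maximum frequency differs slightly from the value obtained with the exact time span of the dataset.

```python
def default_frequency_grid(n_points, baseline, samples_per_peak=5, nyquist_factor=5):
    """Return (frequency resolution, maximum frequency) for the heuristic L-S grid."""
    # Resolution: reciprocal of samples_per_peak times the total duration.
    resolution = 1.0 / (samples_per_peak * baseline)
    # Mean Nyquist-equivalent frequency for n_points spread over the baseline.
    mean_nyquist = 0.5 * n_points / baseline
    return resolution, nyquist_factor * mean_nyquist

# 1343 time bins; baseline assumed to be 22 years.
df, f_max = default_frequency_grid(n_points=1343, baseline=22.0)
print(f"resolution ~ {df:.4f}/year, maximum frequency ~ {f_max:.1f}/year")
```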
The L-S power as a function of frequency for both the uncorrected and the corrected ⁸B neutrino fluxes can be found in Fig. 2. We see a maximum peak for the uncorrected flux at a frequency of 1/year. However, this peak is no longer the strongest once we rescale the fluxes by the square of the distance. We calculated the p-value for the four peaks with the largest L-S powers. These include the frequency of 9.39/year, which is close to 9.43/year, previously found to be marginally significant in [10]. The L-S powers and the p-values (using all four methods outlined in Sect. II) for these four frequencies can be found in Tables I and II for the uncorrected and corrected fluxes, respectively. For both the corrected and the uncorrected flux, we do not find any statistically significant peaks. All the observed p-values (using any method) are greater than 0.4. For the uncorrected flux, the maximum power is seen at a frequency of 1.06/year with the lowest p-value of 0.47 using the Naive method. For the flux scaled by the square of the distance, the maximum power is seen at a frequency of 9.39/year (which is close to the frequency of 9.43/year, which we had previously found in Super-K-I data [10]). However, its statistical significance is negligible (with a minimum p-value of 0.71). This is in contrast to the value of 0.015 we had found for Super-K-I. Therefore, we conclude that with the latest updated data, the marginally significant peak close to the frequency of 9.43/year seen in the first five years of data has disappeared. Since no other significant peaks were found, we assert that there are no periodic sinusoidal signals at any frequency in the ⁸B solar neutrino flux. We also applied the generalized L-S periodogram to all the Super-K data post July 2001 (which was not included in the earlier analysis) and calculated the local p-value for the frequency of 9.39/year. We find the p-values at this frequency to be equal to 0.074 (1.4σ) and 0.056 (1.6σ) for the uncorrected and corrected fluxes, respectively. Therefore, the local p-values are not statistically significant for the frequency of 9.39/year when analyzing the data after phase I of Super-K. For the frequency of 1/year, the local p-value is equal to 0.001 and 0.54 for the uncorrected and corrected flux, respectively, using the full 22 years of Super-K data. Therefore, the uncorrected flux corresponds to a local significance of 3.1σ. Once we correct for the Earth's eccentricity, the significance becomes negligible.
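Selecting the frequencies with the largest L-S powers can be sketched as a search for the highest local maxima of the periodogram. The helper `top_peaks` and the synthetic power curve below are hypothetical and only illustrate the selection step; they do not reproduce the periodograms of this work.

```python
import numpy as np

def top_peaks(freqs, power, n=4):
    """Return (frequency, power) pairs for the n highest local maxima."""
    interior = np.arange(1, len(power) - 1)
    is_peak = (power[interior] > power[interior - 1]) & (
        power[interior] >= power[interior + 1]
    )
    peak_idx = interior[is_peak]
    # Sort the local maxima by power, highest first, and keep the first n.
    order = peak_idx[np.argsort(power[peak_idx])[::-1][:n]]
    return [(float(freqs[i]), float(power[i])) for i in order]

# Synthetic periodogram with three bumps of decreasing height.
freqs = np.linspace(0.01, 20.0, 2000)
power = (
    0.020 * np.exp(-0.5 * ((freqs - 1.0) / 0.05) ** 2)
    + 0.012 * np.exp(-0.5 * ((freqs - 9.39) / 0.05) ** 2)
    + 0.008 * np.exp(-0.5 * ((freqs - 12.0) / 0.05) ** 2)
)
peaks = top_peaks(freqs, power, n=2)
print(peaks)
```

The p-value of each selected peak can then be evaluated from its power using any of the methods described in Sect. II.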
To summarize, we do not find evidence for periodicities at any frequency using 22 years of Super-K data, in accord with the results found in [17].

TABLE II: L-S power and p-value (last four columns) for the rescaled ⁸B Super-K flux after correcting for the eccentricity of the Earth's orbit, using four different methods for the four frequencies showing the largest L-S powers in descending order. Similar to the uncorrected flux, all the p-values are greater than 0.5, implying that none of the peaks are statistically significant, and there is no evidence for sinusoidal modulations in the data.

V. CONCLUSIONS
Multiple groups have found evidence for sinusoidal modulations in the ⁸B solar neutrino flux observed using the first five years of Super-K data at a frequency of around 9.43/year, corresponding to a period of 38 days. These peaks were asserted to be due to the synodic rotation of the solar core. Our own analysis (in 2016) of this data using the generalized L-S periodogram found a p-value of around 0.015. In November 2023, the Super-K collaboration updated its results for periodicity searches using 22 years of data using two independent methods, one of which includes the (normal) L-S analysis. This work did not find evidence for any statistically significant peak (apart from the variation due to Earth's orbit around the Sun) [17]. The dataset used for this analysis was also made publicly available.
We carried out an independent search for periodicity with the same data using the generalized L-S periodogram to ascertain if the updated data again contains a modulation at or near 9.43/year, and if its detection significance is enhanced with about four times more data. We analyzed both the raw solar neutrino fluxes and the fluxes rescaled by the square of the distance between the Earth and the Sun. We estimated the p-values using four independent methods. Our plots for the generalized L-S power as a function of frequency can be found in Fig. 2. The p-values we found for the top four frequencies in descending order of their L-S powers can be found in Table I and Table II for the raw and rescaled fluxes, respectively. We no longer find any statistically significant peak at a frequency close to 9.43/year. For the corrected flux, the minimum p-value we found was equal to 0.71 (at a frequency of 9.39/year), which is not significant. Therefore, we conclude that the entire 22 years of Super-K data no longer contains any signatures of sinusoidal modulations close to a frequency of 9.43/year (period of 38 days) or any other frequency. Our results are also in accord with the corresponding analyses carried out by the Super-K Collaboration. In the spirit of open science, we have made our analysis codes for this work publicly available, which can be found at https://github.com/DarkWake9/Project-LP.

FIG. 1: The Super-K ⁸B solar neutrino flux as a function of time. The top panel shows the raw flux and the bottom panel shows the corrected flux (Flux × D²), rescaled by the square of the distance between the Sun and the Earth averaged over each 5-day bin. Data used for this plot has been obtained from [17]. For brevity, we have ignored the X-axis bin width in the above plots.

FIG. 2: L-S power as a function of frequency using the generalized L-S periodogram for the raw flux (top panel) and the corrected flux (bottom panel).

FIG. 3: L-S periodogram of Super-K data from April 1996 to July 2001.

TABLE I: L-S power and p-value (last four columns) for the uncorrected ⁸B Super-K flux using four different methods for the four frequencies showing the largest L-S powers in descending order. All the p-values are greater than 0.5, implying that none of the peaks are statistically significant, and there is no evidence for sinusoidal modulations in the data.

TABLE III: L-S power and p-value (last four columns) for the ⁸B Super-K flux from April 1996 to July 2001 using four different methods for the four frequencies showing the largest L-S powers.