1 Introduction

The sunspot records represent the longest dataset of direct solar observations available (Vaquero et al., 2016; Usoskin, 2017; Arlt and Vaquero, 2020). The number of sunspots on the photosphere increases and decreases in cycles of around 11 years (Clette et al., 2014; Muñoz-Jaramillo and Vaquero, 2019). The prediction of the amplitudes of these solar cycles has attracted increasing interest because of the impact of solar activity on our technological society (McNish and Lincoln, 1949; Pulkkinen, 2007; Arregui, 2022; Carrasco and Vaquero, 2022).

Different techniques are used to predict solar activity (Pesnell, 2008; Petrovay, 2020). For instance, predictions of the amplitude of the past Solar Cycle 24 and the current Solar Cycle 25 have been based on space climatology (Kane, 2008; Wang et al., 2009; Carrasco and Vaquero, 2021), physical models (Upton and Hathaway, 2018; Bhowmik and Nandy, 2018), precursors (Hathaway and Wilson, 2004; McIntosh et al., 2020), and spectral analysis (Kilcik et al., 2009; Rigozo et al., 2011) (see more details in Nandy, 2021). Other predictions rely on artificial intelligence, in particular neural networks. This kind of method has been used to predict the amplitudes of Solar Cycles 24 and 25 (Gholipour et al., 2005; Quassim, Attia, and Elminir, 2007; Okoh et al., 2018; Pala and Atici, 2019; Prasad et al., 2022) and also to forecast the butterfly diagram in space and time (Covas, Peixinho, and Fernandes, 2019).

Among the predictions based on neural networks, Coban, Raheem, and Cavus (2021) predicted the evolution of Solar Cycle 25 using a Long Short-Term Memory (LSTM) network. They used daily sunspot observations for the period 1945 – 2020 provided by the American Association of Variable Star Observers (AAVSO) as training data for the LSTM model. These authors claim that a model based on daily sunspot-number observations should be taken into account because it will be closer to reality than monthly smoothed data. Their conclusions were: i) the maximum amplitude of Solar Cycle 25 will be lower than that of Solar Cycle 24 and, examining both cycles together, the Sun would enter a new Dalton Minimum, and ii) the maximum of Solar Cycle 25 will occur in July 2024, with two minor peaks in February 2022 and August 2026.

The objective of this work is to assess the prediction methodology of Coban, Raheem, and Cavus (2021) and to clarify whether Solar Cycle 25, together with Solar Cycle 24, will constitute a new Dalton Minimum. In Section 2, we describe the daily datasets used in this work. We provide information on the LSTM model in Section 3 and the predictions obtained using this deep-learning technique in Section 4. Section 5 is devoted to the analysis and discussion of the results, and the main conclusions of this work are presented in Section 6.

2 Dataset

Coban, Raheem, and Cavus (2021) used the daily sunspot observations of the AAVSO (www.aavso.org/solar) to carry out their prediction. In addition to the AAVSO data, which we use to reproduce the work by Coban, Raheem, and Cavus (2021), we also use the official daily sunspot-number values provided by the Sunspot Index and Long-term Solar Observations (SILSO: www.sidc.be/silso). Figure 1 (top panel) shows a comparison between the daily values of the sunspot number provided by SILSO (1818 – 2022) and those provided by AAVSO (1945 – 2022).

Figure 1

(Top panel) Daily sunspot-number values of the AAVSO (red) for the period 1945 – 2022 and the sunspot number (Version 2) provided by SILSO (black) for the period 1818 – 2022. (Bottom panel) Ratio between annual (monthly) values from AAVSO and SILSO is represented by a blue thick (thin) line.

We have also computed the ratio between values from AAVSO and SILSO in order to study the stability of the AAVSO series with respect to that from SILSO (Figure 1, bottom panel). One can see that there were significant variations in the ratio from 1945 to 1970. The ratio then remained around 0.7 (with an increase around solar minima) until 2000, after which it changed to a value slightly larger than 0.6.
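The ratio described here can be sketched as follows. This is a minimal illustration that assumes the monthly means of both series are already available as dictionaries keyed by (year, month). The function name and the sample values are hypothetical and are not taken from either dataset.

```python
def monthly_ratio(aavso, silso):
    """Return {(year, month): AAVSO / SILSO} for months present in both series.

    Months with a SILSO value of zero are skipped to avoid division by zero
    (an assumption of this sketch, not a rule stated in the text).
    """
    ratio = {}
    for key in sorted(aavso.keys() & silso.keys()):
        if silso[key] != 0:
            ratio[key] = aavso[key] / silso[key]
    return ratio


# Illustrative values only.
aavso = {(2001, 1): 70.0, (2001, 2): 63.0, (2001, 3): 0.0}
silso = {(2001, 1): 100.0, (2001, 2): 90.0, (2001, 3): 0.0, (2001, 4): 50.0}
ratios = monthly_ratio(aavso, silso)
```

The same function applies to annual means if the keys are years instead of (year, month) pairs.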

3 Neural Network: LSTM Model

An LSTM network is a type of recurrent neural network (Muzaffar and Afshari, 2019; Abbasimehr and Paki, 2022). It is a deep-learning algorithm developed to work with time series. This algorithm is capable of learning order dependence in sequential prediction problems if it is properly trained. LSTM networks, like other artificial neural networks, generally consist of multiple layers. Each layer can have several units, called neurons. As a minimum, each unit includes a forget gate, an input gate, and an output gate. The forget gate decides whether to keep the stored data, the input gate decides whether to update the information about the data, and the output gate decides which of the data passed on by the previous gates will be selected as output. The batch size represents the amount of data used for training in each iteration, whereas the number of epochs is the number of complete passes over the training data.
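For reference, the gate mechanism described above is conventionally written as follows. This is the standard textbook formulation and not necessarily the exact variant implemented in the software used in this work. The symbols are introduced here for illustration only: \(x_{t}\) is the input at time \(t\), \(h_{t}\) the unit output, \(c_{t}\) the cell state, \(\sigma \) the logistic function, \(\odot \) the element-wise product, and \(W\), \(U\), and \(b\) the trainable weights and biases.

$$ \begin{aligned} f_{t} &= \sigma \left ( W_{f} x_{t} + U_{f} h_{t-1} + b_{f} \right ) \\ i_{t} &= \sigma \left ( W_{i} x_{t} + U_{i} h_{t-1} + b_{i} \right ) \\ o_{t} &= \sigma \left ( W_{o} x_{t} + U_{o} h_{t-1} + b_{o} \right ) \\ \tilde{c}_{t} &= \tanh \left ( W_{c} x_{t} + U_{c} h_{t-1} + b_{c} \right ) \\ c_{t} &= f_{t} \odot c_{t-1} + i_{t} \odot \tilde{c}_{t} \\ h_{t} &= o_{t} \odot \tanh \left ( c_{t} \right ) \end{aligned} $$

Here \(f_{t}\), \(i_{t}\), and \(o_{t}\) are the forget, input, and output gates, respectively.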

LSTMs are useful in time-series prediction when there is a relationship between the time series and delayed versions of itself. To make a prediction with the stateful LSTM model, the training data and the delayed values of the data are fed as input data. This enables the stateful LSTM model to learn the relationship between the data and its preceding values. The autocorrelation values allow us to know the accuracy of the model. A positive autocorrelation function (ACF) close to one indicates that the model fits the time series well.
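The relationship between a series and its delayed versions can be quantified with the sample autocorrelation. The sketch below uses the usual biased estimator. The function name and the toy series are hypothetical, and the ACF routine actually used in the analysis is not specified in the text.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum


# Toy periodic series with period 4 (not sunspot data).
toy = [0.0, 1.0, 0.0, -1.0] * 25
acf_at_period = autocorrelation(toy, 4)
acf_at_half_period = autocorrelation(toy, 2)
```

For this toy series the autocorrelation is strongly positive at the period and strongly negative at half the period, which is the kind of structure a lag search over the sunspot series looks for.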

Large data sets will only yield successful predictions when the data are divided into subsets. This technique is called “backtesting” and consists of dividing the time series into different segments or slices of past observations. These slices overlap but are shifted in time by an amount called the skip-span. That is, the second slice starts a set number of years after the beginning of the first slice.
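The slicing scheme can be sketched with index arithmetic alone. The example assumes 365-day years (leap days ignored) and uses the slice lengths quoted in this paper (36 years of training, 10 years of testing, and a 15-year skip-span) on a 61-year series. The function name is hypothetical.

```python
DAYS_PER_YEAR = 365  # leap days are ignored in this sketch


def make_slices(n_obs, train_len, test_len, skip_span):
    """Return (train_start, train_end, test_end) index triples.

    Training data are series[train_start:train_end] and testing data are
    series[train_end:test_end]. Consecutive slices start `skip_span`
    observations apart, so they overlap when skip_span < train_len + test_len.
    """
    slices = []
    start = 0
    while start + train_len + test_len <= n_obs:
        train_end = start + train_len
        slices.append((start, train_end, train_end + test_len))
        start += skip_span
    return slices


slices = make_slices(
    n_obs=61 * DAYS_PER_YEAR,
    train_len=36 * DAYS_PER_YEAR,
    test_len=10 * DAYS_PER_YEAR,
    skip_span=15 * DAYS_PER_YEAR,
)
```

With these lengths exactly two slices fit in the series, the second starting 15 years after the first.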

We have developed a stateful LSTM model, following the methodology of Coban, Raheem, and Cavus (2021). In order to make a 10-year (3650 days) prediction, we divide the data into two slices, each consisting of 36 years of data for training and 10 years of data for testing (not used for training but only for the final test), with a skip-span of 15 years. Our LSTM model, like that of Coban, Raheem, and Cavus (2021), has two layers with 50 units per layer (Table 1). Some parameters must be specified in the model, such as “Return sequences”, which is set to “True” for the first layer (Lstm_0) and to “False” for the second layer (Lstm_1), and the “stateful” parameter, which is set to “True” in both layers. The output layer (Dense) has one unit and provides the predicted daily sunspot number. The stateful LSTM model was fed using a batch size of 730 (the batch size must divide the length of the data set into an integral number of batches). Finally, we set the iteration parameter to 300 epochs.
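The configuration described above can be collected in one place and checked for consistency. The sketch below is not the model code. The dictionary layout and the helper name are hypothetical, 365-day years are assumed, and the divisibility check reflects our reading of the batch-size remark, namely that a stateful model is fed a whole number of batches.

```python
# Hyperparameters quoted in the text; the dictionary layout is hypothetical.
MODEL_CONFIG = {
    "layers": [
        {"name": "Lstm_0", "units": 50, "return_sequences": True, "stateful": True},
        {"name": "Lstm_1", "units": 50, "return_sequences": False, "stateful": True},
        {"name": "Dense", "units": 1},
    ],
    "batch_size": 730,
    "epochs": 300,
}

DAYS_PER_YEAR = 365  # leap days ignored (assumption)
TRAIN_DAYS = 36 * DAYS_PER_YEAR
TEST_DAYS = 10 * DAYS_PER_YEAR


def batches_per_pass(n_samples, batch_size):
    """Number of batches in one pass; raises if the split is not exact."""
    if n_samples % batch_size != 0:
        raise ValueError("batch size does not divide the number of samples")
    return n_samples // batch_size


train_batches = batches_per_pass(TRAIN_DAYS, MODEL_CONFIG["batch_size"])
test_batches = batches_per_pass(TEST_DAYS, MODEL_CONFIG["batch_size"])
```

Under these assumptions a batch size of 730 splits both the 36-year training block and the 10-year testing block exactly.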

Table 1 Structure of the stateful LSTM model defined in this work following the methodology of Coban, Raheem, and Cavus (2021).

We have applied the ACF over lags of up to 50 years (365 × 50 days), obtaining an optimal lag setting of 0.48, in agreement with Coban, Raheem, and Cavus (2021). This is enough to consider the stateful LSTM model able to learn the relationship between the data and their preceding values. Note that solar activity is hardly predictable beyond one solar cycle ahead (Petrovay, 2020; Nandy, 2021). Predictions by neural networks rely on mathematical models that assume the series is stationary. We therefore note this limitation of the LSTM model, since the sunspot number is not a stationary series.

The stateful LSTM model was developed in R Version 3.6.0 under RStudio Version 1.3.1093, using the R library rsample and the keras package, which connects to the R “TensorFlow backend”.

4 Predictions Using Daily AAVSO and SILSO Observations

We have reproduced the methodology followed by Coban, Raheem, and Cavus (2021) to predict the maximum amplitude of Solar Cycle 25. Thus, we have applied an LSTM model to daily observations for the period January 1945 – September 2020, provided on the one hand by AAVSO and on the other hand by SILSO.

The daily sunspot-number series from AAVSO and SILSO are divided into two different slices (one for the period January 1945 – December 1990 and the other for the period December 1959 – December 2005), including 36 years for training, 10 years for testing, and 15 years for the skip-span. Thus, we can check the accuracy of the model throughout the observed data. Figures 2 and 3 show the AAVSO and SILSO daily sunspot observations, respectively, divided into those two slices.

Figure 2

Division of the daily sunspot observations from AAVSO into slices for the period January 1945 – December 1990 (top panel) and December 1959 – December 2005 (bottom panel). Training data are represented by black lines and testing data by red lines.

Figure 3

Division of the daily sunspot observations from SILSO into slices for the period January 1945 – December 1990 (top panel) and December 1959 – December 2005 (bottom panel). Training data are represented by black lines and testing data by red lines.

After training the LSTM model on Slices 1 and 2, shown in Figure 2 for AAVSO and Figure 3 for SILSO, we obtained predictions of daily values for the test years: i) one around the solar minimum of Solar Cycle 23 (December 1980 – December 1990) and ii) the other around the solar maximum of that solar cycle (December 1995 – December 2005). The daily values predicted by the model around the minimum and maximum of Solar Cycle 23 are represented in Figure 4 for AAVSO and in Figure 5 for SILSO. The behavior of these predicted values with respect to the observed values is analyzed in Section 5.

Figure 4

Daily sunspot observations from AAVSO are represented by black dots. Predictions of daily sunspot-number values from the LSTM model are depicted by red dots. RMSE values are shown for Slice 1 (top panel) and Slice 2 (bottom panel).

Figure 5

Daily sunspot observations from SILSO are represented by black dots. Predictions of daily sunspot-number values from the LSTM model are depicted by red dots. RMSE values are shown for Slice 1 (top panel) and Slice 2 (bottom panel).

We use the root mean square error (RMSE) to check the accuracy of the LSTM model in the two slices. For the AAVSO observations, we obtained an RMSE value of 52.5 for the first slice and 40.1 for the second slice. We note that Coban, Raheem, and Cavus (2021) obtained an RMSE value of 52.5 for the first slice and 39.0 for the second one. The RMSE value obtained for Slice 2 is therefore not the same as that of Coban, Raheem, and Cavus (2021), but it is close. This difference can arise because the RMSE value changes slightly in each run of the code, but it does not affect the prediction response of the neural network. In the case of the SILSO data, we obtained RMSE values larger than those from AAVSO data: 74.1 for the first slice and 54.2 for the second slice.
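For clarity, the RMSE between observed and predicted daily values can be computed as in the following minimal sketch. The function name and the toy values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long, non-empty sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values, not sunspot data: every prediction is off by exactly 2.
toy_rmse = rmse([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 32.0, 38.0])
```

Because the errors are squared before averaging, a few days with large misfits weigh more heavily than many days with small ones.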

The daily sunspot-number values predicted by the LSTM model for the period October 2020 – September 2030 are represented in Figure 6 (top panel) in the case of using AAVSO data and in Figure 6 (bottom panel) for the SILSO data. The 13-month smoothed monthly average is also included to be compared with the observed values of the official sunspot number during the current Solar Cycle 25.
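The 13-month smoothing applied to the monthly means can be sketched as below. The weights are not spelled out in the text, so the sketch assumes the conventional definition used for the international sunspot number: a centered 13-month window in which the first and last months carry half weight. The function name is hypothetical.

```python
def smooth_13_month(monthly):
    """Centered 13-month running mean with half weights on the two end months.

    The first and last six entries cannot be smoothed and are returned as None.
    """
    smoothed = [None] * len(monthly)
    for i in range(6, len(monthly) - 6):
        window = monthly[i - 6 : i + 7]  # 13 consecutive monthly means
        total = 0.5 * window[0] + sum(window[1:12]) + 0.5 * window[12]
        smoothed[i] = total / 12.0
    return smoothed


# Toy linear series (not sunspot data): the smoothing leaves a straight line unchanged.
toy_smoothed = smooth_13_month([float(m) for m in range(13)])
```

Since six months are lost at each end, a smoothed series is always one year shorter than the monthly series it is derived from.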

Figure 6

Prediction of AAVSO (top panel) and SILSO (bottom panel) sunspot-number values for the period October 2020 – September 2030 applying an LSTM model using daily sunspot observations from AAVSO and SILSO for the period 1945 – 2020. Dots represent daily values and the line depicts the 13-month smoothed monthly average.

We also note that we have applied the LSTM model using data from SILSO since 1872 (the first year from which the daily observational coverage is 100%) with the same configuration of the neural network as explained above. However, the model did not behave better than before, that is, the RMSE values obtained are higher than those shown above. Therefore, we did not carry out an analysis of other predictions using that dataset.

5 Results and Discussion

First, we compare the values predicted by the LSTM model with the observed sunspot-number values for the test data in the slices defined in the previous section. As our objective is to analyze the prediction for solar maxima, we consider only Slice 2, whose test period contains the maximum of Solar Cycle 23 and which also has the lowest RMSE values. Figure 7 shows the comparison between the predicted 13-month smoothed monthly sunspot number obtained from the LSTM model using AAVSO (top panel) and SILSO (bottom panel) data and the observed values from AAVSO and SILSO, respectively, during the test period: June 1996 – June 2005 (corresponding to Slice 2).

Figure 7

Comparison between the 13-month smoothed monthly sunspot-number values estimated from the LSTM model (orange) and the observed 13-month smoothed monthly sunspot number for the test data in Slice 2 using AAVSO (top panel) and SILSO (bottom panel) data. The ratio between the observed and predicted values is depicted by the dashed black line. The blue shaded area represents the interval defined by the observed 13-month smoothed monthly sunspot-number value ± one standard deviation.

Coban, Raheem, and Cavus (2021) claim that the 13-month smoothed monthly average of the sunspot-number values calculated from the daily values predicted by the model fits the observed data well for the test data (both in Slices 1 and 2), because all of the predicted values are within the interval of the observed 13-month smoothed monthly sunspot-number values ± one standard deviation. However, Coban, Raheem, and Cavus (2021) actually used the RMSE value obtained in each slice in place of the monthly standard deviation. For the test data in Slice 2 (Solar Cycle 23), we also found that all the predicted values, calculating the 13-month smoothed monthly average from the predicted daily values, are within the interval of the observed 13-month smoothed monthly average values ± the RMSE value (that is, 40.1 in the case of AAVSO and 54.2 in the case of SILSO). The monthly standard deviation, however, is defined as:

$$ \sigma = \sqrt{\frac{1}{N} \sum _{i=1}^{N} \left ( SN_{i} - SN_{\mathrm{m}} \right )^{2}} $$

where in this case \(N\) is the number of observations in a month, \(SN_{{i}}\) represents the daily sunspot-number values, and \(SN_{\mathrm{m}}\) denotes the monthly average of the sunspot number for a given month. Using this definition, we found that around 15% of the monthly data in the case of AAVSO and 30% in the case of SILSO lie outside the interval defined by the observed 13-month smoothed monthly average data ± the standard deviations (see Figure 7).
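The standard deviation defined above and the resulting interval test can be sketched as follows. The function names and toy values are hypothetical. The sketch assumes that the predicted values, the observed values, and the standard deviations are aligned month by month.

```python
import math


def monthly_sigma(daily_values):
    """Population standard deviation of the daily values within one month."""
    n = len(daily_values)
    monthly_mean = sum(daily_values) / n
    return math.sqrt(sum((sn - monthly_mean) ** 2 for sn in daily_values) / n)


def fraction_outside(predicted, observed, sigma):
    """Fraction of months whose prediction lies outside observed ± sigma."""
    outside = sum(
        1 for p, o, s in zip(predicted, observed, sigma) if abs(p - o) > s
    )
    return outside / len(predicted)


# Toy values, not sunspot data.
toy_sigma = monthly_sigma([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
toy_fraction = fraction_outside(
    predicted=[10.0, 20.0, 30.0, 40.0],
    observed=[12.0, 20.0, 20.0, 45.0],
    sigma=[1.0, 1.0, 15.0, 5.0],
)
```

A prediction lying exactly on the edge of the interval is counted as inside in this sketch.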

We have also calculated the ratio between the observed and predicted 13-month smoothed monthly average values from both AAVSO and SILSO data (dashed black lines in Figure 7), considering the test period (Solar Cycle 23). One can see that the evolution of the predicted and observed values is significantly different during roughly the first two years of the prediction. Then, from the end of 1998 in the case of AAVSO and the beginning of 1999 for SILSO, the observed and predicted values behave more similarly and the ratio ranges between 0.8 and 1.2.

The predicted maximum of Solar Cycle 23 (187.6 and 115.3 according to SILSO and AAVSO, respectively) is only 4% larger than the observed value in the case of SILSO (180.3) and around 8% lower in the case of AAVSO (125.9). Furthermore, the observed maximum of Solar Cycle 23 (November 2001) occurred 11 months later than the predicted date for the maximum using the SILSO data (December 2000) and 15 months later considering the AAVSO data (September 2000).
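The percentages quoted for the amplitude of the Solar Cycle 23 maximum follow directly from the predicted and observed values given above, as this short check shows. The function name is hypothetical.

```python
def percent_difference(predicted, observed):
    """Signed difference of the prediction relative to the observation, in %."""
    return 100.0 * (predicted - observed) / observed


# Predicted and observed maxima of Solar Cycle 23 quoted in the text.
silso_diff = percent_difference(predicted=187.6, observed=180.3)
aavso_diff = percent_difference(predicted=115.3, observed=125.9)
```

A positive value means that the model overestimates the observed maximum, and a negative value means that it underestimates it.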

Coban, Raheem, and Cavus (2021) claimed that “a model based on daily sunspot-number observations should be taken into account because it will be closer to reality than monthly smoothed data”. We have compared the predictions obtained in this work for Solar Cycle 25 following the methodology by Coban, Raheem, and Cavus (2021) with the sunspot-number values available for this cycle so far (December 2022). Figure 8 shows the observed values of the official sunspot number by SILSO so far (black) and the predictions made using AAVSO (red) and SILSO (green) data (explained in Section 4), respectively. Furthermore, to scale the sunspot observations made by the AAVSO to the official sunspot number (Version 2) by SILSO, we must apply a calibration factor to the AAVSO data. We have obtained this factor following the calculations by Clette (2018):

$$ k= {\left ( \sum \frac{R_{\mathrm{S}}}{R_{\mathrm{A}}} \right )} / {n_{\mathrm{m}}} $$

where \(R_{\mathrm{S}}\) and \(R_{\mathrm{A}}\) are the monthly sunspot-number values from SILSO and AAVSO, respectively, and \(n_{\mathrm{m}}\) is the number of months with observations in common in both datasets. Thus, the calibration factor for AAVSO with respect to SILSO is 1.55 ± 0.53. Applying this factor to the original prediction from AAVSO data, we obtained the calibrated AAVSO series depicted in Figure 8 in yellow.
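The calibration factor can be sketched as the mean of the monthly SILSO/AAVSO ratios over the common months. The function name and the toy values are hypothetical. Skipping months with a zero AAVSO value and reporting a sample standard deviation as the spread are assumptions of this sketch, since the text does not state how the quoted uncertainty was obtained.

```python
import math


def calibration_factor(silso, aavso):
    """Return (k, spread) from monthly values keyed by (year, month).

    k is the mean of R_S / R_A over months present in both series.
    Months with R_A equal to zero are skipped (assumption of this sketch).
    """
    common = sorted(silso.keys() & aavso.keys())
    ratios = [silso[m] / aavso[m] for m in common if aavso[m] > 0]
    n_m = len(ratios)
    if n_m == 0:
        raise ValueError("no common months with non-zero AAVSO values")
    k = sum(ratios) / n_m
    # Sample standard deviation of the ratios (assumed meaning of the ± term).
    spread = math.sqrt(sum((r - k) ** 2 for r in ratios) / (n_m - 1)) if n_m > 1 else 0.0
    return k, spread


# Toy values, not taken from either dataset.
toy_silso = {(2001, 1): 150.0, (2001, 2): 160.0}
toy_aavso = {(2001, 1): 100.0, (2001, 2): 100.0, (2001, 3): 5.0}
k, spread = calibration_factor(toy_silso, toy_aavso)
```

Multiplying a series predicted from AAVSO data by k places it on the scale of the SILSO sunspot number.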

Figure 8

Comparison between the observed 13-month smoothed monthly sunspot-number values provided by SILSO (black line) and the predictions made applying the methodology by Coban, Raheem, and Cavus (2021) using data from SILSO (green) and AAVSO (yellow). Note that the yellow line represents the prediction using AAVSO data applying the calibration factor with respect to the official sunspot number by SILSO. Gray shading depicts the interval defined by the observed 13-month smoothed monthly sunspot-number value ± one standard deviation. The red dot represents the smoothed sunspot-number value for February 2022.

One can see in Figure 8 that the predictions applying the methodology of Coban, Raheem, and Cavus (2021) using daily data significantly differ from the behavior of the observed values available for Solar Cycle 25. While the predicted values overestimate the observed values considering the first months of the cycle, the observed values are larger than the predictions from February 2022. This behavior is similar to that in Figure 7, which uses data of Solar Cycle 23 for testing how well the predictions of the model fit the observed values. The first predicted values of Solar Cycle 23 are significantly higher than the observed values and, in addition, they are out of the interval defined by the observed 13-month smoothed monthly averaged sunspot number ± the standard deviation. Then the observed values are larger than the predicted values until the first maximum of Solar Cycle 23, and from then until the next minimum the predicted values are larger than the observed ones.

The average of the monthly differences between observed and predicted values from April 2021 to May 2022 is around 30% and 35% when using SILSO and calibrated AAVSO data, respectively. We do not know whether the predicted values for the maximum amplitude of Solar Cycle 25 obtained in this work using the methodology of Coban, Raheem, and Cavus (2021) will be close to the final value of the maximum amplitude of this cycle (as was the case for the test data of Solar Cycle 23), but we can affirm that the predictions based on an LSTM model using daily data do not reproduce well the behavior of Solar Cycle 25 in its rising phase.

One of the conclusions drawn by Coban, Raheem, and Cavus (2021) from their prediction is that the maximum amplitude of Solar Cycle 25 will be in July 2024 with two minor peaks in February 2022 and August 2026. The black line in Figure 8 depicts the 13-month smoothed monthly sunspot-number values provided by SILSO for Solar Cycle 25 from its beginning in December 2019 to the latest available so far (December 2022). One can see that the minor peak of solar activity predicted by Coban, Raheem, and Cavus (2021) in February 2022 is not present in the observed values, because the 13-month smoothed monthly sunspot-number values for March (68.8), April (73.1), and May (77.3) are larger than the observed value in February (64.7), represented by the red dot in Figure 8. Moreover, the prediction made using data from SILSO also shows a minor peak in February 2022. Therefore, in this respect, the predictions obtained following the methodology of Coban, Raheem, and Cavus (2021) fail for both AAVSO and SILSO data.

The most striking conclusion in the work by Coban, Raheem, and Cavus (2021) is the authors' statement that “if the Solar Cycles 24 and 25 are examined together, the Sun may enter a new Dalton Minimum”. To assess this conclusion, we first note that the maximum amplitudes of Solar Cycles 5 and 6 (in the Dalton Minimum) in terms of the 13-month smoothed monthly sunspot number were 82.0 and 81.2, respectively. Also, the maximum amplitude for the most recent Solar Cycle 24 was 116.4. Taking the predictions obtained following the methodology by Coban, Raheem, and Cavus (2021) using calibrated data from AAVSO (120.7) and sunspot-number values from SILSO (94.1), the average of the amplitudes of Solar Cycles 24 and 25 is 118.6 ± 3.0 and 105.2 ± 15.8, respectively, whereas it is 81.6 ± 0.6 for the two cycles in the Dalton Minimum. Then, taking as reference the predictions obtained following the methodology by Coban, Raheem, and Cavus (2021), the solar-activity level of Solar Cycles 24 and 25 examined together is roughly 30% and 45% larger than that in the Dalton Minimum for the predictions made using SILSO and AAVSO data, respectively.
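The averages and percentages in this comparison can be checked with a few lines of arithmetic. The amplitudes are those quoted in the text. The function name is hypothetical, and the sketch assumes that the ± terms are sample standard deviations of the two amplitudes, which reproduces the quoted values.

```python
from statistics import mean, stdev

CYCLE_24_MAX = 116.4
DALTON_MAXIMA = [82.0, 81.2]  # Solar Cycles 5 and 6
PREDICTED_CYCLE_25_MAX = {"AAVSO (calibrated)": 120.7, "SILSO": 94.1}

dalton_mean = mean(DALTON_MAXIMA)
dalton_spread = stdev(DALTON_MAXIMA)


def compare_with_dalton(cycle_25_max):
    """Return (average, spread, percent excess over the Dalton Minimum mean)."""
    pair = [CYCLE_24_MAX, cycle_25_max]
    average = mean(pair)
    excess = 100.0 * (average / dalton_mean - 1.0)
    return average, stdev(pair), excess


results = {
    name: compare_with_dalton(value)
    for name, value in PREDICTED_CYCLE_25_MAX.items()
}
```

Under this assumption the spreads come out at about 3.0 (AAVSO) and 15.8 (SILSO), and the excess over the Dalton Minimum at about 45% and 29%, respectively.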

We have also analyzed this question using other reference predictions: that made by the Solar Cycle 25 Prediction Panel, which predicts a maximum of 114.6 ± 10.0 in July 2025 (Biesecker and Upton, 2019), and the average of the sunspot-number predictions made by different research groups around the world and analyzed by Nandy (2021), that is, 136.2 ± 41.6. In these cases, the solar-activity level of Solar Cycles 24 and 25 would be between 40% and 67% larger, respectively, than that which occurred in the Dalton Minimum.

Considering all of these predictions, solar activity in Solar Cycles 24 and 25 will be more similar, for example, to the average of Solar Cycles 12, 13, and 14, that is, 126.0 ± 19.7. Also, we note that the solar-activity level of Solar Cycle 25 (77.3) at this point of the cycle (December 2022) is closer to that of Solar Cycles 9 (81.7), 12 (86.4), 14 (69.3), and 24 (69.3), and significantly larger than that of Solar Cycles 5 (30.8) and 6 (19.1). Therefore, considering all these facts, it is unlikely that Solar Cycles 24 and 25 constitute a new Dalton Minimum; they more plausibly form a new secular minimum of the Gleissberg cycle.

6 Conclusions

We have reproduced the methodology of Coban, Raheem, and Cavus (2021) to evaluate their prediction for Solar Cycle 25. Thus, we have used an LSTM model with daily sunspot-number data from AAVSO (as in Coban, Raheem, and Cavus, 2021) and also from SILSO, which provides the official values of the sunspot-number index, to carry out the forecasts. The daily sunspot-number series were divided into two slices including 36 years for training, 10 years for testing, and 15 years for the skip-span. We found a correlation between the time series and lagged versions of itself by applying the ACF over a maximum lag lower than 365 times 50 years, obtaining an optimal lag of 0.48. The LSTM model includes two layers with 50 units per layer, with the batch size set to 730 and the number of iterations (epochs) set to 300 as training parameters.

To check the accuracy of the model, we have predicted the sunspot-number values for the period December 1995 – December 2005 (Solar Cycle 23) and compared them with the observed sunspot values, as done by Coban, Raheem, and Cavus (2021). One can see that the predicted values are significantly different from the observed values at the beginning of the cycle, and then their ratio ranges between 0.8 and 1.2. We also found that the predicted value for the maximum of Solar Cycle 23 using SILSO data is 4% larger than the observed value, which occurred 11 months later than the predicted one. In the case of using AAVSO data, the predicted value is 8% lower than the observed one, with a difference of 15 months between the predicted and observed dates for the maximum.

Regarding the analysis of the prediction for the current Solar Cycle 25, one can see that the LSTM model, based on daily sunspot-number observations, does not reproduce well the behavior of the observed values available for Solar Cycle 25. Also, there was no minor peak of solar activity in February 2022 as predicted by Coban, Raheem, and Cavus (2021). Finally, the average between the maximum amplitude of Solar Cycle 24 and that predicted for Solar Cycle 25 applying the methodology by Coban, Raheem, and Cavus (2021) is 118.6 ± 3.0 using AAVSO calibrated data and 105.2 ± 15.8 using SILSO data. We note that the average maximum amplitude of Solar Cycles 5 and 6 (Dalton Minimum) is 81.6 ± 0.6. This means that the solar activity in the modern Solar Cycles 24 and 25 would be around 30 – 45% larger than that which occurred during the Dalton Minimum. We also note that the observed 13-month smoothed monthly sunspot number at this point of Solar Cycle 25 (December 2022) is 77.3, whereas it was 30.8 and 19.1 in the case of Solar Cycles 5 and 6, respectively. Therefore, we conclude that Solar Cycles 24 and 25 are unlikely to form a new Dalton-type minimum. Considering the other predictions indicated above, we think it is more plausible that Solar Cycles 24 and 25 together form another secular minimum of the Gleissberg cycle, similar, for example, to that around the beginning of the 20th century.