Introduction

Cyclical variations in the cosmic rays arriving at the Earth are related to changes in the Sun’s surface activity, primarily sunspots. Solar activity declines to a minimum roughly every 11 years, and the sunspot minimum leads to an increase in the cosmic ray flux reaching the Earth, which has been linked to mutations of viruses (Qu and Gao 2016). During the sunspot minimum, the influence of the solar magnetic field weakens, and cosmic rays, which are high-energy particles from external and extragalactic sources, enter the atmosphere. Moreover, collisions of these particles with atoms at the top of the atmosphere produce secondary particles that descend to the Earth (Pinholster 2013). Furthermore, solar activity at its maximum or minimum may have a major impact on the Earth’s atmosphere, leading to extreme climatic conditions, including severe temperatures, precipitation, and intense cold (Seneviratne et al. 2012). The sunspot cycle may therefore be an important factor in COVID-19 and other pandemics in the world. Prolonged changes in solar activity and sunspot number may also lead to climate change and global warming (Gupta et al. 2015; Qu and Wickramasinghe 2017), and global warming can directly influence the ability of animals to recognize and interact with other species, such as competitors and prey (Luo et al. 2014). Consequently, a link between the sunspot cycle and infectious diseases might be anticipated, since the cycle could both produce significant changes in the flux of ionizing radiation reaching the Earth and open pathways for the rapid descent of nanometer-size particles. Hope-Simpson (1978) was the first to investigate the occurrence of influenza pandemics at times of peak solar activity, noting intervals of around 11 years in both the solar cycle and the pandemics. Hoyle and Wickramasinghe endorsed this in 1990, but Von Alvensleben called it into question (Andrews 1990; Hoyle and Wickramasinghe 1990).
The challenge was based on a biological view that is purely centered on the Earth and that perhaps needs revision. Ertel also studied possible and validated pandemics from 1700 to 1985 and concluded that the peak years of the eighteenth-century pandemics coincided with periods of minimal solar activity (Ertel 1994). Scientists have offered more complicated ideas on the relation of influenza to solar activity in the nineteenth and twentieth centuries. For example, by studying influenza pandemics and solar activity since the 1700s, Tapping et al. (2001) concluded that pandemics seem to coincide with the highest solar activity and that the chance of this being a pure coincidence is < 2%. While Tapping et al. (2001) denied a causal connection between these events, academics generally agreed on the correlation between solar and environmental activities. Using a binomial test on the international sunspot numbers, Yeung (2006) established a relationship between influenza pandemics and increases in solar activity; he showed that the seven pandemics between 1700 and 2000 correlated with rising solar activity, with the international sunspot number giving a sensitivity of 85.7% for pandemic detection (95% confidence interval: 59.8 to 100%; p = 0.019). Vaquero and Gallego (2007) used group sunspot numbers to test those results. Their results were much the same as those of Yeung (2006) for 1880 and earlier. Researchers have also recently noted that influenza pandemics are infectious diseases that have been a public health problem for centuries.

Solar activity, including total solar radiation and X-ray flares, correlates with sunspot numbers. During the solar maximum, large numbers of mass ejections and solar flares release high-energy and low-energy solar particles (Hathaway and Wilson 2004). The intensity of cosmic rays approaches its average and minimum levels, respectively, under minimum and normal solar activity (O’Sullivan 2007). The Earth is shielded by a magnetic field against solar particles and cosmic rays. However, during sunspot maxima and minima, this magnetic field cannot hold back all of these particles. Three molecular mechanisms, namely, point mutation, gene recombination, and gene reassortment, are responsible for emerging pandemic virus strains (Domingo 2010). Solar and cosmic rays may act as a physical mutagen that causes point mutations, which could contribute to a pandemic caused by COVID-19 strains. Point mutations in RNA, which occur on a more or less constant basis, cause relatively small changes in the virus, known as genetic drift. However, the rise of pandemic strains needs significant genetic transformations, or shifts, involving a substantial reassortment of viral genes, including the possibility of adding new segments of RNA that originate from outside the Earth (Hoyle and Wickramasinghe 1961, 1990a, b). In the latter case, changes in the Earth’s electromagnetic environment, which is in turn controlled by the solar magnetic field and solar wind, can affect the transport of virions of external origin to the Earth. The magnetic field of the Sun reverses through the solar cycle, and at times of solar maxima and minima, it may be reasonable to expect more efficient transport of virions from the stratosphere to the ground (Wickramasinghe 2014). While the possibility of externally produced virions may be considered controversial, it is consistent with aspects of the epidemiology of infectious diseases that cannot be explained otherwise.
Genetic recombination is evidenced by the multiplicity reactivation and cross-reactivation of influenza viruses, which were shown in laboratory studies during the 1950s and 1960s to occur after irradiation with ultraviolet light and gamma rays (Barry 1961; Henle and Liu 1951; Kilbourne 1957). Recombination and reassortment occur in RNA viruses at extremely variable frequencies. Influenza is a segmented RNA virus that can change its genetics through antigenic drift or antigenic shift. The latter consists of the reassortment of gene segments between virus particles, which can generate the antigenically novel, highly infectious strains associated with global pandemics (Hsieh et al. 2006). The viral acquisition of genetic material from birds or other species, which can in turn carry new virions from space, can also result in new subtypes, including COVID-19, as products of genetic reassortment. Since 1918, nearly all influenza pandemic strains have been experimentally identified as recombinants of human, avian, or swine influenza viruses (Cohen 2009; Schäffr et al. 1993; Taubenberger et al. 2001), although this explanation may not account for all the facts, especially those regarding epidemiology (Hoyle and Wickramasinghe 1961, 1990a, b; Weinstein 1976). Maximum or minimum sunspot activity can have a significant impact on the atmosphere and can lead to storms, hurricanes, and extreme winters (Gachari et al. 2014; Ineson et al. 2011). Weather conditions are also a factor affecting the spread and outbreak of infectious diseases on the Earth (Ahmadi et al. 2020).

In this paper, we assess the possible effects of sunspot numbers on the appearance of viruses in the world. The literature offers insufficient results on this phenomenon. In this study, we therefore try to connect solar extrema to virus generation and the history of pandemics. We used wavelet decomposition to smooth the sunspot cycle, compared the smoothed cycle with past pandemics, and forecast the times of possible future virus generation using multi-step autoregression (MSAR). Finally, we investigate the geographical appearance of viruses to identify vulnerable places in the world.

Methods and materials

Wavelet decomposition

Wavelet decomposition passes the input signal x through a low-pass filter g and a high-pass filter h, as given in Eqs. (1) and (2). The filter outputs, sub-sampled by 2, are calculated as follows:

$${y}_{Low\ pass}\left[n\right]=\sum \limits_{k=-\infty}^{\infty }x\left[k\right]g\left[2n-k\right]$$
(1)
$${y}_{High\ pass}\left[n\right]=\sum \limits_{k=-\infty}^{\infty }x\left[k\right]h\left[2n-k\right]$$
(2)

The detail coefficients (from yHigh pass) and the approximation coefficients (from yLow pass) constitute the decomposition results. Down-sampling the filter outputs allows the main features of the signal to be extracted while outliers are removed. Figure 1 shows the process. Because each filter removes half of the signal’s frequencies, half of the samples can be discarded under Nyquist’s rule. The output of the low-pass filter g is therefore sub-sampled by 2 and processed further by passing it through a new low-pass filter g and a new high-pass filter h, each with half the cut-off frequency of the previous one. This decomposition halves the time resolution, since only half of each filter output characterizes the signal. However, each output covers half of the input’s frequency band, so the frequency resolution is doubled.

Fig. 1 The process of four-level 1-D wavelet decomposition

In this process, the input signal is decomposed over four levels into five sub-bands: the detail sub-bands hi[n], i = 1, …, 4, and the approximation sub-band g4[n] from the last level. Taken together, these sub-bands can be transformed back to rebuild the original signal x[n]. The main feature of the signal is carried by gi[n], so that after the four decomposition levels the main sub-band is g4[n]. In some studies, the hi[n] are treated as outliers, and in others they are used as features of the signal (Akansu et al. 2010).
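As a minimal sketch of the filter-bank scheme in Eqs. (1) and (2), the following Python code performs a four-level decomposition. The Haar wavelet is an assumption made here for simplicity, since the wavelet family is not stated above; the function names `haar_step` and `wavelet_decompose` and the synthetic input signal are likewise illustrative.

```python
import numpy as np

def haar_step(x):
    """One analysis level: return (approximation, detail), each half as long as x."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                       # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)   # low-pass filter g, then down-sampling by 2
    detail = (even - odd) / np.sqrt(2)   # high-pass filter h, then down-sampling by 2
    return approx, detail

def wavelet_decompose(x, levels=4):
    """Return (g_last, [h1, ..., h_levels]) by re-filtering the approximation each level."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# Illustrative input: a noisy sine wave of 256 samples (synthetic, not sunspot data).
rng = np.random.default_rng(0)
t = np.arange(256)
signal = np.sin(2 * np.pi * t / 128) + 0.1 * rng.standard_normal(256)

g4, (h1, h2, h3, h4) = wavelet_decompose(signal, levels=4)
print(len(g4), [len(h) for h in (h1, h2, h3, h4)])
```

Each level halves the number of coefficients, which mirrors the halving of time resolution described above. With an orthonormal wavelet such as Haar, the total squared energy of the sub-bands equals that of the input.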

Autoregressive models

An autoregressive (AR) model describes a time-varying signal as a random process. It relates each element of a time series linearly to its own past values through a stochastic difference equation. Related models include the autoregressive integrated moving average (ARIMA) model and the vector autoregressive (VAR) model, which involves multiple variables. For a time series Xt, the AR model is written as follows (Shumway and Stoffer 2017):

$${X}_t=\sum \limits_{i=1}^p{\phi}_i{X}_{t-i}+{\epsilon}_t+c$$
(3)
$${X}_t=\sum \limits_{i=1}^p{\phi}_i{B}^i{X}_t+{\epsilon}_t+c$$
(4)

where p is the order of the AR model and ϕi are the parameters of the model to be estimated. Also, c is a constant, and ϵt is white noise. Equation (3) can be written as Eq. (4) by defining B as the backshift operator, which shifts a series back by one time step. The AR model can be used to forecast a time-varying signal into the future or into the unobserved past. Out-of-sample forecasting is performed after the parameters have been estimated (Shumway and Stoffer 2017).
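Equation (3) can be illustrated with a short Python sketch that estimates c and ϕi by ordinary least squares and then iterates the recursion forward with the noise term set to zero. Least squares is only one of several possible estimators and is an assumption here; the helper names `fit_ar` and `forecast_ar` and the noise-free test series are hypothetical.

```python
import numpy as np

def fit_ar(x, p):
    """Estimate (c, phi) of an AR(p) model by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    # Column i holds X_{t-i} for t = p, ..., len(x) - 1.
    lags = np.column_stack([x[p - i:len(x) - i] for i in range(1, p + 1)])
    design = np.column_stack([np.ones(len(x) - p), lags])
    coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    return coef[0], coef[1:]

def forecast_ar(x, c, phi, steps):
    """Iterate Eq. (3) with the noise term set to zero to forecast `steps` values."""
    history = list(np.asarray(x, dtype=float))
    p = len(phi)
    for _ in range(steps):
        recent = history[-1:-p - 1:-1]           # X_{t-1}, ..., X_{t-p}
        history.append(c + float(np.dot(phi, recent)))
    return np.array(history[len(x):])

# Synthetic AR(2) series with known parameters c = 0.5, phi = (1.2, -0.5).
x = [1.0, 2.0]
for _ in range(28):
    x.append(0.5 + 1.2 * x[-1] - 0.5 * x[-2])

c, phi = fit_ar(x, p=2)
print(round(float(c), 3), np.round(phi, 3))
print(forecast_ar(x, c, phi, steps=3))
```

On real, noisy data the estimates are only approximate, and the forecast uncertainty grows with the number of steps.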

Sunspot number and cosmic rays

The sunspot number, or Wolf number, represents the number of sunspots on the surface of the Sun that can be seen from the Earth. The international sunspot number series is still produced today at the observatory in Brussels (SIDC 2020). The series shows an approximate periodicity of 11 years, the solar cycle, which was first found by Heinrich Schwabe in 1843 and is therefore sometimes referred to as the Schwabe cycle. The periodicity is not constant but varies roughly in the range of 9.5 to 11 years. The international sunspot number series extends back to 1700 with annual values, while daily values exist only since 1818.

The relative sunspot number is an index of the activity of the entire visible disk of the Sun. It is determined each day without reference to preceding days. An isolated cluster of sunspots is referred to as a sunspot group g; it can consist of one or a large number of distinct spots s, whose size may range from 10 or more square degrees of the solar surface down to the limit of resolution (e.g., 1/25 square degree). The relative sunspot number R, compiled as a daily index of sunspot activity, is computed using the formula:

$$R=I\left(10g+s\right)$$
(5)

where I is a scale factor that varies with location and instrumentation and is known as the observatory factor or the personal reduction coefficient. Also, g is the number of sunspot groups, and s is the number of individual spots. A different index of sunspot activity, based on the number of groups appearing on the solar disc, was introduced in 1998 (Hoyt and Schatten 1998). This index made it possible to include sunspot data acquired since 1609, around the time the telescope was invented. While sunspots, with temperatures of roughly 3,000 to 4,500 K, are cooler than the rest of the Sun’s surface, the hotter flares not only compensate for the coolness of the spots but also produce excess heat. The part of the Sun’s surface near the spots becomes very hot and erupts, and the hot gas is ejected into space as a solar eruption. As a result, approximately once every 11 years, when the number of these spots and hence of eruptions reaches a maximum, the Earth gets warmer, and drought can be expected on the Earth’s surface (Table 1).
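As a worked illustration of Eq. (5), suppose an observer with a hypothetical observatory factor I = 0.6 counts g = 5 sunspot groups containing s = 23 individual spots on a given day. These values are chosen only for the arithmetic and are not taken from the dataset:

$$R=0.6\left(10\times 5+23\right)=0.6\times 73=43.8$$

Each group contributes ten times as much as a single spot, so the index is driven mainly by the number of groups when the spots are few.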

Table 1 Frequency and relative energy of the wavelet decomposition sub-bands

Results and discussion

Data description

The WDC-SILSO is an activity of the Operational Directorate “Solar Physics and Space Weather,” also known internationally as the Solar Influences Data Analysis Center (SIDC). Its mission is to preserve, develop, and diffuse the knowledge of the long-term variations of solar activity, as a reference input to studies of the solar cycle mechanism and the solar forcing on the Earth’s climate. The work carried out at WDC-SILSO is supervised by the IAU, URSI, and IAGA (Royal Observatory of Belgium 2015), which endorse the World Data Center SILSO with regard to the scientific usefulness and quality of the sunspot data it produces and archives. The dataset of this study, consisting of daily and monthly average sunspot numbers, was extracted from SIDC (SILSO World Data Center n.d.).

Results of the decomposition and smoothing of sunspot number

Sunspot numbers are recorded as daily values that fluctuate strongly. Modeling and analyzing such an oscillating time series is mathematically challenging, and the daily sunspot number is no exception. We therefore selected methods suited to the analysis and modeling of this fluctuating time series, which is shown in Fig. 2a. To obtain a smoother plot and avoid unexpected errors, the monthly average sunspot number is taken as the main time series. In analyzing the times at which pandemics occurred, the extrema of the monthly average plot must be preserved, because the literature review suggests that sunspot maxima and minima are a probable cause of pandemics. Therefore, in this paper, we used wavelet decomposition to reduce the fluctuation and extract the main feature of the monthly averaged sunspot number. To this end, we used a four-level one-dimensional wavelet decomposition, as shown in Fig. 1.

Fig. 2 Results of the sunspot smoothing process: a daily sunspot number and the monthly average value of the sunspot number from 1750 to 2020; b results of wavelet decomposition on the monthly average sunspot number; c fitting plot of the smoothed sunspot number in comparison with the monthly average values; d smoothed plot of the monthly average sunspot number based on wavelet decomposition

The sub-bands h1 to h4 contain the extracted fluctuating features of the sunspot number, and the smoothed values are given by the approximation coefficients of the 1-D wavelet decomposition in the g4 sub-band. Each of the wavelet sub-bands is depicted in Fig. 2b.

The g4 sub-band holds 95.38% of the relative energy of the original sunspot number time series and can be substituted for the monthly average values. As the fitting plots in Fig. 2c, d show, it fits the original values closely. We treat the other sub-bands as outliers and use only the g4 sub-band in the subsequent analysis. The resulting sub-band preserves the relative extrema of the monthly average sunspot number with high accuracy.
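The relative energy of each sub-band can be illustrated with the following sketch. It assumes that relative energy means the share of the total squared coefficients held by a sub-band, and it assumes the Haar wavelet; neither choice is stated in the text. The input is a synthetic monthly series, not the SILSO data, and the names `haar_subbands` and `relative_energy` are hypothetical.

```python
import numpy as np

def haar_subbands(x, levels=4):
    """Return {'h1': ..., 'h4': ..., 'g4': ...}; len(x) must be divisible by 2**levels."""
    approx = np.asarray(x, dtype=float)
    bands = {}
    for level in range(1, levels + 1):
        even, odd = approx[0::2], approx[1::2]
        bands[f"h{level}"] = (even - odd) / np.sqrt(2)   # detail coefficients
        approx = (even + odd) / np.sqrt(2)               # approximation coefficients
    bands[f"g{levels}"] = approx
    return bands

def relative_energy(bands):
    """Share of the total squared-coefficient energy held by each sub-band, in percent."""
    energy = {name: float(np.sum(c ** 2)) for name, c in bands.items()}
    total = sum(energy.values())
    return {name: 100.0 * e / total for name, e in energy.items()}

# Synthetic monthly series: an 11-year (132-month) cycle plus noise.
rng = np.random.default_rng(1)
months = np.arange(1024)
series = 80 + 70 * np.sin(2 * np.pi * months / 132) + 10 * rng.standard_normal(1024)

bands = haar_subbands(series)
shares = relative_energy(bands)
# For Haar, each g4 coefficient is 4 times the mean of a block of 16 samples,
# so the smoothed series is the block mean repeated over each block.
smoothed = np.repeat(bands["g4"] / 4.0, 16)
print({name: round(share, 2) for name, share in shares.items()})
```

A slowly varying series concentrates almost all of its energy in the approximation sub-band, which is why that sub-band alone can serve as the smoothed signal.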

History of sunspot number and pandemics

In this section, we analyze the coincidence of the world’s great pandemics with the relative extrema of the sunspot number. Figure 3 marks the history of the world’s great pandemics on the sunspot number plot and shows the connection between the sunspot and infection cycles from 1750 to 2020. According to Fig. 3, the main pandemics occurred at relative extrema of the sunspot number. The sunspot number rises to a maximum almost every 11 years and takes almost 5 years to fall from a maximum to a minimum. Based on the literature review, this cycle also causes great disruption in the world. The recent COVID-19 virus also emerged at a relative minimum of the sunspot number in 2019. Before COVID-19, the epidemics of MERS and Ebola, swine flu, and SARS occurred at the preceding cyclic extrema, at a maximum, a minimum, and a maximum, respectively. Table 2 lists the history of pandemics that occurred in the world. The first column gives the time interval of each pandemic outbreak as reported in the references, and the second column gives the time of the corresponding extremum of the monthly average sunspot number. Pandemics highlighted in orange are those whose first observation coincides with a sunspot extremum. Green rows mark pandemics reported 1 year after the sunspot extremum; the delay could be due to errors in the pandemic reports of some governments or in the historical record. In total, 27 of the 36 pandemics coincide with a sunspot extremum. Other great pandemics, such as the Hong Kong flu, African trypanosomiasis, cholera, typhus, and malaria, are shown in white rows and are described by time intervals; for these, the nearest extremum falls within the reported interval. Considering all pandemics together with the extremum points obtained, we conclude that sunspot extrema coincide with the first appearance of pandemics, probably because of mutations in the viral genome or the generation of a new virus. This is probably due to the magnetic field to which the Earth is exposed during the periods with the highest and lowest numbers of sunspots.
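The coincidence check described in this section can be sketched as follows: locate the relative extrema of a smoothed series and test whether an outbreak year falls within 1 year of one of them. The yearly series and the event years below are synthetic placeholders rather than entries from Table 2, and the helper names `local_extrema` and `coincides` are hypothetical.

```python
import numpy as np

def local_extrema(values):
    """Return indices of strict local maxima and minima of a 1-D series."""
    v = np.asarray(values, dtype=float)
    inner = np.arange(1, len(v) - 1)
    maxima = inner[(v[1:-1] > v[:-2]) & (v[1:-1] > v[2:])]
    minima = inner[(v[1:-1] < v[:-2]) & (v[1:-1] < v[2:])]
    return maxima, minima

def coincides(event_year, extremum_years, tolerance=1):
    """True if an event year lies within `tolerance` years of any extremum year."""
    return any(abs(event_year - y) <= tolerance for y in extremum_years)

# Synthetic yearly series with an 11-year cycle (placeholder, not sunspot data).
years = np.arange(1900, 1960)
series = 80 + 70 * np.cos(2 * np.pi * (years - 1905.3) / 11)

max_idx, min_idx = local_extrema(series)
extremum_years = sorted(years[max_idx].tolist() + years[min_idx].tolist())

# Hypothetical outbreak years used only to exercise the check.
for event in (1906, 1913, 1944):
    print(event, coincides(event, extremum_years))
```

Because extrema recur roughly every 5 to 6 years in such a series, a 1-year tolerance already covers a sizeable share of all years, which should be kept in mind when interpreting coincidence counts.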

Fig. 3 The plot of coincidence of the world’s great viral pandemics on the monthly averaged sunspot number

Table 2 History of pandemics in the world between 1772 and 2020

In Fig. 3, the world’s great pandemics are marked on the monthly averaged sunspot number from 1750 to 2020. In each cycle, virus generation occurs in a particular region of the world. Figure 3 also contains some extremum points without any reported virus incidence. At these points, viruses were probably generated that were perhaps weak or did not reach humans. Moreover, according to the literature, humans are secondary hosts of viruses and are usually infected by an animal (Mallapaty 2020).

Forecasting future value of sunspot number

In this section, we forecast future values of the sunspot number based on the values reported from 1750 to 2020. We examined several methods, from traditional ARIMA to modern long short-term memory (LSTM) networks with different architectures. To obtain a meaningful result, we present an MSAR method, in which we analyze the times of the extrema in the sunspot number time series. The results show that extrema of the sunspot number recur almost every 11 years. As Fig. 4 shows, the interval ranges from 8.5 to 14.5 years.

Fig. 4 Time interval length for relative maximum (blue) and relative minimum (red) points

To forecast sunspot numbers for future years, we apply MSAR in three steps. First, we forecast the maximum values for ten cycles. Second, we forecast the minimum values of the sunspot number. Both sets of forecast extrema are close to the real data. Third, we forecast the time interval between extrema (the interval between a maximum and the next minimum). This completes the three-step MSAR method. Based on the results in Fig. 5, we forecast future pandemics in the world for about 110 years, or 10 cycles. The results show that the minima oscillate with low amplitude, whereas the maxima show high variance in the sunspot number. This pattern can be seen in both the past and the forecast values. In some periods, the difference between maxima and minima is smaller than in other periods, for example, from before 1802 to 1830, 1883–1928, and 2019–2030. These periods are associated with many types of viruses in the world, and viral diseases similar to influenza were reported to be prevalent. Future genetic investigations are needed to confirm these claims, although the mathematical evidence indicates that the oscillation amplitude is important for the prevalence of viruses in the world.
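The three steps can be read as three separate autoregressive forecasts, one for each of the sequences of maximum values, minimum values, and maximum-to-minimum intervals. The sketch below follows that reading, which is an assumption, since the details of MSAR (model order, estimator, and how the steps are combined) are not specified here. All input numbers are random placeholders, and `ar_forecast` and `three_step_forecast` are hypothetical names.

```python
import numpy as np

def ar_forecast(series, p, steps):
    """Fit an AR(p) model by least squares and forecast `steps` values ahead."""
    x = np.asarray(series, dtype=float)
    lags = np.column_stack([x[p - i:len(x) - i] for i in range(1, p + 1)])
    design = np.column_stack([np.ones(len(x) - p), lags])
    coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    history = list(x)
    for _ in range(steps):
        recent = history[-1:-p - 1:-1]                   # last p values, newest first
        history.append(coef[0] + float(np.dot(coef[1:], recent)))
    return np.array(history[len(x):])

def three_step_forecast(max_values, min_values, intervals, p=2, cycles=10):
    """Forecast maxima, minima, and max-to-min intervals independently."""
    return {
        "max_values": ar_forecast(max_values, p, cycles),   # step 1
        "min_values": ar_forecast(min_values, p, cycles),   # step 2
        "intervals": ar_forecast(intervals, p, cycles),     # step 3
    }

# Placeholder inputs for 24 past cycles (random numbers, not measured values).
rng = np.random.default_rng(2)
past_max = 150 + 40 * rng.standard_normal(24)   # maximum sunspot numbers
past_min = 8 + 3 * rng.standard_normal(24)      # minimum sunspot numbers
past_gap = 5.5 + 0.8 * rng.standard_normal(24)  # years from a maximum to the next minimum

forecast = three_step_forecast(past_max, past_min, past_gap)
for name, values in forecast.items():
    print(name, np.round(values, 1))
```

Forecasting the three sequences independently ignores any dependence between cycle amplitude and cycle length; a joint model such as VAR would be one way to capture it.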

Fig. 5 Forecasting values of sunspot number for 10 cycles

Global retrospective of pandemics and sunspot number extrema

In this section, we present a retrospective view of the history of pandemics by geographical location. Figure 6 shows the pandemics that occurred from 1750 to 2019 on the world map. Points are placed at approximate locations and may correspond to nearby areas. The main goal of this section is to find a meaningful reason for the appearance of viruses in these places.

Fig. 6 History of pandemics on the world map. Red circles: pandemics in maximum sunspot number. Green circles: pandemics in minimum sunspot numbers. Blue lines: next epidemic is in the east of previous one. Red lines: next epidemic is in the west of previous one. Green surface: novel pandemic area. Yellow surface: old pandemic area

In this map (Fig. 6), we mark pandemics according to the sunspot extremum and the direction from the previous pandemic. Red circles show pandemics that occurred at a sunspot maximum, and green circles show pandemics at a sunspot minimum. The map shows no significant difference between the locations of pandemics at sunspot maxima and minima; both are scattered across the map. Blue lines indicate that the next pandemic occurred to the east of the previous one, and red lines indicate the opposite. Based on the pandemics that occurred from 1918 until 2019 (9 cycles or 101 years), the main region in which viruses first appeared extends from China to the American continent, shown as the green surface in Fig. 6. The main origin of viruses is distributed among three areas: China, West Africa, and the USA. However, reports show that before 1918, pandemics were widespread in the yellow surface in Fig. 6. In this paper, we present the hypothesis of a relationship between solar activity and virus generation in history. The initial observational results show that the main times of pandemic occurrence coincide with solar extrema, some at sunspot maxima and some at sunspot minima. We add that the biological generation of viruses may be related to the magnetic field or to other radiation. Moreover, because the first hosts of new viruses were animals, the small difference between the reported pandemic times and the solar extrema is explicable. Future experimental analysis is needed to test this hypothesis.

Conclusion

Solar activity, including total solar radiation and X-ray flares, correlates with sunspot numbers, and the solar magnetic field and solar wind may affect the Earth in ways that give rise to viruses. To analyze the history of pandemics in the world, we analyzed solar activity between 1750 and 2020. The dataset of the study, consisting of daily and monthly average sunspot numbers, was extracted from SIDC. The input signal of sunspot numbers was decomposed over four levels into wavelet sub-bands to reduce the fluctuation of the monthly averaged sunspot number and extract its main feature. The low band of the wavelet decomposition holds 95.38% of the relative energy of the original sunspot number time series and can be substituted for the monthly average values while preserving the extrema. We analyzed the coincidence of the world’s great pandemics with the relative extrema of the sunspot number. The results show that the world’s main pandemics occurred at relative extrema of the sunspot number. The sunspot number rises to a maximum almost every 11 years and takes almost 5 years to fall from a maximum to a minimum. We conclude that sunspot extrema coincide with the first appearance of pandemics, probably because of mutations in the viral genome or the generation of a new virus. This is probably due to the magnetic field to which the Earth is exposed during the periods with the highest and lowest numbers of sunspots. Some extremum points have no reported virus incidence; at these points, viruses were probably generated that were perhaps weak or did not reach humans. We then forecast future pandemics in the world for about 110 years, or 10 cycles, using the presented MSAR method. The results show that the minima oscillate with low amplitude, whereas the maxima show high variance in the sunspot number.
The map also shows no significant difference between the locations of pandemics at sunspot maxima and minima; both are scattered across the map. Based on the pandemics that occurred from 1918 until 2019 (9 cycles or 101 years), the main region in which viruses first appeared extends from China to the American continent, shown as the green surface in Fig. 6. The main origin of viruses is distributed among three areas: China, West Africa, and the USA. However, reports show that before 1918, pandemics were widespread in the yellow surface in Fig. 6. The results of this study suggest that sunspot activity may be a main factor in virus generation in the world and that biological and astrophysical mechanisms are related to the generation of world pandemics such as COVID-19. This study indicates that the emergence of pandemics such as COVID-19 may be linked with the extrema of the sunspot cycle, and it explores potential mechanisms by which sunspot behavior can affect human outbreaks. Furthermore, we propose that the next peaks in the sunspot cycle could serve as a potential alert for future pandemics.