Extreme Space-Weather Events and the Solar Cycle

Space weather has long been known to approximately follow the solar cycle, with geomagnetic storms occurring more frequently at solar maximum than solar minimum. There is much debate, however, about whether the most hazardous events follow the same pattern. Extreme events, by definition, occur infrequently, and thus establishing their occurrence behaviour is difficult even with very long space-weather records. Here we use the 150-year aaH record of global geomagnetic activity with a number of probabilistic models of geomagnetic-storm occurrence to test a range of hypotheses. We find that storms of all magnitudes occur more frequently during an active phase, centred on solar maximum, than during the quiet phase around solar minimum. We also show that the available observations are consistent with the most extreme events occurring more frequently during large solar cycles than small cycles. Finally, we report on the difference in extreme-event occurrence during odd- and even-numbered solar cycles, with events clustering earlier in even cycles and later in odd cycles. Despite the relatively few events available for study, we demonstrate that this is inconsistent with random occurrence. We interpret this finding in terms of the overlying coronal magnetic field and enhanced magnetic-field strengths in the heliosphere, which act to increase the geoeffectiveness of sheath regions ahead of extreme coronal mass ejections. Putting the three “rules” together allows the probability of extreme-event occurrence for Solar Cycle 25 to be estimated, if the magnitude and length of the coming cycle can be predicted. This highlights both the feasibility and importance of solar-cycle prediction for planning and scheduling of activities and systems that are affected by extreme space weather.


Introduction
Plasma, magnetic-field, and energetic-particle variability in the near-Earth space environment is collectively termed "space weather." In this study we focus on geomagnetic storms, global disturbances to the terrestrial system that are triggered by arrival of large-scale solar-wind structures in near-Earth space (Gonzalez, Tsurutani, and Clúa de Gonzalez, 1999; Richardson, Cane, and Cliver, 2002; Schwenn, 2006; Kilpua et al., 2017). The most extreme geomagnetic storms are known to be driven by coronal mass ejections (CMEs; Gosling, 1993; Schwenn, 2006; Webb and Howard, 2012). Geomagnetic storms can have a number of adverse effects on space- and ground-based technologies, as well as posing a health hazard to humans in space or on high-altitude flights (Lockwood and Hapgood, 2007; Cannon et al., 2013). Greater understanding of space weather, both from a first-principles, physics-based approach and from empirical relations, can help mitigate such effects. This mitigation can be broadly divided into three forms.
Perhaps the simplest mitigation strategy is to use the known climatology of space weather to build systems with the appropriate resilience, i.e. engineer systems able to survive the expected number and intensity of storms over a system's lifetime. This does not require prediction of the timing of a space-weather event, but it does require knowledge of the maximum intensity that is likely to be encountered over a given extended period. Of course, building in such resilience comes at a cost, particularly for spacecraft hardware, and thus there is an incentive to not "over-engineer". Estimates of maximum intensity have been produced using theoretical arguments (e.g. Cliver and Dietrich, 2013; Sitnov et al., 2020), but the most reliable estimates come from statistical analysis of long-duration historic data sets (Riley, 2012; Riley and Love, 2017). The difficulty is the limited length of suitable homogeneous records. Thus advanced statistical methods, such as extreme-value theory (e.g. Embrechts and Schmidli, 1994), must be used to quantify the magnitude of events with recurrence times comparable to the record length, for example the magnitude of a 1-in-100-years event from geomagnetic-index data (e.g. Elvidge and Angling, 2018; Love, 2020; Chapman, Horne, and Watkins, 2020).
The cost of engineering systems to survive the most extreme storms can be reduced if systems can be protected by temporary action and the timing of damaging space-weather events can be reliably forecast with sufficient lead time (MacNeice et al., 2018). For geomagnetic-storm prediction, forecast lead time is approximately one to four days, which is the typical travel time of a CME from Sun to Earth (Gopalswamy et al., 2001). The subsequent geoeffectiveness of a CME is particularly difficult to forecast, owing to the critical role of the internal magnetic-field structure (and/or that in the sheath region ahead of it). The out-of-ecliptic component of the magnetic field is particularly important (Dungey, 1961), and this cannot be reliably determined until the CME arrives at Earth (e.g. Chen, Cargill, and Palmadesso, 1997; Kilpua et al., 2019).
Finally, there is medium-term probabilistic forecasting, which sits between climatology and individual-event forecasting. For example, in terrestrial weather forecasting, it may not be possible to reliably forecast the time of individual thunderstorms, but the probability of a thunderstorm on any given day is much higher in summer than winter. Similarly, the probability of a moderate geomagnetic storm is well known to be higher at solar maximum than solar minimum. As the solar cycle is approximately 11 years long (although with a range spanning approximately 9-14 years, e.g. van Driel-Gesztelyi and Owens, 2020), it is reasonable to assume that the probability of a moderate storm will be significantly higher in 2025 (likely close to solar maximum) than it is at present (2021, solar minimum). This kind of information is useful for long-term planning, e.g. scheduling of power-grid maintenance, space-mission launch dates, or planning end-of-life satellite de-orbiting to prevent the accumulation of "space junk". It is also useful for system design, such as space electronics, corroding pipelines, and deteriorating power transformers, wherein integrated damage matters in addition to the "knockout blow" of a major event.
In this article we briefly review the known trends in geomagnetic-storm occurrence (Section 2), before investigating whether trends established for more moderate storms also hold for more extreme events. This is done by statistical comparison to probability models (Section 4). The general approach takes a similar philosophy to extreme-value statistics, in that it is assumed that trends established at lower event intensities (but still within the tail of the distribution) can be used to understand the behaviour of the most extreme events.

Background
Space weather has long been known to follow the solar cycle, with a greater probability of space-weather events at times of solar maximum than at solar minimum. This was established using auroral records (Dalton, 1834) and, later, geomagnetic records (e.g. Sabine, 1852;Mayaud, 1975;Feynman and Crooker, 1978). Such empirical trends can be established for moderate space-weather events by simply looking at the frequency of occurrence. As more extreme events are considered, however, fewer events are available (by definition), and statistically establishing patterns in the occurrence times becomes ever more challenging. This has been aptly termed the "data paucity curse" (Sitnov et al., 2020); to define a 1-in-100-years event requires data covering several hundred years. Consequently, there has been much debate about whether the occurrence probability of severe space weather (e.g. events of magnitude that occur less than once in ten years) is influenced by the solar cycle.
A number of studies have recently revisited the issue of extreme-event occurrence and its relation to the solar cycle. The aa-index of geomagnetic activity (Mayaud, 1980) is derived from two geomagnetic stations and widely used for studying the occurrence of extreme storms, owing to its approximately 150-year duration. Kilpua et al. (2015) used aa with a range of storm thresholds and concluded that weaker storms occur most frequently in the declining phase, while stronger storms cluster near solar maximum. Vennerstrom et al. (2016) considered storms above a single threshold (three-hourly aa > 300 nT, giving 105 events in a 142-year interval). Visually, there seems to be a reasonably strong correspondence between the occurrence of storms and the solar cycle. However, this is not quantified and they conclude that "storms occurred in all phases of the solar cycle, i.e. not just during solar maximum and in the declining phase where geomagnetic activity is usually strongest, but also frequently in the rising phase and even, at more than one occasion, close to solar minimum." Lockwood et al. (2018a,b) recently developed aaH, which is aa with additional processing to account for changes in the locations of the stations providing data and for secular changes in the Earth's magnetic field. Lockwood et al. (2019) used aaH to show that the top 100 and top 6 storms, defined using the maximum aaH-value, were clustered during years of elevated mean geomagnetic activity level. This suggests a solar-cycle ordering of extreme storms, as annual mean geomagnetic activity shows a strong solar-cycle trend.  examined both aaH and flare occurrence and concluded that there was solar-cycle ordering for all thresholds of events considered. At higher event magnitudes, however, there does not appear to be a linear relation between sunspot number and storm occurrence. Instead, storm occurrence is bimodal, with a period of very low storm occurrence centred on solar minimum, and a period of higher storm occurrence centred on solar maximum. Chapman, Horne, and Watkins (2020) concluded that larger storms tend to occur at times of higher sunspot number (we note that this could be because there is a solar-cycle relation, or because they occur in large solar cycles, or both). Thus, on balance, there seems to be evidence that the occurrence probability of extreme storms is modulated by the solar cycle. However, in all these studies the very small sample size for the most extreme events makes it difficult to conclude whether the apparent solar-cycle ordering is simply a chance occurrence.
There is also debate about whether the magnitude of a solar cycle increases the likelihood of extreme storms. Vennerstrom et al. (2016) state that "strong activity cycles [high sunspot number] in general appear to foster more storms than the weaker cycles, but the pattern is not very clear, probably because [approximately 1-per-year magnitude] storms are relatively rare." Kilpua et al. (2015) looked in more detail at the correlation between the size of a solar cycle (in terms of the maximum sunspot number) and the occurrence rate of storms. They found a positive correlation at low storm thresholds, but the correlation declined with increasing threshold. Thus they concluded that "the quieter Sun can also launch superstorms." This is in agreement with the oft-quoted anecdote that the Carrington event (the superstorm of 1859) occurred during a relatively small solar cycle. However, due to the small sample size of extreme storms, one might expect the correlation to be weak and to fail standard tests of statistical significance.
We here test both these relations using models of storm-occurrence probability.

Data and Storm Selection
The solar-cycle timing and magnitude are determined using monthly sunspot number, available from www.sidc.be/silso/datafiles. The start of a solar cycle is identified using the discontinuous change in the average latitude of the sunspots, as outlined by Owens et al. (2011). These times are provided as part of the GitHub code and data repository: www.github.com/University-of-Reading-Space-Science/ExtremeEvents. We further assume that Solar Cycle 25 began at the end of 2020. Using these solar-minimum times, solar-cycle phase is computed as zero at the start of the cycle and increasing linearly to unity at the end.
Analysis of extreme events requires long, homogeneous records. Thus we use the longest contiguous record of geomagnetic activity: the aa-index (Mayaud, 1975). This has recently been rebuilt from individual station data, correcting for changes in station location and the long-term motion of the geomagnetic poles (Lockwood et al., 2018a,b). The resulting index [aaH] is used here. The middle panel of Figure 1 shows aaH daily (red), 27-day (black), and annual (white) mean values.
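The solar-cycle phase defined above is a simple linear interpolation between successive cycle start times. A minimal sketch is given below; the function name and the example boundary times are hypothetical and are not taken from the repository.

```python
from bisect import bisect_right

def solar_cycle_phase(t, cycle_starts):
    """Phase of time t within its cycle: 0 at the cycle start, rising linearly to 1 at the next start.

    t and cycle_starts share the same time unit (here decimal years);
    cycle_starts must be sorted in ascending order.
    """
    i = bisect_right(cycle_starts, t) - 1
    if i < 0 or i + 1 >= len(cycle_starts):
        raise ValueError("t lies outside the known cycle boundaries")
    start, end = cycle_starts[i], cycle_starts[i + 1]
    return (t - start) / (end - start)

# Hypothetical cycle start times, for illustration only; the boundaries used
# in the study are those distributed with the repository.
example_starts = [2009.0, 2020.0, 2031.0]
print(solar_cycle_phase(2014.5, example_starts))  # 0.5
```

Normalising by the length of each individual cycle is what allows cycles of different duration to be compared on a common phase axis.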
Figure 1 Time-series of the data used in this study. Top: Monthly sunspot number, with even-numbered solar cycles (defined from sunspot minimum to sunspot minimum, meaning the solar magnetic field flips approximately in the middle of the cycle) shaded grey. Middle: The aaH-record at daily (red), 27-day (black) and annual (white) resolution. Bottom: The occurrence probability of geomagnetic storms using different thresholds for storm magnitude. The 90-, 99-, 99.9-, and 99.99th percentiles of aaH are 37, 77, 165, and 290 nT, respectively, and correspond approximately to 1-in-2-weeks, 1-in-3-months, 1-in-3-years, and 1-in-25-years level events.
A number of different definitions of storms have been proposed using the aa- (or aaH-) index (Kilpua et al., 2015; Vennerstrom et al., 2016; Haines et al., 2019). These methods generally use a threshold to define storm start and end times from the geomagnetic-index time series, but also permit short excursions below the threshold in order to avoid defining multiple events in quick succession. In this study, we are required to generate synthetic time series of storm occurrence many thousands of times at a range of thresholds (see Section 4). For this reason, we use a very simple storm definition. As storms typically have a duration of around one day, they are here defined by applying thresholds to calendar-day means of aaH. Thus storms split over calendar days will potentially be reduced in magnitude and/or storms from a single driver could produce multiple events. But in general there is very close correspondence between storms identified by thresholds applied to one-day means of aaH and the more complex methods, particularly for more extreme storms defined by higher aaH-thresholds, which are the focus of this study. For example, the top six storms defined by exceeding the 99.99th percentile of one-day aaH are the exact same events as the top six events reported by Lockwood et al. (2019), and we generate very similar occurrence statistics to Haines et al. (2019). Using one-day means also has the advantage of naturally focusing more on the solar generation of storm drivers, rather than internal magnetospheric processes. The UT of a storm onset influences the storm intensity (Lockwood et al., 2021). Thus in terms of understanding the variation of storms over the solar cycle, the UT variations can be considered a semi-random element and taking one-day means of aaH helps suppress this effect. Similarly, using daily averages of aaH also means that we are focusing on the more sustained, solar-wind-driven activity of geomagnetic storms, rather than isolated, intense substorm activity, which can often be independent of storms and may be more dependent on internal magnetospheric processes (Hajra et al., 2016).
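The storm definition used here, a percentile threshold applied to calendar-day means, can be sketched as follows. The data are synthetic (log-normal values with arbitrary parameters, not real aaH), and the function names and the nearest-rank percentile convention are assumptions made for illustration.

```python
import math
import random

def percentile_threshold(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

def fraction_of_storm_days(daily_by_year, threshold):
    """Fraction of days in each year whose daily mean exceeds the threshold."""
    return {year: sum(v > threshold for v in vals) / len(vals)
            for year, vals in daily_by_year.items()}

def return_period_days(pct):
    """Mean spacing, in days, between days above the pct-th percentile."""
    return 100.0 / (100.0 - pct)

rng = random.Random(0)
# Synthetic stand-in for ten years of daily means (NOT real geomagnetic data)
daily_by_year = {year: [rng.lognormvariate(3.0, 0.6) for _ in range(365)]
                 for year in range(1868, 1878)}
all_days = [v for vals in daily_by_year.values() for v in vals]

threshold = percentile_threshold(all_days, 99.0)
prob = fraction_of_storm_days(daily_by_year, threshold)
n_storm_days = sum(v > threshold for v in all_days)

for pct in (90.0, 99.0, 99.9, 99.99):
    print(pct, round(return_period_days(pct) / 365.25, 2), "years between events")
```

The return-period arithmetic reproduces the approximate event levels quoted for the four percentiles: one day in 10, 100, 1000, and 10,000 corresponds to roughly two weeks, three months, three years, and 25 to 30 years.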
The bottom panel of Figure 1 shows the fraction of days per year that meet a particular storm threshold. We refer to this property as the storm-occurrence probability. In this study we do not use a fixed threshold to define an "extreme event", as previous definitions have differed widely. Instead we use the term "extreme" in relative terms and refer specifically to the storm thresholds being used. Figure 1 shows four storm thresholds in the daily mean aaH: 37, 77, 165, and 290 nT, which are the 90-, 99-, 99.9-, and 99.99th percentiles, respectively, of the whole 1868-2018 sequence. These thresholds result in 5479, 548, 55, and 6 storm days, respectively. These thresholds therefore correspond to approximately 1-in-2-weeks, 1-in-3-months, 1-in-3-years, and 1-in-25-years level events. It is clear from the time series that the storms using the lowest threshold (90th percentile, in blue) show strong solar-cycle variations, both in terms of waxing and waning with solar minimum and maximum, and in terms of smaller solar cycles showing fewer storms. At more extreme storm thresholds, however, this is not immediately obvious.
Figure 2 shows a superposed-epoch analysis of sunspot number (panel a) and storm occurrence (panel b) over the solar cycle. Note that we normalise for the variable solar-cycle length by considering solar-cycle phase (i.e. 0 at start of a cycle, to 1 at the end). The mean sunspot-number variation peaks around 0.35 solar-cycle phase, with an extended decay, as expected (e.g. van Driel-Gesztelyi and Owens, 2020). Storm occurrence is ordered by the solar cycle, but it does not show the same temporal profile as sunspot number with solar-cycle phase. For the 99th percentile storms and above, there appear to be two distinct modes of storm occurrence: an active phase centred on solar maximum and spanning solar-cycle phase 0.18 to 0.79, and a quiet phase at other times. This is in broad agreement with the conclusions of . At the 90th and 99th percentiles, it seems fairly self-evident that the increased occurrence of storms during the active phase is not simply due to random chance (though this will be tested later). At the 99.9th and 99.99th percentiles, however, there are sufficiently few events that the possibility of random storm occurrence cannot be readily discounted.
Figure 3 Models of storm-occurrence probability. Top: Time series of the relative probability for the four models of storm occurrence used in this study. The Random model (blue) has equal probability of a storm at all times. The Phase model (red) has a factor of nine greater probability during active phase than quiet phase. The Phase+Amp model (black solid) has active-phase probability proportional to the mean sunspot number for the cycle. The EarlyLate model is the Phase+Amp model with modified probability in the early and late stages of the active phase depending on whether the cycle number is odd or even. Bottom: The associated cumulative distribution functions, used to generate model time series of storm occurrence.
Figures 2c and d show storm occurrence in odd and even cycles. For the 99th percentile storms, the active and quiet phases are both present at approximately the same solar-cycle phases. However, at higher thresholds, odd and even cycles show divergent behaviour. Even cycles show storms early in the active phase, while odd cycles show storms late in the active phase. As the data have been split in two compared with Figure 2b, there are even fewer events, and thus poor statistics may seem a plausible explanation for this separation. This will be tested statistically.

Modelling Storm Occurrence
In order to test the significance of the apparent trends reported in Section 3, we construct a number of models of storm-occurrence probability, shown in Figure 3. Each model is chosen to test one particular aspect of the observed storm-occurrence times, and/or to serve as a null hypothesis against which other models can be tested:
i) The "Random model" (blue line) assumes storms occur completely randomly and thus have equal probability to occur at any time in the period of consideration. Therefore the relative probability is constant at all times, as shown by the blue line in the top panel of Figure 3, which has been normalised to give total probability of one over the whole interval.
ii) The "Phase model" (red shaded area) assumes that the relative probability during active phases of the solar cycle (i.e. phase between 0.18 and 0.79) is a factor of nine greater than during the quiet phase. See Section 5 for more details.
iii) The "Phase+Amp model" (black solid line) modifies the Phase model by adjusting the probability in the active phase according to the amplitude of the solar cycle. The cycle amplitude [A] is taken to be mean sunspot number over the whole cycle. The probability during the active phase is scaled by a factor 1.5 A/<SSN>, where <SSN> is the mean sunspot number over the whole interval covered by the aaH-data set. See Section 6 for more details.
iv) The "EarlyLate model" (dashed line) further modifies the Phase+Amp model to account for the odd/even cycle variation. For even-numbered cycles, it increases the storm probability by 60% in the early period of the active phase and decreases probability by 60% in the late period of the active phase. For odd-numbered cycles, probability is decreased in the early period and increased in the late period. See Section 7 for more details.
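The four models can be expressed as a single relative-probability function of solar-cycle phase, normalised cycle amplitude, and cycle number. The sketch below is illustrative only: the function and constant names are hypothetical, the values are unnormalised, and the boundaries of the "early" and "late" periods are placeholders because the text does not state them. The odd/even behaviour follows the reported clustering of events earlier in even cycles and later in odd cycles.

```python
ACTIVE_START, ACTIVE_END = 0.18, 0.79  # active-phase limits, from the text
ACTIVE_RATIO = 9.0                     # active/quiet probability ratio, from the text
AMP_SCALE = 1.5                        # Phase+Amp scaling factor, from the text
EARLY_LATE_CHANGE = 0.6                # 60% modification, from the text
# ASSUMPTION: placeholder limits for the early and late periods of the
# active phase; the text does not give these values.
EARLY_END, LATE_START = 0.33, 0.64

def relative_probability(model, phase, amp_ratio=1.0, cycle_number=0):
    """Unnormalised storm probability; amp_ratio is A/<SSN> for the cycle."""
    if model == "random":
        return 1.0
    active = ACTIVE_START <= phase < ACTIVE_END
    p = ACTIVE_RATIO if active else 1.0
    if model == "phase" or not active:
        return p
    p *= AMP_SCALE * amp_ratio          # amplitude scaling applies in the active phase only
    if model == "phase+amp":
        return p
    if model != "earlylate":
        raise ValueError(f"unknown model: {model}")
    sign = 1.0 if cycle_number % 2 == 0 else -1.0  # even: early boost; odd: late boost
    if phase < EARLY_END:
        p *= 1.0 + sign * EARLY_LATE_CHANGE
    elif phase >= LATE_START:
        p *= 1.0 - sign * EARLY_LATE_CHANGE
    return p

print(relative_probability("phase", 0.5))                    # 9.0
print(relative_probability("phase+amp", 0.5, amp_ratio=2.0)) # 27.0
```

Dividing each value by the sum over the whole time series then gives the normalised probability shown in the top panel of Figure 3.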
In order to generate a time series of storm occurrence consistent with any probability time series, we first construct the cumulative probability time series, shown in the bottom panel of Figure 3, by summing the relative probabilities up to a given time. Individual storm times are then created using a random-number generator between zero and one, and selecting the time closest to that cumulative probability. To generate a time series of storms at the, e.g., 99.9th percentile, this process would be performed 55 times, checking that no two storms have the same time (if so, one of the storms is "redrawn" from the probability time series).
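The sampling procedure described above is a form of inverse-transform sampling with rejection of duplicate times. A minimal sketch follows, assuming a daily time step. The function name is hypothetical, and the code selects the first day whose cumulative probability exceeds the random number, which is a small variation on choosing the closest time.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_storm_days(rel_prob, n_storms, rng):
    """Draw n_storms distinct day indices with probability proportional to rel_prob."""
    if n_storms > sum(p > 0 for p in rel_prob):
        raise ValueError("more storms requested than days with non-zero probability")
    cumulative = list(accumulate(rel_prob))
    total = cumulative[-1]
    cdf = [c / total for c in cumulative]   # cumulative probability, ending at exactly 1
    chosen = set()
    while len(chosen) < n_storms:
        u = rng.random()                    # uniform in [0, 1)
        day = min(bisect_right(cdf, u), len(cdf) - 1)
        chosen.add(day)                     # a repeated day leaves the set unchanged, i.e. a redraw
    return sorted(chosen)

rng = random.Random(1)
# Toy probability series: two 4000-day cycles with a ninefold active-phase enhancement
toy_prob = [9.0 if 0.18 <= (d % 4000) / 4000 < 0.79 else 1.0 for d in range(8000)]
storm_days = sample_storm_days(toy_prob, 55, rng)
print(len(storm_days), storm_days[:5])
```

Using a set for the selected days implements the "redraw" step directly: the loop simply continues until the required number of distinct days has been collected.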
Over the following sections, these models are used to address three questions relating to storm occurrence: i) Are extreme events ordered by the solar cycle?, ii) Do bigger cycles produce more extreme events?, and iii) Do extreme events behave differently in odd and even cycles?

Are Extreme Events Ordered by the Solar Cycle?
The first aspect to be tested is the existence of apparent active and quiet phases of the solar cycle. The black dots in Figure 4 show the occurrence probability of storms defined by increasing aaH-thresholds during the quiet phase (top), the active phase (middle), and the difference between the two (bottom). The model to be tested is the Phase model, with the Random model providing the null hypothesis: that there is no underlying difference in the occurrence of storms in these two periods and the observed difference occurred purely by chance.
The primary aim here is to test for the existence (or otherwise) of a solar-cycle variation, rather than quantify the amplitude of this variation. However, in the Phase model we set the active-phase probability to be a factor nine higher than the quiet phase. This is the value that matches the observed storm occurrence at a threshold of 135 nT (which gives 100 events, deemed to be sufficient to be statistically meaningful, but still capture behaviour of more extreme storms). We then test whether this same active/quiet ratio is able to describe the behaviour of the most extreme storms, accounting for the small sample size.
Time series of storm occurrence are produced for different aaH-thresholds (and hence number of storms) using the Random and Phase models, and from these we compute the storm-occurrence probability during active and quiet phases, in the exact same way as is done for observations. As a random-number generator is used to produce the model time series, we can repeat this process multiple times to produce multiple realisations of time series consistent with the underlying probability sequence. This allows us to quantify the range of plausible properties (this is often referred to as a "Monte Carlo" approach). The solid lines in Figure 4 show the median of 5000 iterations, while the coloured bands span one- and two-σ ranges. It is clear that the Random model overestimates storm occurrence during the quiet phase and underestimates it during the active phase. Looking at the difference between active and quiet storm occurrence, the Random model can be dismissed at the two-σ level for all storm thresholds. Thus the null hypothesis, that storm occurrence is random through the solar cycle, can be rejected at the two-σ level.
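The Monte Carlo comparison can be sketched on a toy timeline as follows. The cycle length, number of cycles, storm count, iteration count, and all function names are illustrative assumptions rather than the values or code used in the study, and the one- and two-σ ranges are approximated by the corresponding percentiles of the sorted realisations.

```python
import random

def draw_days(weights, n, rng):
    """Draw n distinct day indices with probability proportional to weights."""
    days = set()
    population = range(len(weights))
    while len(days) < n:
        # Top up with fresh draws; the set discards any repeated days
        days.update(rng.choices(population, weights=weights, k=n - len(days)))
    return days

def active_minus_quiet(days, is_active):
    """Storm-occurrence probability in the active phase minus that in the quiet phase."""
    n_active = sum(is_active)
    n_quiet = len(is_active) - n_active
    hits = sum(is_active[d] for d in days)
    return hits / n_active - (len(days) - hits) / n_quiet

def monte_carlo(weights, is_active, n_storms, n_iter, rng):
    """Median and approximate one- and two-sigma ranges over n_iter realisations."""
    diffs = sorted(active_minus_quiet(draw_days(weights, n_storms, rng), is_active)
                   for _ in range(n_iter))
    def quantile(q):
        return diffs[min(n_iter - 1, int(q * n_iter))]
    return {"median": quantile(0.5),
            "one_sigma": (quantile(0.1587), quantile(0.8413)),
            "two_sigma": (quantile(0.0228), quantile(0.9772))}

CYCLE_DAYS, N_CYCLES = 4000, 3  # toy timeline, not the real cycle lengths
is_active = [0.18 <= (d % CYCLE_DAYS) / CYCLE_DAYS < 0.79
             for d in range(CYCLE_DAYS * N_CYCLES)]
random_weights = [1.0] * len(is_active)
phase_weights = [9.0 if a else 1.0 for a in is_active]

rng = random.Random(42)
random_result = monte_carlo(random_weights, is_active, 12, 1000, rng)
phase_result = monte_carlo(phase_weights, is_active, 12, 1000, rng)
print("Random model:", random_result)
print("Phase model: ", phase_result)
```

An observed active-minus-quiet difference lying outside the Random model's two-σ range would then be grounds for rejecting the null hypothesis at that level.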
The Phase model shows good agreement with the observations above the 99th percentile. Below the 99th percentile, the observations deviate from the Phase model towards the Random model. This could be explained by there being an additional contribution to storms at low thresholds that occurs more randomly throughout the solar cycle than the more extreme storms. This is consistent with both CMEs and stream-interaction regions (SIRs) driving storms at lower thresholds (e.g. at lower thresholds, storms show a greater degree of 27-day recurrence: Haines et al., 2019), whereas larger storms are entirely CME driven (e.g. Richardson, Cane, and Cliver, 2002).

Do Bigger Cycles Produce More Extreme Events?
The second aspect to be tested is the relation between the magnitude of a solar cycle and occurrence of storms within that cycle. Figure 5 shows scatter plots of the average storm-occurrence probability per cycle and the cycle magnitude (mean sunspot number), for increasing storm thresholds. Each scatter plot shows the linear correlation coefficient [r] and the probability [p] that the null hypothesis of zero correlation can be rejected. For all storm thresholds shown, the null hypothesis can be rejected at the 95% (two-σ) confidence level. However, at the 99.99th percentile threshold of aaH, there are only six storms and thus a preponderance of cycles with zero storms. Hence the chance occurrence of a single storm could significantly change the correlation. This will be investigated more quantitatively using the probability models.
Figure 6 shows r for a greater range of aaH-thresholds. Peak correlation occurs around the 99th percentile, suggesting that there is an additional contribution to lower-threshold storms that is not as strongly ordered by the cycle amplitude. Above the 99th percentile, correlation generally decreases with storm threshold. But it is not clear from the observations alone whether this is simply the result of poorer statistics from fewer events, or whether it indicates that the occurrence of more extreme storms is genuinely less influenced by the magnitude of the solar cycle. To test this, we use the Phase+Amp model, with the Random model as the null hypothesis. (Note that the Phase model produces identical results to the Random model in this instance.) As in Section 5, the primary aim is to determine whether there exists a correlation between cycle amplitude and storm occurrence for the most extreme storms, rather than quantify the relation. We again use a threshold of 135 nT in daily mean aaH to find the top 100 events and find that the best linear fit results in a scaling of 1.5 of storm-occurrence probability with normalised cycle amplitude (i.e. A/<SSN>, where A is the cycle mean sunspot number and <SSN> is the mean sunspot number over 1868-2018). This is used to define the Phase+Amp model. We test whether this same relation also applies to more extreme events.
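One way to probe whether a declining correlation can arise purely from fewer events is to simulate storms whose probability is exactly proportional to cycle amplitude and then examine how the recovered correlation coefficient varies with the number of storms. The sketch below is a simplified illustration, not the Phase+Amp model itself: the 14 amplitudes are hypothetical, the cycles are assumed to have equal length, and several storms may fall in the same cycle.

```python
import random

def pearson_r(x, y):
    """Linear (Pearson) correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def median_r(amplitudes, n_storms, n_iter, rng):
    """Median correlation between cycle amplitude and simulated storm counts per cycle."""
    cycles = range(len(amplitudes))
    rs = []
    for _ in range(n_iter):
        counts = [0] * len(amplitudes)
        # Each storm falls in a cycle with probability proportional to its amplitude
        for c in rng.choices(cycles, weights=amplitudes, k=n_storms):
            counts[c] += 1
        rs.append(pearson_r(amplitudes, counts))
    rs.sort()
    return rs[len(rs) // 2]

# Hypothetical cycle-mean sunspot numbers for 14 cycles (illustration only)
amplitudes = [40, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130]
rng = random.Random(7)
for n_storms in (548, 55, 6):
    print(n_storms, round(median_r(amplitudes, n_storms, 500, rng), 2))
```

In this toy setting the underlying relation is identical for every storm count, so any fall in the median r as the number of storms decreases is attributable to sampling alone.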
For storms below the 99.9th percentile of aaH, the null hypothesis can be rejected, while the Phase+Amp model (which gives a linear correlation between storm occurrence and cycle amplitude) agrees well with the data. For storms above the 99.9th percentile, the null hypothesis can only be rejected at the one-σ level (except for the six events at the 99.99th percentile, where it can be rejected at two-σ), although the Phase+Amp model still provides the better match to the data. We can also conclude that the decline in correlation with storm threshold is entirely consistent with fewer events; no change in the underlying relation between storm occurrence and cycle magnitude is necessary to explain the lower correlation at higher storm threshold.

Do Extreme Events Behave Differently in Odd and Even Cycles?
Figure 7 shows a similar analysis to Figure 4, but it compares storm occurrence during the early and late active phases, rather than the active and quiet phases. Data have also been split into even- and odd-numbered cycles. Given that the early and late active phases only comprise approximately a quarter of the data set, and separating the data into odd and even further halves those data, statistics are expected to be poorer for this analysis than previous sections. For low storm thresholds, there is little difference in storm-occurrence probability between the early and late active phases. The difference becomes apparent from the 99.9th percentile upwards in odd cycles, and from the 99th percentile upwards for even cycles. Thus we use the observed storms for the 99.9th percentile (55 events) to set the EarlyLate model probability to be modified by the observed value of 60%.
Here we use the Phase+Amp model as the null hypothesis, as that has no difference in storm occurrence between the early and late active phases. The EarlyLate model shows the observed trends in storm occurrence. Looking at the difference between early and late active phases, the null hypothesis can be independently rejected for both odd and even cycles.

Solar Cycle 25
The three trends reported in the three previous sections can be combined to produce a forecast of the probability of storm occurrence, if the magnitude and length of the coming solar cycle are assumed. A number of different predictions for the magnitude of Solar Cycle 25 have been made (Nandy, 2021). Rather than highlight any one particular forecast, we consider plausible scenarios based on recent solar cycles. Figure 8 shows the storm-occurrence probability assuming three Solar Cycle 25 scenarios: Thick red lines show a "small" cycle (solar-cycle magnitude and length equal to Solar Cycle 12), black dashed lines show a "moderate" cycle (equal to Solar Cycle 23), and blue lines show a "large" solar cycle (equal to Solar Cycle 19). For storms exceeding the 99th percentile of daily aaH, we only need consider the solar-cycle phase and solar-cycle amplitude rules. The solar-cycle amplitude rule produces a factor of three difference in peak storm-occurrence probability between the largest and smallest cycles considered.
For more extreme storms, such as those exceeding the 99.99th percentile of daily aaH, the odd/even rule also needs to be considered. As the coming cycle is odd numbered, all three Solar Cycle 25 scenarios give peak activity late in the active phase, which is expected to begin in early 2026.

Summary
This study has used the aaH-index of global geomagnetic activity to quantify the relation between occurrence of extreme geomagnetic storms and the solar cycle. Previous studies have shown that moderate geomagnetic storms both follow the solar cycle and are more prevalent in larger solar cycles, but they have concluded that these relations do not apply to the more extreme events. Here, we have constructed models of storm-occurrence probability to show that the relations appear to break down simply because of poor statistical sampling, and that the probability of extreme-storm occurrence follows both the approximately 11-year solar cycle and variations in solar-cycle amplitude. We further report on apparent differences in the timing of storms during odd- and even-numbered solar cycles. These trends appear to hold up to the 99.99th percentile of storm intensity, the highest threshold that it is reasonable to assess statistically with the 150-year aaH-data set, and which corresponds to approximately the 1-in-25-years event magnitude.
At lower storm thresholds, there is clear evidence for a solar-cycle variation in the probability of storm occurrence. However, this variation is bimodal, switching on around a solar-cycle phase of 0.18 and off at 0.79, rather than following the more continuous amplitude variation exhibited by sunspot number. This is in broad agreement with . The difficulty in drawing conclusions for the most extreme events is that, by definition, extreme events are rare and so the statistics are poor. However, by comparing the available observations with statistical models of storm occurrence, we find that the storm occurrence is more randomly distributed through the solar cycle for smaller storms (below the 99th percentile of aaH), while more extreme storms are constrained to solar-cycle phases between 0.18 and 0.79. Our analysis shows that this is a significant change in behaviour, rather than just an effect of reduced sample size. We suggest that the smaller storms that occur outside the active phase of the solar cycle are the result of stream-interaction regions, rather than coronal mass ejections. This is in agreement with the finding that smaller storms show a greater degree of 27-day recurrence than larger storms (Haines et al., 2019), as many SIRs corotate with the Sun (hence co-rotating interaction regions, CIRs, are a large part of the set of SIRs, but not all SIRs are co-rotating). While CMEs approximately follow the sunspot-number variation (e.g. Yashiro et al., 2004), SIRs are more prevalent in the declining phase of the solar cycle and at solar minimum (e.g. Richardson, Cane, and Cliver, 2002). Thus the inclusion of SIR-driven storms acts to reduce the net solar-cycle trend of storm occurrence at low thresholds.
Despite the declining number of events, the null hypothesis (that storms occur randomly throughout the solar cycle) can be rejected at the 95% confidence level right up to the most extreme events considered (days exceeding the 99.99th percentile of aa_H, which gives six events in 150 years). The available observations are consistent with event-occurrence probability in active phases being around nine times higher than in quiet phases, across all event thresholds above the 99th percentile.
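As a rough illustration of what a ninefold contrast implies, the Python sketch below converts it into the expected share of events that fall in the active phase. It assumes, beyond what the text states, that the occurrence rate is constant within each phase and that the active phase spans solar-cycle phases 0.18 to 0.79 in every cycle; the function name `active_share` is hypothetical.

```python
ACTIVE_START, ACTIVE_END = 0.18, 0.79  # active-phase limits quoted in the text


def active_share(rate_ratio, start=ACTIVE_START, end=ACTIVE_END):
    """Expected fraction of events that fall in the active phase.

    Assumes a constant occurrence rate within each phase, with the
    active-phase rate `rate_ratio` times the quiet-phase rate
    (an illustrative assumption, not the models used in the study).
    """
    active_fraction = end - start            # share of the cycle that is active
    quiet_fraction = 1.0 - active_fraction   # remaining (quiet) share
    weighted_active = rate_ratio * active_fraction
    return weighted_active / (weighted_active + quiet_fraction)


share_random = active_share(1.0)    # random occurrence: equals the active fraction
share_ninefold = active_share(9.0)  # ninefold contrast between phases
print(f"random: {share_random:.2f}, ninefold: {share_ninefold:.2f}")
```

Under these assumptions, random occurrence would place about 61% of events in the active phase, whereas a ninefold rate contrast would place about 93% there.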
Next we examined the relation between solar-cycle amplitude (in terms of mean sunspot number across a cycle) and storm-occurrence probability. As there are only 14 complete cycles in the period of study, statistics are relatively poor. Furthermore, as storm intensity increases, an increasing number of cycles will contain zero storms. Nevertheless, by comparing with statistical models, a number of conclusions can be drawn. Firstly, there is a near-perfect correlation between cycle amplitude and storm occurrence for storms at the 99th percentile of aa_H (548 storms). For weaker storms, the correlation likely drops because of the inclusion of SIR-driven storms, which are not expected to increase in occurrence during large solar cycles. For stronger storms the correlation coefficient also falls, and for storms above the 99.9th percentile the null hypothesis (that there is no correlation between solar-cycle amplitude and storm occurrence) cannot generally be rejected at the 95% confidence level. This was also the conclusion of Kilpua et al. (2015). However, this decrease in observed correlation is exactly as expected from the reduced sample size with a perfect underlying correlation between cycle amplitude and storm occurrence. Thus the simplest explanation of the available data is that cycle amplitude and storm occurrence are correlated for all storm intensities. This is in agreement with the correlation for the most extreme storms, at the 99.99th percentile of aa_H. Here, the linear correlation coefficient is 0.63, which for N = 14 differs from zero at above the 95% confidence level, so the null hypothesis of zero correlation can be rejected. However, note that there are only six storms at this level of intensity, and nine of the cycles contain zero storms.
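The sample-size argument can be illustrated with a minimal Monte Carlo sketch in Python. It is not the statistical model used in this study: it assumes Poisson-distributed storm counts whose expected value is exactly proportional to cycle amplitude, and the 14 amplitudes are hypothetical values chosen only for illustration. The final lines check the quoted coefficient of 0.63 with the textbook Student's t approximation, which is itself an assumption for sparse count data.

```python
import math
import random


def pearson(x, y):
    """Linear (Pearson) correlation coefficient; None if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def mean_observed_r(amplitudes, total_events, trials=1000, seed=1):
    """Mean observed correlation when expected counts are exactly proportional to amplitude."""
    rng = random.Random(seed)
    scale = total_events / sum(amplitudes)
    values = []
    for _ in range(trials):
        counts = [poisson(rng, scale * a) for a in amplitudes]
        r = pearson(amplitudes, counts)
        if r is not None:  # skip trials in which every cycle has the same count
            values.append(r)
    return sum(values) / len(values)


# Hypothetical amplitudes for 14 cycles (illustrative, not the observed record)
amplitudes = [40, 45, 55, 60, 65, 70, 75, 80, 85, 90, 95, 105, 115, 130]
r_many = mean_observed_r(amplitudes, total_events=548)  # 99th-percentile sample size
r_few = mean_observed_r(amplitudes, total_events=6)     # 99.99th-percentile sample size
print(f"mean r with 548 events: {r_many:.2f}; with 6 events: {r_few:.2f}")


def t_statistic(r, n):
    """Student's t statistic for a correlation coefficient r from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


T_CRIT_95_DF12 = 2.179  # two-sided 95% critical value for 12 degrees of freedom
t_obs = t_statistic(0.63, 14)  # about 2.81
print(f"t = {t_obs:.2f}, exceeds critical value: {t_obs > T_CRIT_95_DF12}")
```

In this sketch the underlying relation is perfect by construction, so any fall in the mean observed correlation as the event count drops is purely a sampling effect.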
The final property that we consider is the apparent difference in the occurrence of extreme storms in odd- and even-numbered solar cycles. In even-numbered cycles, large storms are generally confined to the early half of the active phase, whereas in odd-numbered cycles, they are generally in the later half of the active phase. Indeed, for the 99.99th percentile storms, the three events during even cycles are all in the early half of the active phase, while the three events during odd cycles are all in the later half of the active phase. Assuming an equal probability of storms occurring in early or late active phase, the probability of the observed difference occurring by chance is p = 0.5^6 ≈ 0.016. Using the statistical models to look at the relation in more detail draws similar conclusions for the largest events (> 99.9th percentile of aa_H).

Figure 9 Schematic of CME and large-scale solar magnetic-field polarities over the Hale cycle. Positive (outward) polarities are blue, negative (inward) polarities are red. Top: At all times, magnetic flux emerges with left-handed helicity in the northern hemisphere, and right-handed flux in the southern hemisphere. Middle: According to Hale's law, throughout odd cycles (left) leading sunspots have negative polarity in the northern hemisphere and positive in the southern hemisphere. Thus northern/southern hemisphere sunspot loops and associated CMEs have westward/eastward axial fields, shown by the approximately horizontal arrows. This is reversed for even cycles (right, grey shading). Bottom: Odd cycles start with negative polarity in the northern polar cap and positive in the southern polar cap. This reverses around solar maximum, approximately midway through the solar cycle. The opposite trends are present in even cycles. Extreme-event probability is found to be enhanced in the late phase of odd cycles and the early phase of even cycles (red box), suggesting the large-scale polarity plays a role.
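The chance probability quoted for the six 99.99th-percentile storms can be checked by brute-force enumeration. The Python sketch below assumes, as the text does, that each event independently falls in the early or late half of the active phase with equal probability. The second quantity, which also counts the mirror-image pattern (even cycles late, odd cycles early), is an added illustration and is not a figure reported in the text.

```python
from itertools import product

N_EVEN, N_ODD = 3, 3  # 99.99th-percentile events in even and odd cycles (from the text)

# Every equally likely assignment of the six events to the early or late half
outcomes = list(product(("early", "late"), repeat=N_EVEN + N_ODD))


def matches_observed(outcome):
    """True if all even-cycle events are early and all odd-cycle events are late."""
    even, odd = outcome[:N_EVEN], outcome[N_EVEN:]
    return all(h == "early" for h in even) and all(h == "late" for h in odd)


def fully_split(outcome):
    """True for the observed pattern or its mirror image (even late, odd early)."""
    even, odd = outcome[:N_EVEN], outcome[N_EVEN:]
    return len(set(even)) == 1 and len(set(odd)) == 1 and even[0] != odd[0]


p_observed = sum(map(matches_observed, outcomes)) / len(outcomes)
p_either = sum(map(fully_split, outcomes)) / len(outcomes)
print(p_observed, p_either)  # 0.015625 0.03125
```

Only one of the 64 equally likely outcomes reproduces the observed pattern, which gives the value of about 0.016. Allowing for the mirror-image pattern doubles this to about 0.031, which is still below 0.05.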
Putting these empirical results together allows the probability of extreme geomagnetic storms to be quantified if the timing and magnitude of the coming solar cycle are known. Within plausible ranges (Pesnell, 2020; Nandy, 2021), the amplitude of the solar cycle can change the occurrence probability of, for example, a 1-in-100-years event by about a factor of three. For a large cycle, like Solar Cycle 19, the integrated probability of at least one 99.99th percentile storm (approximately a 1-in-25-years event) over the next 11 years is about 54%, whereas it drops to approximately 24% during a small cycle, like Solar Cycle 12. The timing within the cycle has an even bigger effect, with the late active phase being up to an order of magnitude more likely to produce an extreme geomagnetic storm than the quiet phase. This stresses the importance of solar-cycle prediction for long-term planning and scheduling of systems affected by extreme space weather.
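To put the integrated probabilities in context, the Python sketch below relates them to expected event counts. It assumes storms occur as a Poisson process, which is an illustrative simplification rather than the model used in this study, and the function names are hypothetical.

```python
import math


def prob_at_least_one(rate_per_year, years):
    """P(at least one event) for a Poisson process with a constant rate."""
    return 1.0 - math.exp(-rate_per_year * years)


def expected_events(probability):
    """Expected event count implied by P(at least one event), assuming Poisson occurrence."""
    return -math.log(1.0 - probability)


# Baseline with no solar-cycle dependence: a 1-in-25-years event over 11 years
baseline = prob_at_least_one(1.0 / 25.0, 11.0)

n_large = expected_events(0.54)  # large cycle (54% quoted in the text)
n_small = expected_events(0.24)  # small cycle (24% quoted in the text)
ratio = n_large / n_small

print(f"baseline probability: {baseline:.2f}")
print(f"expected events, large/small cycle: {n_large:.2f}/{n_small:.2f} (ratio {ratio:.1f})")
```

Under this assumption the cycle-independent baseline is about 36%, which lies between the small-cycle and large-cycle values. The implied ratio of expected event counts is a little under three, comparable to the factor quoted for the 1-in-100-years event.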

Discussion
Both the cycle-phase and cycle-amplitude trends in storm occurrence are consistent with the source of more extreme space weather being stronger and more complex solar magnetic fields (e.g. Lefevre et al., 2016): sunspot number and area both rise and fall within the solar cycle, peaking around solar maximum, and are larger in large cycles (e.g. van Driel-Gesztelyi and Owens, 2020). Sunspots are also known to be a reasonable proxy for active region and CME occurrence (e.g. Owens et al., 2008). During the active phase, the increased rate of activity is likely to further amplify the intensity of activity, through reduced waiting time between events and hence increased magnetospheric preconditioning. However, the fact that storm occurrence is bimodal with solar-cycle phase, rather than following the quasi-sinusoidal sunspot-number variation, suggests that the relation is highly non-linear and/or there are additional factors influencing extreme-storm occurrence.
The switch in extreme-storm occurrence between the early and late active phases in odd and even cycles supports the idea of additional controlling factors. Figure 9 shows the trends expected in large-scale coronal fields and the internal structure of coronal mass ejections. As extreme geomagnetic storms require strong, persistent out-of-ecliptic (southward) magnetic field, perhaps the most obvious candidate for influencing extreme geomagnetic-storm occurrence is a trend in the internal magnetic-field structure of CMEs. Bothmer and Rust (1997) showed that, at least in a statistical sense, flux ropes associated with CMEs follow Hale's law of sunspot polarities. This means that CME flux-rope polarities are opposed in odd and even cycles, and change in phase with the solar cycle (Lynch et al., 2005). However, there is a lack of obvious difference in the geoeffectiveness of CMEs of differing polarities (Fenrich and Luhmann, 1998; Kilpua et al., 2012). Furthermore, we do not find a difference in storm occurrence averaged over odd and even cycles, only in the timing of storms within odd and even cycles. Thus, while internal fields of CMEs are likely critical to extreme geomagnetic storms, the Hale-cycle trends reported by Bothmer and Rust (1997) do not obviously explain our findings.
Approximately annual bursts of solar activity have been interpreted in terms of the interaction of latitudinal bands of solar magnetism (McIntosh et al., 2015). This phenomenon also does not appear to be related to the odd/even cycle trends: it is observed in much more moderate activity and is related to shorter time-scale variability than the early/late active phases reported here. Additionally, no difference in the behaviour of activity bands in odd/even cycles has been reported.
Instead, we consider the polarity cycles of the Sun, which run from solar maximum to solar maximum. Enhanced storm occurrence is present when there is dominant positive-polarity flux in the northern hemisphere and negative-polarity flux in the southern. In the galactic cosmic-ray community, this configuration is referred to as a qA > 0 cycle, with the reverse being qA < 0. During qA > 0, the large-scale coronal magnetic field is, on average, southward near the heliographic Equator. We suggest that fast CMEs drag out this overlying field, which further adds to their existing geoeffectiveness. While this is not expected to be a large effect, it may be just sufficient to tip an already severe event (or series of events) over into the more extreme category, providing "perfect storm" conditions. It is unclear whether the overlying fields would manifest as additional flux-rope field or form part of the CME sheath field, which can often be geoeffective in its own right (Tsurutani et al., 1988; Owens et al., 2005). We further note that enhanced heliospheric magnetic-field strengths at Earth orbit have been reported in qA > 0 cycles compared with qA < 0 cycles (Thomas, Owens, and Lockwood, 2013). This could also lead to increased geoeffectiveness of sheath fields.
Finally, we note that two of the best-known extreme space-weather events are not present in the aa_H data set: the September 1859 "Carrington event" and the July 2012 solar storm that passed STEREO A (Baker et al., 2013). Both of these events occurred in even-numbered cycles (Solar Cycles 10 and 24, respectively) and were roughly in the centre of the early active phase (with solar-cycle phases of 0.30 and 0.29, respectively). Thus, the trends established using the aa_H data set hold for the most extreme events, independent of the database investigated.
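The consistency of these two events with the odd/even timing rule can be checked with a short Python sketch. It uses the active-phase limits of 0.18 and 0.79 reported earlier, and assumes that the early and late halves are divided at the midpoint of that interval (the exact dividing phase is not stated in this section). The function names are hypothetical.

```python
ACTIVE_START, ACTIVE_END = 0.18, 0.79           # active-phase limits from the text
ACTIVE_MID = 0.5 * (ACTIVE_START + ACTIVE_END)  # assumed early/late dividing phase


def phase_category(phase):
    """Classify a solar-cycle phase (0 to 1) as quiet, early active, or late active."""
    if not 0.0 <= phase <= 1.0:
        raise ValueError("solar-cycle phase must lie between 0 and 1")
    if phase < ACTIVE_START or phase > ACTIVE_END:
        return "quiet"
    return "early active" if phase < ACTIVE_MID else "late active"


def favoured_half(cycle_number):
    """Half of the active phase in which extreme storms cluster for this cycle parity."""
    return "early active" if cycle_number % 2 == 0 else "late active"


# (solar cycle number, solar-cycle phase) as quoted in the text
events = {
    "Carrington event (1859)": (10, 0.30),
    "July 2012 storm": (24, 0.29),
}

for name, (cycle, phase) in events.items():
    category = phase_category(phase)
    print(f"{name}: {category}, follows rule: {category == favoured_half(cycle)}")
```

Both events fall in the early active phase of an even-numbered cycle, which is the half favoured by the rule.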