Reconstruction and Prediction of Variations in the Open Solar Magnetic Flux and Interplanetary Conditions

Historic geomagnetic activity observations have been used to reveal centennial variations in the open solar flux and the near-Earth heliospheric conditions (the interplanetary magnetic field and the solar wind speed). The various methods are in very good agreement for the past 135 years when there were sufficient reliable magnetic observatories in operation to eliminate problems due to site-specific errors and calibration drifts. This review underlines the physical principles that allow these reconstructions to be made, as well as the details of the various algorithms employed and the results obtained. Discussion is included of: the importance of the averaging timescale; the key differences between “range” and “interdiurnal variability” geomagnetic data; the need to distinguish source field sector structure from heliospherically-imposed field structure; the importance of ensuring that regressions used are statistically robust; and uncertainty analysis. The reconstructions are exceedingly useful as they provide calibration between the in-situ spacecraft measurements from the past five decades and the millennial records of heliospheric behaviour deduced from measured abundances of cosmogenic radionuclides found in terrestrial reservoirs. Continuity of open solar flux, using sunspot number to quantify the emergence rate, is the basis of a number of models that have been very successful in reproducing the variation derived from geomagnetic activity. These models allow us to extend the reconstructions back to before the development of the magnetometer and to cover the Maunder minimum. Allied to the radionuclide data, the models are revealing much about how the Sun and heliosphere behaved outside of grand solar maxima and are providing a means of predicting how solar activity is likely to evolve now that the recent grand maximum (that had prevailed throughout the space age) has come to an end.

geomagnetic activity observations has been detailed in three excellent reviews by Cliver (1994a,b, 1995). The number of available magnetic observatories subsequently grew gradually over the next century, helped by international campaigns such as the Polar Year (1882-1883) and the Second International Polar Year (1932-1933), so that by 1955 about 100 stations worldwide were supplying regular routine observations. This number rose rapidly because of the International Geophysical Year, IGY (1957-1958), reaching of order 170 by 1960 (Jankowski and Sucksdorff, 1996). Figure 1 shows the global distribution of stations known to be operating in 1996.

The space age
Modern understanding of geomagnetic activity relies heavily on in-situ spacecraft observations of the solar wind, shortly before it impacts on the Earth. Such measurements were first made routinely in 1963 but the monitoring was not close to continuous until about 1966. After a few years the length and number of data gaps in this vital space science resource began to increase, driven by factors such as telemetry limitations and a shortage of available tracking stations. In this respect, 1995 is a significant date in that (almost completely) continuous solar wind monitoring began with the WIND spacecraft and has continued with the ACE spacecraft to the present day. Covering almost two solar cycles, these continuous data constitute the most valuable resource we have for understanding how solar wind properties, including its bulk flow speed, $V_{SW}$, and the interplanetary magnetic field (IMF) embedded within it, $B$, drive geomagnetic activity. The near-Earth interplanetary data have been collected by NASA's Goddard Space Flight Centre (the Space Physics Data Facility) into the OMNI and OMNI2 datasets (Couzens and King, 1986; King and Papitashvili, 2005).
Living Reviews in Solar Physics http://www.livingreviews.org/lrsp-2013-4

2 Geomagnetic Indices
A large number of indices have been developed and deployed to quantify the geomagnetic activity detected by the global network of magnetic observatories. These indices vary in which observatories are used, which data from those observatories are used, how those data are processed, and how the data from different observatories are combined together. As a result, the different indices monitor different parts of the system of coupled currents that flow in near-Earth space in response to the flow of the magnetised solar wind plasma around the magnetosphere. Figure 2 is a schematic showing the currents that flow in the magnetosphere-ionosphere system and how they are connected.
The magnetic field at a point is the summed effect of all moving charged particles in the cosmos on that point. Because the Biot-Savart law contains an inverse-square dependence on the distance between the moving charges and the point in question, the effects of closer currents tend to dominate over more distant ones but all contribute. As a result, although the deflections seen by a ground-based magnetometer usually reflect changes in the closer large-scale currents in the magnetosphere-ionosphere system, there will also always be some effects of other currents flowing elsewhere. The following subsections briefly outline indices that will be employed in this review. In some relatively clear-cut cases, such as Dst, AU, and AL, there is discussion of the currents in near-Earth space which contribute most to the detected variations in the index. However, for other cases the combination of currents that the index is monitoring is not so straightforward, as will be discussed in Section 6. Section 2.1 lists standard indices in widespread use whereas Section 2.2 discusses some research indices, designed for reconstruction work using historic datasets. Section 2.3 presents an initial study of how these various indices vary with parameters describing near-Earth interplanetary space.

The Dst index
The Dst (Disturbed Storm Time) index is constructed using hourly means of the horizontal component measured at four equatorial magnetometer stations: Honolulu, San Juan, Hermanus, and Kakioka. The index was first constructed for the International Geophysical Year and is available for 1957 onwards. The derivation and station selection are described by Sugiura and Kamei (1991). Because of the low latitudes of the stations, the index chiefly monitors the disturbances produced by changes in the ring current, which flows westward around the magnetosphere at geocentric distances of about 3-6 $R_E$ (where 1 $R_E$ is a mean Earth radius), as shown in part (b) of Figure 2. Negative perturbations in Dst correspond to storm time enhancements in the ring current. However, there are also small contributions from the cross-tail sheet current in the magnetotail and some contamination from auroral ionospheric currents. In addition, positive variations in Dst are caused by the compression of the magnetosphere due to solar wind dynamic pressure increases, showing that it also responds to changes in the magnetopause (Chapman-Ferraro) currents.

The AU and AL indices
The Auroral Electrojet indices (AE, AU, AL, and AO) were first introduced by Davis and Sugiura (1966) to measure the auroral electrojet currents that flow in the high-latitude ionosphere. In order to achieve this, the stations contributing to the index lie within the band of the auroral oval. A ring of 12 longitudinally-spaced magnetometers ensures that one station is always close to the peak of the westward auroral electrojet whilst another station is always close to the peak of the eastward electrojet (Tomita et al., 2011). The stations are all in the northern hemisphere and a corresponding southern hemisphere ring is precluded by the southern oceans which do not allow sufficiently even and full longitudinal coverage of the southern auroral oval.

Figure 2: Simplified schematic of the currents flowing in the magnetosphere-ionosphere system in the northern hemisphere (southern hemisphere currents are omitted for clarity). Part (a) shows (in orange) segments of the Chapman-Ferraro currents that flow in the magnetopause and separate the geomagnetic and (shocked) interplanetary fields: the relevant segments are at the sunward edge of the magnetospheric tail and flow from dusk to dawn (see also Figure 14). These connect to the high-latitude ionosphere via the Region 1 field-aligned (Birkeland) currents (shown in blue). The Region 2 field-aligned currents are needed to maintain ionospheric current continuity and because of the incompressibility of the ionosphere (in the sense that the magnetic field there is essentially constant). As shown in red in (b), these Region 2 currents close via the ring current that flows westward around the Earth in the inner magnetosphere (in cyan), caused by the gradient and curvature drifts of trapped energetic particles. Part (c) shows the Region 1 and 2 currents entering and leaving the polar E-region ionosphere and how they connect to the Pedersen currents there (in green), which flow in the direction of the electric field. The paired up and down field-aligned currents transfer solar wind energy, momentum, and electric field down into the ionosphere as well as current (see review by Lockwood, 1997). The Hall currents (shown by black lines) flow perpendicular to the electric field (and so cause no energy dissipation) and are antiparallel to associated ionospheric flow (convection) in the overlying F-region ionosphere. For a uniform spatial distribution of conductivities, the effects of field-aligned and Pedersen currents cancel beneath the ionosphere and only the Hall currents are detected by high-latitude magnetometers on the ground. The formation of the westward electrojet in the substorm current wedge (shown here in mauve) is described later by Figure 14. In this electrojet, a highly conducting channel is formed by ionisation generated by the associated particle precipitation and the Cowling conductivity is relevant.

The exact number of stations has varied somewhat over time, data prior to 1964 coming from a somewhat different distribution of stations including contributions from the southern hemisphere. The indices can be generated using a great many stations, but the standard indices employ 12 and are referred to as AE(12). They are recorded at high time resolution (usually 2.5 minutes) and quiet time diurnal variations are first removed from the H component. The maximum and minimum values of the background-subtracted H at any one time seen by the ring stations are the AU and AL values, respectively. (Also often quoted are AE, the difference between AU and AL, and AO, which is their mean, but neither is used in this review). The auroral currents causing geomagnetic activity are divided into the "DP1" and "DP2" systems (e.g., Clauer and Kamide, 1985). Studies of the station contributing the maximum deflection (e.g., Tomita et al., 2011) reveal that large (negative) perturbations to AL are caused by the nightside westward electrojet (the DP1 or substorm current wedge system, see Section 6) which responds to magnetic energy that is stored in the magnetotail and then explosively released into the westward auroral electrojet during events called "substorm expansion phases", whereas AU is set by the eastward part of the dayside DP2 currents that are directly driven by the solar wind (e.g., Clauer and Kamide, 1985; Consolini and De Michelis, 2005). Under quiet conditions AL reflects the westward electrojet of the DP2 system in the morning sector (DP2 currents are generally detected on the dayside where ionospheric conductivities are higher). The eastward, quiet westward and disturbed westward auroral electrojets are all labelled in part (c) of Figure 2. There is some contamination of AU and AL from the ring current. Data are available for 1957 onwards.
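The envelope construction of AU and AL described above can be sketched in a few lines. This is a minimal illustration, assuming the quiet-time variation has already been removed from each station's H record; the function name, the array layout, and the demonstration values are hypothetical and are not taken from the source.

```python
import numpy as np

def auroral_electrojet_indices(h_disturbance):
    """Return (AU, AL, AE, AO) for each time step.

    h_disturbance: array of shape (n_times, n_stations) holding the
    background-subtracted H component (nT) from the ring of stations.
    Missing station values may be given as NaN.
    """
    h = np.asarray(h_disturbance, dtype=float)
    au = np.nanmax(h, axis=1)   # largest deflection seen by any station
    al = np.nanmin(h, axis=1)   # smallest (most negative) deflection
    ae = au - al                # difference between AU and AL
    ao = 0.5 * (au + al)        # mean of AU and AL
    return au, al, ae, ao

# Hypothetical example: three time steps from a four-station ring.
demo = np.array([[10.0, -50.0, 5.0, 20.0],
                 [30.0, -200.0, np.nan, 15.0],
                 [5.0, -20.0, 0.0, 2.0]])
au, al, ae, ao = auroral_electrojet_indices(demo)
```

Taking the extremes across the ring, rather than an average, is what allows the station nearest each electrojet peak to set the index at any one time.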

The aa index
The aa index was devised and compiled by Mayaud (1971, 1972, 1980). It is a "range" index, meaning it is based on the range of variation seen during three-hour intervals, as introduced by Bartels et al. (1939). At each station contributing to the index, a semi-logarithmic index is derived by first removing the quiet-time variation and then using the larger of the differences between the maximum and minimum values of either the horizontal or vertical field in the 3-hourly intervals (the range), giving eight values per day. Data are taken from just two mid-latitude stations selected to be close to antipodal, with the northern hemisphere station in southern England and the southern in Australia. In both hemispheres, three different stations were needed to give a continuous index: in the north they are Greenwich (1868-1925), Abinger (1926-1956), and Hartland (1957-present) and in the south they are Melbourne (1868-1919), Toolangi (1920-1979), and Canberra (1980-present). The indices are generated using a site-dependent scale to normalise them to the values seen at the Niemegk station, giving $aa_N$ and $aa_S$ for the north and south hemispheres, and aa is defined as the arithmetic mean of the two.
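The range step behind aa and the other range indices can be illustrated as follows. This is only a sketch under stated assumptions: the input is one UT day of equally spaced samples with the quiet-time variation already removed, and the site-dependent semi-logarithmic scaling and the normalisation to Niemegk are deliberately omitted because the text does not specify them. The function names are hypothetical.

```python
import numpy as np

def three_hour_ranges(h_day, z_day):
    """Eight range values (nT) for one UT day.

    h_day, z_day: equally spaced samples of the horizontal and vertical
    field for one day, quiet-time variation already removed. The number
    of samples must be divisible by 8.
    """
    h = np.asarray(h_day, dtype=float).reshape(8, -1)
    z = np.asarray(z_day, dtype=float).reshape(8, -1)
    h_range = h.max(axis=1) - h.min(axis=1)
    z_range = z.max(axis=1) - z.min(axis=1)
    # Use the larger of the two component ranges in each 3-hour interval.
    return np.maximum(h_range, z_range)

def combine_hemispheres(aa_north, aa_south):
    """aa as the arithmetic mean of the normalised hemispheric values."""
    north = np.asarray(aa_north, dtype=float)
    south = np.asarray(aa_south, dtype=float)
    return 0.5 * (north + south)
```

For example, with hourly samples `h_day = np.arange(24.0)` and a single 10 nT spike in the vertical field during hour 4, the second interval takes its range from the vertical component while the other seven take theirs from the horizontal component.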

The Am, An, and As indices
The Am, An, and As indices are range indices constructed in the same way as aa but use a greater number of stations. Mid-latitude stations (around a target geomagnetic latitude of 50°), spread across geomagnetic longitudes in both the northern and southern hemispheres, are used. The exact mix of stations has varied somewhat over time, as have the longitudinal sectors into which they have been divided, but the stations typically number 16 in the north and 9 in the south. The values are averaged over the longitudinal sectors (5 in the north, 4 in the south) before being normalised and then averaged over the northern hemisphere, southern hemisphere and globally to give An, As, and Am, respectively. Data are available from 1959.

The Ap index
The Ap index is another range index, which is available for 1932 onwards. It is a 3-hourly planetary index compiled using the indices from 11-13 longitudinally-spaced mid-latitude stations in the northern hemisphere.

Specialist geomagnetic indices
The indices described in Section 2.1 are all well-established, in widespread use and formally recognised by international organisations such as IAGA (the International Association of Geomagnetism and Aeronomy). However, there are other valuable indices that have been compiled by individual researchers to meet specific purposes. These generally employ hourly mean or hourly "spot values" (samples).

The u index
The u index was developed by Bartels (1932). It was based on the absolute value of the difference between the mean values of the horizontal component H for a day and for the preceding day. Taking this difference is a simple but effective way of removing quiet time variation. The index is the weighted mean of data from a collection of stations. Prior to averaging the data from the various stations, each was normalised to the magnetic latitude (Λ) of Niemegk using an empirical $1/\cos(\Lambda)$ dependence. Bartels used data from Seddin (1905-1928), Potsdam (1891-1904), Greenwich (1872-1890), Bombay (1872-1920), Batavia (1884-1899 and 1902-1926), Honolulu (1902-1930), Puerto Rico (1902-1916), Tucson (1917-1930), and Watheroo (1919-1930). He notes stability problems with the Greenwich data in deriving interdiurnal variation data (from one day to the next) and ascribes half weighting to it as a result. (Recently, Lockwood et al., 2013a have studied all the hourly data from Greenwich and confirmed these problems). In addition, Bartels notes many data gaps in the Bombay data. The index is based on 2 stations for 1872-1891 (Greenwich and Bombay, and Bartels expresses reservations about the quality of both), rising to 6 by 1919 before falling to 3 again by 1930. Data for 1835-1872 were compiled by Bartels and are called the u index but are not the same as the u index after 1872. Bartels notes that before 1872, no proper data to generate an interdiurnal index were available to him and so other correlated measures of the diurnal variation are used as proxies. Bartels himself stresses that the values before 1872 are "more for illustration than for actual use". The u index was criticised at the time for failing to register the recurrent geomagnetic storms and, as a result, he himself developed the range indices as an alternative (Bartels et al., 1939). However, it has since been pointed out that this feature is a positive advantage of u as it means that it is not complicated by a response to solar wind speed variations. The u index data cease in 1930.
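The interdiurnal difference at the heart of the u index, and the weighted combination of stations, can be sketched as below. This is an illustration under assumptions rather than Bartels' actual procedure: it assumes that a quantity varying as 1/cos(Λ) is brought to a reference latitude by multiplying by cos(Λ)/cos(Λ_ref), and all function names, parameters, and example numbers are hypothetical.

```python
import numpy as np

def interdiurnal_variability(hourly_h, mag_lat_deg, ref_lat_deg, exponent=1.0):
    """Absolute day-to-day change in the daily mean of H, latitude-normalised.

    hourly_h: array of shape (n_days, 24) of hourly mean H (nT).
    mag_lat_deg: magnetic latitude of the station (degrees).
    ref_lat_deg: magnetic latitude of the reference station (degrees).
    exponent: power of cos(latitude) in the assumed 1/cos^exponent scaling.
    """
    daily_mean = np.nanmean(np.asarray(hourly_h, dtype=float), axis=1)
    diff = np.abs(np.diff(daily_mean))      # one value per pair of days
    # Assumption: raw values scale as 1/cos^exponent(latitude).
    scale = (np.cos(np.radians(mag_lat_deg))
             / np.cos(np.radians(ref_lat_deg))) ** exponent
    return diff * scale

def weighted_station_mean(values, weights):
    """Weighted mean over stations (axis 0), e.g. half weight for one site."""
    return np.average(np.asarray(values, dtype=float), axis=0, weights=weights)
```

Differencing successive daily means removes any quiet-time variation that repeats from one day to the next, which is why no separate quiet-day subtraction appears in the sketch.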

The IDV index
The IDV index is a variant of the u index that was devised by Svalgaard and Cliver (2005). The main difference in its derivation is that instead of using daily mean values of H, the hourly mean (or spot value) closest to solar local midnight is employed. As for u, the difference between values on successive days is taken. The latitude normalisation is also slightly different, using an empirically-derived $1/\cos^{0.7}(\Lambda)$ dependence.
IDV is found not to depend on solar wind speed, $V_{SW}$, and depends on just the IMF field strength $B$ in annual means. One of the great advantages of IDV is that its compilation is much simpler than that of the range-based indices and this has allowed the use of historic hourly mean (or spot) values to produce a meaningful index that extends back many years. Svalgaard and Cliver (2005) adopt a different philosophy in compiling IDV to that adopted by Mayaud (1971) in compiling aa. Mayaud's philosophy was to use as homogeneous a data series as possible. The philosophy of Svalgaard and Cliver (2005) (and of Lockwood et al., 2006b in the derivation of the m index, see below) was to use all available data that are of sufficient quality. Inevitably this means that fewer data are available at earlier times and the construction of IDV means that (like m) it is not homogeneous. Svalgaard and Cliver (2010) added more stations and also extended the sequence back to 1835 using a linear correlation with the Bartels u index. In this context, note Bartels' reservations about the early data discussed in Section 2.2.1.
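The single-station IDV definition described above can be written compactly. The notation below is illustrative and not the source's own, and the direction of the latitude normalisation (multiplying by the cosine ratio to bring a station to a reference latitude $\Lambda_0$) is an assumption that follows from reading the $1/\cos^{0.7}(\Lambda)$ dependence as the latitude variation of the raw differences.

```latex
% Sketch only: symbols are illustrative assumptions, not the source's notation.
% H_s^{mid}(d): hourly mean (or spot) value of H at station s closest to
%               solar local midnight on day d
% \Lambda_s:    magnetic latitude of station s
% \Lambda_0:    reference magnetic latitude (assumed)
\begin{align}
  \Delta_s(d) &= \left| H_s^{\mathrm{mid}}(d) - H_s^{\mathrm{mid}}(d-1) \right| \\
  IDV_s(d)    &= \Delta_s(d)
                 \left( \frac{\cos \Lambda_s}{\cos \Lambda_0} \right)^{0.7}
\end{align}
```

Replacing the near-midnight value by the daily mean of H, and the exponent 0.7 by 1, gives the corresponding form for the u index.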

The IDV(1d) index
The IDV(1d) index has recently been introduced by Lockwood et al. (2013a). This is very similar to IDV with two differences. The first is that it employs daily means rather than the near-midnight hourly mean or spot values: in other words, Lockwood et al. (2013a) returned to the formulation used by Bartels (1932) to generate u. This means that 24 times as much data are used as in generating IDV because data from the 23 UT-hours away from local midnight are not discarded. This has advantages in noise suppression by averaging. Lockwood et al. (2013a) adopted the name IDV(1d) (rather than reverting to the name u) because of the other major difference, namely that the IDV(1d) composite is homogeneous in its construction (i.e., it uses the philosophy of aa and not that of IDV and m), using data from three intercalibrated stations sequentially to form a composite. Data from Helsinki were used for 1846-1890 (inclusive) and 1893-1897 and from Eskdalemuir from 1911 to the present day. The gaps are filled using data from the Potsdam (1891-1892 and 1898-1907) and the nearby Seddin observatories (1908-1910) and intercalibration was achieved using the Potsdam/Seddin/Niemegk data sequence for 1890-1931. To remove site effects and the effects of secular drifts in geomagnetic latitudes, the $1/\cos^{0.7}(\Lambda)$ dependence found by Svalgaard and Cliver (2005) was shown to apply and was used to make a small (< 5% between 1846 and 2013) correction to the data based on model predictions of the magnetic latitude of the stations, Λ. The IDV(1d) index extends back to the start of the Helsinki data in mid 1845. One key justification of IDV(1d) is that it correlates with the IMF $B$ as well as (in fact very slightly better than) IDV, despite the fact that it is based on data from just one station (which is Eskdalemuir throughout the space age) rather than the approximately 50 stations contributing to IDV at that time (see Figure 3).
One concern, however, is the use of the historic data from Helsinki which is at a higher corrected magnetic latitude and so more subject to auroral current contamination of the kind noted by Svalgaard and Cliver (2010) and Finch et al. (2008), and which could introduce a dependence on solar wind speed, $V_{SW}$. Models of the geomagnetic field give corrected magnetic latitudes of Helsinki varying between 55.5° and 56.5° over the interval that data are used from this station. The survey by Finch et al. (2008) found that the correlation with IMF $B$ began to drop above 60° (and that with $V_{SW}$ began to rise). Hence, Helsinki is close to being at too high a magnetic latitude. To investigate if this was a problem, Lockwood et al. (2013a) used modern data from the Nurmijärvi station (close to Helsinki) and compared IDV(1d) derived from them to that from Eskdalemuir. The correlation is 0.931 in 27-day means and 0.982 in annual means. Furthermore, the dependence of IDV(1d) from Nurmijärvi on $B V_{SW}^n$ was investigated and the peak correlation found near $n = 0$, very close to the value for Eskdalemuir (see Figure 3). The same tests were applied to modern data from the Niemegk station.
The homogeneous nature of IDV(1d) is a major advantage when making historic reconstructions of interplanetary parameters because one can have greater confidence that it will have responded to changes in the solar wind before the space age in the same way that it was observed to do during the space age. If an index is not constructed in a homogeneous manner then one cannot have that confidence to the same extent. Hence, for reconstructions of interplanetary parameters, homogeneously constructed indices such as IDV(1d) and aa are preferable to inhomogeneous ones such as IDV and m.

The m index
The m index was introduced by Lockwood et al. (2006b) and used by Rouillard et al. (2007) and Lockwood et al. (2009d). For each station at a given UT, the standard deviation of the hourly means of the horizontal component of the geomagnetic field is computed over a full year, $\sigma_H(1\,\mathrm{yr})$. These are then correlated with, and linearly regressed against, the annual means of aa shown in Figure 5 to yield $m' = s \times \sigma_H(1\,\mathrm{yr}) + c$, with regression slope $s$ and intercept $c$. These normalisations are needed because both the sensitivity and offset for a station were shown to depend on its location and on the UT hour (which, for example, alters the location of the station relative to the midnight-sector auroral oval) (Finch, 2008). Each station-UT is treated as an independent data series. The median of these data series is used as it is less influenced by extreme outliers than the arithmetic mean. Somewhat conservative criteria are used for the inclusion of data, in that annual means of the station-UT time series must correlate reasonably well (correlation coefficient > 0.5) with those of aa. In addition, m does not employ any isolated fragments of data from stations that ceased operating before the start of the space age and only uses data from stations that continued to take data into the space age (or for which there was a nearby station, with which one could make a composite, that did). The advantage of m over IDV is that data from all 24 UT-hours are employed, as opposed to just the one (near midnight) value used by IDV. The disadvantage is that its compilation is much more complex and time-consuming than that of IDV and so new data cannot be as readily added. Furthermore, m does not correlate as highly with interplanetary parameters as does IDV (see Figure 3). Lockwood et al. (2006b) consider that the m index is less reliable before 1902 because then it is based on data from just one station (Potsdam).
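The regress-then-take-the-median construction described for the m index can be sketched as follows. This is a simplified illustration, not the published algorithm: it assumes the annual standard deviations for each station-UT series are already available, uses an ordinary least-squares fit against a reference annual index, and applies only the correlation threshold mentioned in the text. The function name, argument names, and the minimum of three overlapping years are hypothetical choices.

```python
import numpy as np

def m_index(sigma, reference_annual, min_corr=0.5):
    """Median of station-UT series after linear normalisation.

    sigma: array (n_series, n_years) of annual standard deviations of
           hourly H, one row per station-UT series (NaN where no data).
    reference_annual: array (n_years,) of annual means of the reference
           index used for the regression.
    min_corr: series correlating at or below this value are excluded.
    """
    sigma = np.asarray(sigma, dtype=float)
    ref = np.asarray(reference_annual, dtype=float)
    normalised = []
    for series in sigma:
        ok = np.isfinite(series) & np.isfinite(ref)
        if ok.sum() < 3:                      # too few points to regress
            continue
        r = np.corrcoef(series[ok], ref[ok])[0, 1]
        if not r > min_corr:                  # also rejects NaN correlations
            continue
        slope, intercept = np.polyfit(series[ok], ref[ok], 1)
        normalised.append(slope * series + intercept)
    if not normalised:
        return np.full(ref.shape, np.nan)
    # The median is less influenced by extreme outliers than the mean.
    return np.nanmedian(np.vstack(normalised), axis=0)
```

Because every accepted series is mapped onto the scale of the reference index before the median is taken, differences in station sensitivity and offset do not bias the combined value.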

The IHV index
The IHV index was devised and introduced by Svalgaard et al. (2003) and Svalgaard and Cliver (2007a) and uses only nightside data to minimise the effect of the diurnal variation.
IHV for a given station is defined as the sum of the absolute values of the difference between hourly means (or spot values) for a specified geomagnetic component from one hour to the next over the 7-hour interval around local midnight. The variation with the corrected magnetic latitude shows strong peaks in the auroral oval, indicating it responds most to the variability in the nightside westward auroral electrojet and so it behaves rather like AL. Because the variation with corrected geomagnetic latitude is flat equatorward of 55°, only stations equatorward of this were employed in the global IHV index. The normalisation, grouping and averaging of data from different stations to obtain a global index is described in Svalgaard and Cliver (2007a).
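The single-night IHV definition above is simple enough to express directly. The sketch below assumes that the 7-hour interval is represented by seven consecutive hourly values, which gives six hour-to-hour differences; that reading, the function name, and the choice to return NaN for incomplete nights are assumptions made for illustration.

```python
import numpy as np

def ihv_for_night(hourly_values):
    """Sum of absolute hour-to-hour differences around local midnight.

    hourly_values: seven consecutive hourly means (or spot values, nT) of
    one geomagnetic component, centred on local midnight.
    Returns NaN if the night is incomplete.
    """
    v = np.asarray(hourly_values, dtype=float)
    if v.shape != (7,) or np.isnan(v).any():
        return float("nan")
    return float(np.sum(np.abs(np.diff(v))))
```

A night with values 100, 102, 101, 105, 105, 103, 104 nT has differences of 2, 1, 4, 0, 2 and 1 nT in absolute value, so the function returns 10.0.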

2.2.6 "sigma-H" indices
Like IDV(1d) and m, $\sigma_H$ is based on hourly mean data. $\sigma_H(T)$ at a given station is defined as the value of the standard deviation of the hourly-averaged H values at a given UT over a period of $T$ days, each single UT-hour being treated separately, as for m. There will therefore be 24 values of $\sigma_H(T)$ for each period of $T$ days at each station. (Note that the m index is, using this notation, the median for all available station-UTs of the $\sigma_H(T)$ values with $T$ = 366 days for leap years and 365 days for all other years). Finch et al. (2008) used $T$ = 28 days (close to the solar rotation period, as seen from Earth, which is the Carrington rotation period of 27.2753 days), which gives thirteen 28-day periods per year (with any excess days assigned to the final such period in the year). It is a different measure of the inter-diurnal variation quantified by the interdiurnal variability indices described above.

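The definition of the "sigma-H" index at one station can be sketched as below: a standard deviation taken separately for each UT hour over consecutive periods of T days, with any excess days at the end of the year assigned to the final period. The population form of the standard deviation (NumPy's default) and the function and argument names are assumptions made for illustration.

```python
import numpy as np

def sigma_h(hourly_h, period_days=28):
    """Standard deviation of hourly H at each UT over successive periods.

    hourly_h: array of shape (n_days, 24) of hourly mean H (nT) for one
              station and one year; NaN marks missing hours.
    period_days: length T of each period in days.
    Returns an array of shape (n_periods, 24), one row per period and one
    column per UT hour.
    """
    h = np.asarray(hourly_h, dtype=float)
    n_days = h.shape[0]
    n_periods = n_days // period_days
    out = np.full((n_periods, 24), np.nan)
    for p in range(n_periods):
        start = p * period_days
        # Excess days at the end of the year go into the final period.
        stop = n_days if p == n_periods - 1 else start + period_days
        out[p] = np.nanstd(h[start:stop], axis=0)
    return out
```

For a 365-day year and T = 28 days this yields thirteen periods, the last of which covers 29 days.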
Dependencies of the various indices on interplanetary parameters
In-situ spacecraft data on the near-Earth interplanetary medium became increasingly available from 1963, at the start of the space age. Early studies comparing geomagnetic activity to the near-Earth interplanetary parameters (e.g., Arnoldy, 1971) showed that geomagnetic activity was enhanced when the interplanetary magnetic field (IMF) pointed southward in a reference frame aligned by Earth's magnetic axis: Geocentric Solar Magnetospheric, GSM, is widely used (Russell, 1971;Hapgood, 1992). This had been predicted in the seminal paper by Dungey (1961), who proposed that for this IMF orientation, magnetic reconnection in the dayside magnetopause current sheet would allow the solar wind to drive stronger F-region ionospheric flows (convection) and hence the associated E-region ionospheric currents and geomagnetic activity seen at Earth's surface would also be stronger. The southward IMF orientation in GSM occurs for 50% of the time (Hapgood et al., 1991). The DP2 or "directly driven" currents respond to IMF variations with a lag of a few minutes (Nishida, 1968), whereas the larger DP1 or "storage-release system" currents are enhanced during substorm expansion phases following a lag of typically one hour (e.g., Baker et al., 1981). The high latitude auroral currents link to the magnetospheric ring current via the Region-2 field-aligned currents, as shown in Figure 2. The ring current has long been understood in terms of injection and decay of the trapped particles that carry it (Burton et al., 1975) and the injection is more efficient when the interplanetary magnetic field points southward (see, e.g., Shi et al., 2012). The response is complicated by the fact that the interplanetary electric field also influences the decay of the ring current and there are other, internal magnetospheric factors which influence both the injection and the decay (see reviews by Kozyra and Liemohn, 2003;Pulkkinen, 2007). 
Enhancements of the ring current cause negative depressions in the Dst index but will also influence other geomagnetic indices. Figure 3 explores the dependence, on annual averaging timescales, of the geomagnetic indices described in Sections 2.1 and 2.2 on the solar wind speed, $V_{SW}$. The correlation between each index and $B V_{SW}^n$ is presented, where $B$ is the IMF field strength and $n$ is an exponent that is here varied between -2 and 4. The correlations are for annual means between 1966 and 2012, inclusive. Parameters marked with a prime denote that data have been omitted in computing both sets of annual means if any of the simultaneous (allowing for the predicted satellite-to-Earth solar wind propagation lag) hourly means of $B$, $V_{SW}$ or the geomagnetic index are missing due to a data gap. In the case of the 3-hour range indices aa, Am and Ap, the procedure adopted by Finch and Lockwood (2007) is followed to ensure only simultaneous geomagnetic and IMF data are included in the annual means. In the case of IDV(1d), each daily value contains information on H from two whole days: in order to be included in the annual means, we here require that there be 75% coverage of the IMF observations over those two days. The value of 75% is chosen as a compromise between not eliminating too much of the data and removing data for which the interplanetary means could be misleading because the data coverage is low. The effects of not carrying out this piecewise removal of data from both sets during data gaps were studied by Finch and Lockwood (2007): effectively one is assuming that annual means are representative, even when large fractions of the data are missing (as they are in some years for the interplanetary data). Even with the piecewise removal of data during data gaps, we here only employ annual means that have data availability exceeding 50% to avoid years of reduced data having undue weight.
In the study presented in Figure 3, all the correlations are somewhat improved by taking these steps and, importantly, the $n$ of peak correlation is sometimes also affected. Note that only annual mean data for IDV and IHV have been published and the way m is generated only yields annual values: as a result, no allowance for gaps in the interplanetary data can be made in these three cases (hence there is no prime symbol attached to IDV, IHV, or m in Figure 3). The coupling functions $B V_{SW}^n$ have been calculated in hourly data and then averaged, so that $\langle B V_{SW}^n \rangle_{1yr}$ is used rather than $\langle B \rangle_{1yr} (\langle V_{SW} \rangle_{1yr})^n$. The auroral electrojet index AL (red line) shows peak correlation at $n = 2$, i.e., it has a $B V_{SW}^2$ dependence. The AU index (green line) gives a peak at $n = 1.1$ (i.e., it has close to a $B V_{SW}$ dependence and, hence, varies with the interplanetary electric field). The Dst index shows a peak at $n = 0.4$ (blue line) but some of this dependence on $V_{SW}$ arises from the compression of the equatorial field by enhanced solar wind dynamic pressure: if we use only the negative part of Dst ($Dst_1$, which is the same as Dst but treats all intervals where Dst > 0 as data gaps and so only contains intervals when Dst is dominated by ring current effects), we get the dashed blue line with a higher correlation coefficient peak at $n = 0.1$. This peak is flat and, hence, the peak $n$ is not significantly different from zero (i.e., the dependence is on $B$ alone). The cyan line is for the aa index and peaks at $n = 1.9$ (very close to the $B V_{SW}^2$ dependence of AL), the mauve line is for Am and peaks at $n = 1.8$ and the orange line is for Ap and peaks at $n = 1.6$. The black line is for the IDV(1d) index, which peaks at $n$ near −0.1. Hence IDV(1d), like IDV and $Dst_1$, is not significantly different from having a dependence on $B$ only. Thus, as concluded by Svalgaard and Cliver (2010), the negative part of Dst (i.e., ring current enhancement) is closest to explaining the behaviour of the interdiurnal variability indices on these annual timescales.
The range indices aa, Am and Ap respond in a manner similar to the auroral indices and, in particular, the influence of the westward auroral electrojet on aa (as monitored by AL) can be inferred from the fact that both have a dependence that is not significantly different from $B V_{SW}^2$. The correlation for Ap peaks at a slightly lower $n$ than for AL, aa, or Am, which may reflect a greater influence of the directly-driven currents or may be the effect of the ring current (as both AU and Dst give peaks at lower $n$). The IHV index correlation peaks at $n = 1.9$ and so, as pointed out by Svalgaard et al. (2003) and Svalgaard and Cliver (2007a), IHV behaves very much like AL and all the range indices with a $B V_{SW}^2$ dependence, as expected because it is a monitor of the nightside auroral electrojet.
The index (yellow line) correlation peaks at = 0.3 and there are a number of possible reasons why this value of exceeds zero. It could be that the response of is set by a mixture of the ring current (with its = 0 dependence) and the DP2 auroral currents (with their = 1 dependence). An alternative explanation is that the normalisation against the index in the derivation of has introduced a small dependence on SW . We also note that employed data from some auroral stations such as Sodankylä, which, as discussed in Section 6, introduces a 2 SW dependence into values.
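The correlogram analysis described above can be sketched in a few lines. This is a minimal illustration and not the code used for Figure 3: it assumes the trial coupling function has the form B × V_SW^n, the helper names (`pearson`, `correlogram`, `peak_exponent`) are hypothetical, the annual means are synthetic rather than observed, and no allowance is made for data gaps.

```python
import math
import random

def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlogram(index, b, v, exponents):
    """Correlate an activity index with B * V**n for each trial exponent n."""
    return [(n, pearson(index, [bi * vi ** n for bi, vi in zip(b, v)]))
            for n in exponents]

def peak_exponent(index, b, v, exponents):
    """Return the trial exponent n that gives the highest correlation."""
    return max(correlogram(index, b, v, exponents), key=lambda pair: pair[1])[0]

# Synthetic annual means (NOT real data): B in nT, V in km/s.
rng = random.Random(0)
b = [rng.uniform(4.0, 9.0) for _ in range(40)]
v = [rng.uniform(350.0, 550.0) for _ in range(40)]
synthetic_index = [bi * vi ** 2 for bi, vi in zip(b, v)]  # built with n = 2

exponents = [k / 10 for k in range(-10, 31)]  # trial n from -1.0 to 3.0
best_n = peak_exponent(synthetic_index, b, v, exponents)
```

With real indices the peak is often broad, so the exponent of peak correlation should be quoted together with some measure of how flat the correlogram is around it.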
3 The Long-Term Variability of Geomagnetic Activity

The first homogeneous, long-term record of geomagnetic activity was the aa index compiled by Mayaud, who analysed 100 years' data (1868-1968) from observatories in southern England and near their antipodal locations in Australia (Mayaud, 1971, 1972, 1980). In each hemisphere, three different stations were required to make a continuous record and, as for all such composites, intercalibration problems between the different stations arise. Means over calendar years and over 27-day Bartels solar rotation intervals are plotted in the top panel of Figure 4. When aa was first used to reconstruct the solar magnetic fields, there were several vociferous objections that, despite Mayaud's careful calibration work, the drift seen in Figure 4 was merely an instrumental artefact. There are, indeed, a great many problems that can cause long-term changes in the record from a magnetometer station: in addition to instrument changes, drifts and re-adjustments, a change in the local water table can have an influence, as can the construction of power or railway lines nearby and, on very long timescales, the secular drift in the magnetic poles of the Earth causes the geomagnetic coordinates of a station to drift. The argument that the aa values could not be as low as derived around 1900 by Mayaud has been proved to be wrong by the recent long and low solar minimum between solar cycles 23 and 24 (Russell et al., 2010; Lockwood, 2010). During this minimum (around 2008), comparably low annual mean aa was observed, as is shown by Figure 4. Analyses suggesting calibration problems were often based on comparisons with hourly mean data (Svalgaard et al., 2004; Mursula and Martini, 2007). However, it has become clear that hourly mean data are not observing the same mix of currents and phenomena as the range indices (see Section 6) and, hence, many of the differences are real rather than instrumental.
This is underlined by the bottom panel of Figure 4, which shows the corresponding means for another homogeneously-constructed long-term index, IDV(1d). Although many features can be seen in both indices, there are many differences, particularly in the 27-day means. The long-term change is seen in both indices despite 3 major differences between them: (1) they are constructed using data from entirely different observatories, (2) one uses hourly means and the other range data, and (3) the compilation algorithms (including the removal of quiet-day variations and secular change in station latitudes) are entirely different. The correlation coefficients between aa and IDV(1d) for 27-day and annual means (for 1868-2013) are 0.68 and 0.76, respectively. These correlations should be compared with those over the same interval between the independent indices for the northern and southern hemispheres, $aa_N$ and $aa_S$, which are 0.94 for the 27-day means and 0.98 for the annual means.
A number of tests on aa have been carried out (e.g., Lockwood, 2001; Clilverd et al., 2002; Cliver and Ling, 2002; Lockwood, 2003; Clilverd et al., 2005; Lockwood et al., 2006b; Lu et al., 2012) which show it to be a reasonable indicator of long-term change. Furthermore, studies of potential factors identified a solar origin of the long-term drift (Clilverd et al., 1998; Stamper et al., 1999; Clilverd et al., 2002). However, there is also evidence of some error in the aa index, as stored in most data centres at the present time. The first authors to suggest errors in aa were Svalgaard et al. (2004) who compared aa against the IHV index: indeed they argued that all centennial change in aa was erroneous. Comparing aa with IHV is a valid test of aa because, as shown in Figure 3, they both correlate best with $B V_{SW}^n$ for $n$ near 2. The initial comparisons by Svalgaard et al. (2004) found an almost negligible change in IHV since 1900 which would imply early aa values were too low: quantitatively they found the mean error in aa was 8.1 nT over solar cycle 14 (1901-1912, inclusive), which, considering that the mean aa over this cycle was lower than the mean for cycles 20, 21, and 22 by the same amount, means that they argued that all the long-term change in aa was erroneous. However, this early version of IHV was based on just one composite data series from two very nearby stations, Cheltenham and Fredericksburg (intercalibrated using the available 0.75 yr of overlapping data in 1956). Using more stations, Mursula et al. (2004) found there was upward drift in IHV values over the 20th century, but it depended on the station studied; nevertheless they inferred that the drift in aa was too large. As a result, Svalgaard et al. (2003) revised their estimates, using several stations, such that the cycle-14 mean of aa was too low by 5.2 nT (this would mean that 64% of the drift in aa was erroneous).
However, Mursula and Martini (2006) showed that about half of this difference was actually in the IHV estimates, not aa, and was caused by the use of spot values rather than hourly means in constructing the early IHV data. This was corrected by Svalgaard and Cliver (2007a) who revised their estimate of the difference further downward to 3 nT. These authors also showed that most of the difference arose in a 6-year interval around 1957, which is the time of the move of the northern hemisphere aa station from Abinger to Hartland. Independently, Lockwood et al. (2006b) carried out tests of aa using the range index which has been constructed since 1936 from 11-13 northern hemisphere stations, and range indices from a number of other stations (thereby ensuring that they were comparing like-with-like). They also found a step-like change around 1957 and estimated it to be about 2 nT in magnitude. Because 1957 was only 11 years before the end of the data series available to Mayaud and because in that time solar cycle 20 was rather unusual, this discontinuity in aa was not as apparent in the original data as it is now. Other studies also indicate that aa needs adjusting by about 2 nT at this date (Jarvis, 2004; Martini et al., 2012). The 2 nT discontinuity estimate corresponds to an error in the drift in aa between cycle 14 and the space age of about 25%. Lockwood et al. (2006b) implemented revised calibrations between stations (the largest change needed being for 1957) and, hence, derived a revised index series $aa_C$. Figure 5 shows that the difference between annual means of aa and $aa_C$ is generally less than 2 nT, which is considerably smaller than the range of the long-term drift in annual means over the last 150 years (approximately 12 nT at sunspot minimum and 16 nT at sunspot maximum).
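The percentages quoted in this discussion follow from simple ratios, which can be checked as below. This is only an arithmetic sketch: it assumes the reference drift between cycle 14 and cycles 20-22 is the 8.1 nT quoted above, and the dictionary labels are descriptive names chosen here, not terms from the cited papers.

```python
# Reference drift between cycle 14 and cycles 20-22, as quoted in the text (nT).
drift_nT = 8.1

# Successive estimates of the error in the cycle-14 mean (nT), as quoted in the text.
estimates_nT = {
    "revised multi-station estimate": 5.2,
    "after spot-value correction": 3.0,
    "step near 1957": 2.0,
}

# Fraction of the long-term drift that each estimate would attribute to error.
fractions = {label: value / drift_nT for label, value in estimates_nT.items()}

for label, fraction in fractions.items():
    print(f"{label}: {100 * fraction:.0f}% of the drift")
```

The first and last ratios reproduce the 64% and "about 25%" figures given in the text; the middle one is not quoted there and is included only for comparison.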
Many historic datasets exist in the form of hourly mean data (or in the case of some of the earliest data, spot values within the hour) and these have recently also been used to generate indices. Until recently many were in the form of paper records in observatory yearbooks. However, in recent years many have been digitised, making a valuable new extra resource for reconstruction work. Figure 6 shows the variation of the "median index", m (Lockwood et al., 2006b). The construction of this index recognises that the response to global geomagnetic activity at a given observatory depends upon its magnetic local time (MLT) and, hence, on the Universal Time (UT). However, the station gives information at all UT and so rather than discard data from all but one MLT, the index treats each station-UT as a separate data series. To avoid outliers having a disproportionate effect, m is defined as the median of all the normalised annual values for the different station-UT combinations. The black line in the upper panel of Figure 6 is m, which also shows a similar long-term variation to the annual means of aa and IDV(1d) shown in Figure 4. Figure 7 shows the IDV index compiled by Svalgaard and Cliver (2010). These authors take the series back to 1835, just 3 years after the establishment of the first magnetic observatory in Göttingen. This is done using a linear correlation between IDV proper and the u index. However, it must be remembered that u is not an inter-diurnal variability index before 1872 and Bartels did not regard all the data before this date as reliable.

Figure 6: The median index, m. For each station at a given UT, the standard deviation of the hourly means of the horizontal component of the geomagnetic field is computed over a full year, $\sigma_{H,1\,\mathrm{yr}}$. These are then correlated with, and linearly regressed against, the annual means of aa shown in Figure 4 to yield $m' = s \times \sigma_{H,1\,\mathrm{yr}} + c$ (with sensitivity $s$ and offset $c$). These normalisations are needed because both the sensitivity and offset for a station have been shown to depend on its location and on the UT hour (which, for example, alters the location of the station relative to the midnight-sector auroral oval) (Finch, 2008). Each station-UT is treated as an independent data series. The grey lines show the variations of the $m'$ values for all station-UTs for which the correlation coefficient exceeds 0.5 and is significant at the 2σ level. The number of station-UTs meeting this criterion is shown as a function of time by the black histogram in the lower panel. The black line in the upper panel is the median of all the available data for each year and is called the "median index", m. Image reproduced from Lockwood et al. (2006b).
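The construction described in the Figure 6 caption can be sketched as follows. This is a minimal sketch under stated assumptions, not the published algorithm: the function and variable names are hypothetical, the numbers are synthetic, and the significance test mentioned in the caption is omitted (only the correlation threshold of 0.5 is applied).

```python
import math
from statistics import median

def linear_fit(x, y):
    """Least-squares slope, intercept and correlation coefficient of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

def median_index(sigma_by_series, aa_by_year, r_min=0.5):
    """Median over station-UT series of annual values normalised to aa.

    sigma_by_series: {series_id: {year: annual standard deviation of hourly H}}
    aa_by_year:      {year: annual mean aa}
    """
    scaled = {}  # year -> normalised values from the accepted series
    for series in sigma_by_series.values():
        years = sorted(y for y in series if y in aa_by_year)
        if len(years) < 3:
            continue  # too short an overlap with aa to regress
        slope, intercept, r = linear_fit([series[y] for y in years],
                                         [aa_by_year[y] for y in years])
        if r <= r_min:
            continue  # reject poorly correlated station-UTs
        for year, sigma in series.items():
            scaled.setdefault(year, []).append(slope * sigma + intercept)
    return {year: median(values) for year, values in sorted(scaled.items())}

# Synthetic illustration only (values are invented, not observations).
aa_by_year = {2000: 25.0, 2001: 22.0, 2002: 23.0, 2003: 37.0, 2004: 23.0}
sigma_by_series = {
    "stationA_00UT": {1999: 8.0, 2000: 10.0, 2001: 8.5, 2002: 9.0, 2003: 16.0, 2004: 9.0},
    "stationB_12UT": {2000: 6.25, 2001: 5.5, 2002: 5.75, 2003: 9.25, 2004: 5.75},
    "noisy_06UT": {2000: 5.0, 2001: 6.0, 2002: 5.5, 2003: 1.0, 2004: 5.8},
}
m = median_index(sigma_by_series, aa_by_year)
```

Using the median rather than the mean means that a single badly behaved station-UT that passes the correlation test has limited influence, provided enough other series are available in that year.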
Figure 7: The IDV index compiled by Svalgaard and Cliver (2010). The grey curves are the variations for individual stations. The red curve is the IDV index, defined as the arithmetic mean of the median and average values of the individual station values. A few station values were very large outliers of the distribution at any one time and those that were more than five standard deviations from the average were omitted in calculating the value for that year. The number of contributing stations is shown by the thin blue curve. The dashed blue line is the corresponding number of stations used by . Bartels' u index is considered a single station and gives the dotted line extension to before 1871 using a linear regression of 1871-1930 data with the IDV index proper. Image reproduced by permission from Svalgaard and Cliver (2010), copyright by AGU.
Note that for both m and IDV, the number of stations used decreases as one goes back in time, which contrasts with Mayaud's philosophy for aa, which was to derive a homogeneously-constructed data series. Given the potential for site-dependent errors and drifts, these indices therefore become increasingly unreliable as one goes further back in time. Svalgaard and Cliver (2010) state that "only a few (good) stations are needed for a robust determination of IDV". This is indeed a valid statement: for example, IDV(1d) (shown in the bottom panel of Figure 4) is based on just one station at any one time, yet for 1880-2013 it gives a correlation coefficient of 0.96 with IDV (Lockwood et al., 2013a). However, before 1880 the correlation is considerably lower. Svalgaard and Cliver (2010) note that they had to discard some data because they were more than 5σ from the mean. This poses a dilemma if there are too few stations to define the distribution: in such cases these outliers could not be identified and one would have used them, not knowing they were in error. In other words, without sufficient other stations to compare with, one is not able to say which the "good" stations are. It therefore is inevitable that inhomogeneous data series such as m and IDV are less reliable further back in history. Potential causes of additional uncertainty in early data are: (1) there were fewer stations; (2) measurement techniques and equipment improved with time; (3) the realisation that urban environments were generating magnetic noise problems forced moves to quieter observing sites; and (4) earlier data tend to be spot values rather than hourly means. On the last point, Svalgaard and Cliver (2010) could find no discontinuities in the data series from individual stations (unlike IHV values) when they changed from supplying spot values to hourly means.
Nevertheless, it is self-evidently true that hourly means are preferable to spot values, particularly if a site is suffering from any intermittent noise problems and/or if the instrument stability is poorer.
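The dilemma posed by a five-standard-deviation rejection criterion when few stations are available can be made concrete. The sketch below is an illustration under an assumption, namely that the standard deviation is the sample standard deviation computed from all stations including the suspect one; the cited authors' exact procedure may differ, and the station values are invented. Under that assumption, no value in a set of N can lie more than (N − 1)/√N standard deviations from the mean, so the criterion cannot reject anything unless N is large enough.

```python
import math
from statistics import mean, stdev

def max_possible_z(n_stations):
    """Largest deviation from the mean, in sample standard deviations,
    that any single value in a set of n_stations values can have."""
    return (n_stations - 1) / math.sqrt(n_stations)

def flag_outliers(values, n_sigma=5.0):
    """Flag values lying more than n_sigma sample standard deviations from the mean."""
    mu, sd = mean(values), stdev(values)
    return [abs(v - mu) > n_sigma * sd for v in values]

# Invented station values: one wildly wrong station among otherwise identical ones.
few_stations = [10.0] * 9 + [1000.0]    # 10 stations
many_stations = [10.0] * 39 + [1000.0]  # 40 stations

flags_few = flag_outliers(few_stations)
flags_many = flag_outliers(many_stations)

# Smallest number of stations for which a 5-sigma rejection is possible at all.
n_needed = min(n for n in range(2, 200) if max_possible_z(n) > 5.0)
```

Here the erroneous station is flagged among 40 stations but passes unnoticed among 10, however large its error.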
Figures 4, 6, and 7 all show similar long-term variations, despite the fact that the indices presented differ in almost every facet of their compilation. There are, however, important differences that are discussed in Section 6. These data from geomagnetic observatories give an invaluable resource for studying solar-terrestrial physics and solar variability in the 181 years since Gauss' first observatory was established in Göttingen. In particular, we can study the variations in the solar corona and interplanetary medium that accompany the long-term sunspot variations identified by Gleissberg (1944). Feynman and Crooker (1978) studied the implications of the drift in the aa index and concluded that either the solar wind speed or the IMF had changed over the past century. The first paper to separate these two influences (using the recurrence index of Sargent, 1986, to quantify solar wind speed), thereby showing that the main change was in the magnetic field, was by Lockwood et al. (1999a). These authors used aa to reconstruct the unsigned open solar flux, which is the total magnetic flux leaving the top of the solar corona and entering the heliosphere. Other solar-terrestrial phenomena, such as lower latitude auroras, were found to reveal the same long-term changes as aa and the derived open solar flux (for example, Pulkkinen et al., 2001).
4 A Note on the Importance of Understanding the Provenance of Geomagnetic Data

Before continuing, there is an important point that needs to be made about correcting and homogenising historic data. There is the potential to do much more harm than good, if corrections that are based on inadequate understanding (or, worse still, postulated theories) are allowed to modify a dataset but clear metadata and the means to reverse the changes, at any stage in the future, are not retained and made readily available. The full provenance of any one dataset is easily lost and without it such a change could be a massively retrograde step. I therefore strongly recommend that historic datasets that are re-processed should be re-named so they can be recognised for what they are and the original dataset must be retained. Hence, although at the present time it is reasonable, for example, to regard $aa_C$ as a corrected form of aa, should something in the revised inter-calibrations in future prove to be invalid or inadequate, then scientists can readily return to the original data. For this reason, Lockwood et al. (2006b) treated $aa_C$ as a different index to aa and gave it a new name.
A good example of the sort of problems that can arise is provided by the hourly mean data from the Eskdalemuir station. This observatory has operated continuously since 1911, when it was established by Kew observatory on a rural and exceptionally clean magnetic site when the Kew site was rendered too noisy by the introduction of trams into west London (Harrison, 2004). There was a discontinuity at 1932 in the commonly-used set of hourly mean data from this station, which had remained unnoticed until 2004, when Mursula et al. (2004) and Clilverd et al. (2005) analyzed the inter-hour variability of Eskdalemuir data and found very small values in the early part of the 20th century. Detective work by Leif Svalgaard established that prior to 1932 the data stored in the World Data Centre (WDC) system were 2-hour running means of the data recorded in the observatory yearbook. Such smoothing greatly influences inter-hour indices. MacMillan and Clarke (2011) have confirmed that this was indeed the case and digitised the data from the yearbook, so that all data from Eskdalemuir now available from WDC-C1 are hourly means with no running mean smoothing applied. (Users should check which dataset they are using because one problem with data that have been corrupted or massaged is that they are very hard to expunge from all datasets and bad data tend to resurface.) It is not known how, when, where, or why this post-processing was carried out because the available metadata did not tell us the full provenance of the data. Presumably somebody, somewhere had believed that the noise suppression obtained by implementing a running mean was a good thing. If one used daily means of the (supposed) hourly data there would have been some effect (as an hour of data from both the day before and the day after would be averaged in with half weight), but it would be small and the effect would be negligible on annual means.
It is fair to assume that whoever implemented the smoothing never envisaged the use of the data to generate an inter-hour variability index. This example illustrates very graphically the great importance of knowing, as far as is possible, the true provenance of historic data and of all the corrections and changes that may have subsequently been applied to them. Lockwood et al. (2013a) have revealed a similar issue with data from Ekaterinburg by implementing an inter-correlation of hourly mean data from a given station at different UTs as a check of data consistency: they found very high correlations around 1900, revealing that interpolation to hourly values from more sparse data had taken place. This is a vitally important concern for reconstruction work: being overly ready to accept an adjustment is highly irresponsible as it could deny future generations of scientists the opportunity to properly exploit the data or, in a worst case scenario, seriously mislead them (Council of AGU, 2009;Vogel, 1998).
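The Eskdalemuir example can be illustrated numerically. The sketch below is only indicative: it assumes the smoothing was a centred 2-hour running mean (half weight on each neighbouring hour, which is an interpretation of the description above), it uses uncorrelated synthetic noise rather than real magnetometer data, and the function names are hypothetical. It shows that such smoothing strongly suppresses an inter-hour variability measure while leaving the long-term mean essentially unchanged.

```python
import random
from statistics import mean

def two_hour_running_mean(hourly):
    """Centred 2-hour running mean: half weight on each neighbouring hour.
    The first and last samples are left unchanged."""
    out = list(hourly)
    for i in range(1, len(hourly) - 1):
        out[i] = 0.25 * hourly[i - 1] + 0.5 * hourly[i] + 0.25 * hourly[i + 1]
    return out

def inter_hour_variability(hourly):
    """Mean absolute difference between successive hourly values."""
    return mean(abs(b - a) for a, b in zip(hourly, hourly[1:]))

# One year of synthetic hourly values (nT): a constant baseline plus white noise.
rng = random.Random(1)
hourly = [20000.0 + rng.gauss(0.0, 5.0) for _ in range(24 * 365)]
smoothed = two_hour_running_mean(hourly)

ratio = inter_hour_variability(smoothed) / inter_hour_variability(hourly)
mean_shift = abs(mean(smoothed) - mean(hourly))
```

For real data, which are correlated from hour to hour, the suppression factor would differ from the white-noise case, but the qualitative contrast between inter-hour measures and long-term means remains.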
5 Solar Wind Coupling Functions, the Importance of Averaging and Allowance for Data Gaps

As discussed in Section 2.3, geomagnetic activity is enhanced when the northward component of the interplanetary magnetic field (IMF), in the GSM frame of reference, $B_z$, is increasingly negative. As a result, a half-wave rectified form of $B_z$ is often used to predict geomagnetic activity, such as $B_s$, where $B_s = -B_z$ when $B_z \le 0$ and $B_s = 0$ when $B_z > 0$. Because $B_s$ is discontinuous in slope around $B_z = 0$, a form such as $B \sin^4(\theta/2)$ is often preferred, where the IMF "clock angle" $\theta = \tan^{-1}(B_y/B_z)$, $B_y$ being the dawn-to-dusk component of the IMF in GSM. This has a very similar form to $B_s$ at large $|B_z|$ but is continuous in slope around zero. The power density in the solar wind at Earth is dominated by the kinetic energy of the bulk flow of the particles and so the square of the solar wind velocity, $V_{SW}$, is another important factor. Hence a simple "coupling function", designed to quantify the effect of the solar wind on geomagnetic activity, is $B_s \times V_{SW}^2$. A great many such coupling functions have been proposed and tested. One widely-used example is the "epsilon parameter", $\varepsilon = (4\pi/\mu_0)\, V_{SW} B^2 \sin^4(\theta/2)\, l_0^2$, where $\mu_0$ is the magnetic permeability of free space and $l_0$ is a scaling factor that allows for the cross-sectional area of the geomagnetic field presented to the solar wind. In practice this area reduces with increased solar wind dynamic pressure ($P_{SW} = m_{SW} N_{SW} V_{SW}^2$, where $m_{SW}$ is the mean solar wind ion mass and $N_{SW}$ is the number density of solar wind ions) but a constant value of $l_0 = 7 R_E$, where $R_E$ is a mean Earth radius, is often used.
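The two IMF orientation factors discussed above, the half-wave rectifier and the sin⁴ of half the clock angle, can be compared directly. This is a minimal sketch: the function names are hypothetical, and the use of `atan2` with the absolute value of the dawn-to-dusk component (so that the clock angle runs from 0 for due north to π for due south) is an assumption about the angle convention.

```python
import math

def half_wave_rectified_bs(bz):
    """Southward IMF component: -Bz when Bz <= 0, otherwise zero."""
    return -bz if bz <= 0 else 0.0

def clock_angle(by, bz):
    """IMF clock angle in the GSM Y-Z plane: 0 for due north, pi for due south."""
    return math.atan2(abs(by), bz)

def orientation_factor(by, bz):
    """sin^4(theta/2): 0 for due-north IMF, 1 for due-south IMF."""
    return math.sin(clock_angle(by, bz) / 2.0) ** 4

def simple_coupling(bz, v_sw):
    """Simple coupling function Bs * V^2 (units follow those of the inputs)."""
    return half_wave_rectified_bs(bz) * v_sw ** 2

# Both factors vanish for northward IMF and are maximal for southward IMF,
# but only the sin^4 form varies smoothly as the field rotates through the Y axis.
for by, bz in [(0.0, 5.0), (3.0, 0.0), (0.0, -5.0)]:
    b = math.hypot(by, bz)
    print(by, bz, half_wave_rectified_bs(bz), b * orientation_factor(by, bz))
```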
However, $\varepsilon$ is based on the energy density in the solar wind hitting the Earth's space environment being in the form of Poynting flux, which is not correct because by far the largest energy density in the undisturbed solar wind is in the form of the ions' bulk-flow kinetic energy (which is converted into Poynting flux by the currents that flow in Earth's bow shock and magnetopause, see Cowley, 1991). A correct version was provided by Vasyliunas et al. (1982) who applied dimensional analysis as well as the energy flow equations and used pressure balance on a hemispherical dayside magnetopause to compute $l_0$. From this they derived the coupling function $P_\alpha$, which is the power coupled into the magnetosphere. It is the product of the power density in the solar wind, times the cross sectional area of the magnetosphere presented to the solar wind, times the fraction of the incident power that crosses the magnetopause (Equation 1). From pressure balance at the nose of the magnetosphere, and assuming the dayside magnetosphere is hemispherical in shape, we have Equation (2), where $M_E$ is the Earth's magnetic moment and $k_1$ is the blunt-nose shape factor for flow around the magnetosphere. Vasyliunas et al. (1982) noted that the transfer function must be dimensionless and proposed a form, Equation (3), where $\alpha$ is a free fit parameter which arises from the unknown dependence of the coupling on the solar wind Alfvén Mach number, $M_A$. Combining Equations (1), (2), and (3) yields Equation (4).

Figure 8: [...] (green), $\varepsilon$ (red), $V_{SW}^2$ (olive), $V_{SW}$ (magenta), and $B$ (black). Image reproduced by permission from Finch and Lockwood (2007), copyright by EGU.
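How the three factors described above combine can be sketched numerically. The forms used here are assumptions and are not taken from the equations of this review: a stand-off scale varying as the inverse one-sixth power of the dynamic pressure, and a transfer function varying as $M_A^{-2\alpha} \sin^4(\theta/2)$, with all constants (shape factors, magnetic moment, $\mu_0$, ion mass) set to unity. Only the resulting power-law dependence on the solar wind parameters is illustrated, and the function names are hypothetical.

```python
import math

def p_alpha_from_factors(n_sw, v_sw, b, theta, alpha):
    """Power density x cross-sectional area x transfer function, constants set to 1."""
    power_density = 0.5 * n_sw * v_sw ** 3           # kinetic energy flux of the bulk flow
    p_dyn = n_sw * v_sw ** 2                         # solar wind dynamic pressure
    l0 = (1.0 / p_dyn) ** (1.0 / 6.0)                # assumed pressure-balance scale length
    area = math.pi * l0 ** 2                         # cross-section of the dayside magnetosphere
    mach_alfven = v_sw * math.sqrt(n_sw) / b         # Alfven Mach number (mu0 = m = 1)
    transfer = mach_alfven ** (-2.0 * alpha) * math.sin(theta / 2.0) ** 4
    return power_density * area * transfer

def p_alpha_power_law(n_sw, v_sw, b, theta, alpha):
    """The same dependence written as a single power law, up to a constant factor."""
    return (n_sw ** (2.0 / 3.0 - alpha)
            * v_sw ** (7.0 / 3.0 - 2.0 * alpha)
            * b ** (2.0 * alpha)
            * math.sin(theta / 2.0) ** 4)

# The ratio of the two forms is the same constant for any inputs.
ratio_1 = p_alpha_from_factors(5.0, 400.0, 6.0, 2.0, 0.3) / p_alpha_power_law(5.0, 400.0, 6.0, 2.0, 0.3)
ratio_2 = p_alpha_from_factors(12.0, 650.0, 9.0, 1.0, 0.3) / p_alpha_power_law(12.0, 650.0, 9.0, 1.0, 0.3)
```

Under these assumptions the dependence on the IMF magnitude disappears for α = 0 and strengthens as α increases, which is why α can be treated as a fit parameter.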

Mike Lockwood
A factor that should not be neglected is that although the geomagnetic data are essentially continuous, the same is far from true of the interplanetary data. Since 1995 the WIND and the ACE spacecraft have provided almost 100% coverage, but before then coverage had sometimes been lower than 50% in any one year. Finch and Lockwood (2007) showed that ignoring these data gaps has a considerable effect at a given averaging timescale T and can even change which coupling function performs best. Hence before making the correlation, Finch and Lockwood (2007) piece-wise removed both interplanetary and geomagnetic data for which there was a gap in the interplanetary data of duration one hour or greater during a 3-hour geomagnetic data interval (allowing for the predicted propagation lag between the interplanetary monitoring spacecraft and the dayside magnetopause). Figure 8 shows that for the full range of T, $P_\alpha$ (as given by Equation (4) and shown in dark blue) performs best for the range index , although for T > 27 days the much simpler function $B V_{SW}^2$ (light blue) gives correlations as high (or even slightly higher). The parameter $\varepsilon$ (red) performs as well as 2 SW (but less well than and 2 SW ) at low T and is considerably poorer at high T, reflecting its nonphysical basis. For the case shown, functions that do not combine both IMF and solar wind speed ($V_{SW}^2$ in olive, $V_{SW}$ in magenta, and $B$ in black) do not perform as well as those that do. There has been much discussion about the precise form of coupling function that performs best, but these discussions almost invariably neglect the fact that this conclusion depends on T and on which activity index is considered. The importance of this is discussed in Section 6. Note that in Figure 8 the high-performing functions $P_\alpha$ and $B V_{SW}^2$ reach correlation coefficients near 0.97 at T = 1 yr and that these do not fluctuate with the precise T value used to anything like the same extent as do the others.

Figure 9: The northward component of the IMF in the GSM frame, $B_z$, from the OMNI2 composite dataset. The left-hand plot shows the temporal variations of $\langle B_z \rangle_T$ for 1966-2013 at different averaging timescales, T. The right-hand plot shows the corresponding probability density functions over the same interval. Note the difference in the vertical scales. In both panels light blue is for T = 1 h, dark blue for T = 1 day, red for T = 27 days and black for T = 1 yr.
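The piece-wise removal of data around gaps in the interplanetary record, described in the text above, can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, missing hours are represented by `None`, and the propagation lag between the spacecraft and the magnetopause is not modelled.

```python
def mask_three_hour_intervals(hourly_imf, three_hourly_index):
    """Keep only 3-hour intervals with complete interplanetary coverage.

    hourly_imf:         hourly interplanetary values, with None marking a missing hour
    three_hourly_index: one geomagnetic value per consecutive 3-hour interval
    Returns the retained 3-hour interplanetary means and the matching index values.
    """
    kept_imf, kept_index = [], []
    for k, index_value in enumerate(three_hourly_index):
        block = hourly_imf[3 * k: 3 * k + 3]
        if len(block) < 3 or any(x is None for x in block):
            continue  # a gap of one hour or more: drop BOTH data series for this interval
        kept_imf.append(sum(block) / 3.0)
        kept_index.append(index_value)
    return kept_imf, kept_index

# Small invented example: the middle interval contains a one-hour gap.
imf = [1.0, 2.0, 3.0, 4.0, None, 6.0, 7.0, 8.0, 9.0]
index = [10.0, 20.0, 30.0]
kept_imf, kept_index = mask_three_hour_intervals(imf, index)
```

The important design point is that the geomagnetic value is discarded along with the interplanetary one, so that any longer-term averages of the two series are built from exactly the same intervals.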
One noticeable feature of Figure 8 is that for large T, $B V_{SW}^2$ performs as well as $P_\alpha$, which is interesting because it does not contain the $\sin^4(\theta/2)$ IMF orientation factor that Equation (4) shows is part of $P_\alpha$. Figures 9 and 10 explain why this is the case. The left-hand panel of Figure 9 shows the northward component of the IMF in the GSM reference frame, $B_z$, between 1966 and 2012 (inclusive). The different colours identify the timescale T on which the data are averaged before they are plotted. For T = 1 h (light blue), $B_z$ values vary between -30 nT and +30 nT (off scale), and periods of larger and smaller excursions, both positive and negative, are seen (corresponding to the peaks and minima of the solar cycle). The same is true for T = 1 day (dark blue) but the range of variation is reduced as intervals of opposite $B_z$ cancel to a great extent for the larger T. For T = 27 days (in red) the fluctuation level is very small and it has almost disappeared for T = 1 yr (black). The distributions of $B_z$ values over the interval are shown for each case in the right-hand plot. It can be seen that the averaging almost completely removes the orientation factor such that $\langle B_z \rangle_T$ tends to zero for large T. The effect of averaging is demonstrated by Figure 10 which shows scatter plots of $B \sin^4(\theta/2)$ as a function of the IMF magnitude $B$ for the 1966-2012 data. Part (a) is for hourly observations. It can be seen that there is large scatter between $B \sin^4(\theta/2) = B$ (when the field points directly southward so $\sin^4(\theta/2) = 1$) and $B \sin^4(\theta/2) = 0$ (when the field points directly northward so $\sin^4(\theta/2) = 0$). Part (b) is the same for 1-year averages. In this case there is a good linear relationship, with some scatter. Hence, on timescales of T = 1 yr, the IMF orientation factor is averaged out and the average southward IMF component and, hence, the level of geomagnetic activity, is proportional to $B$, as first noted by Stamper et al. (1999).
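The way averaging suppresses the fluctuations of the northward IMF component can be illustrated with synthetic data. The sketch below is not a model of the real IMF: it assumes uncorrelated, zero-mean Gaussian hourly values, for which the spread of the averages falls as one over the square root of the number of samples, whereas the real field is autocorrelated and its spread falls more slowly. The function name and the numbers are illustrative.

```python
import random
from statistics import mean, pstdev

def block_means(values, block):
    """Means over non-overlapping blocks of `block` consecutive samples."""
    return [mean(values[i:i + block])
            for i in range(0, len(values) - block + 1, block)]

# Forty 27-day intervals of synthetic hourly values (nT), zero mean.
rng = random.Random(42)
n_hours = 24 * 27 * 40
bz_hourly = [rng.gauss(0.0, 3.0) for _ in range(n_hours)]

# Spread (standard deviation) of the averaged values at each timescale.
spread = {label: pstdev(block_means(bz_hourly, block))
          for label, block in (("1 h", 1), ("1 day", 24), ("27 days", 24 * 27))}

for label, value in spread.items():
    print(f"T = {label}: spread of averaged values = {value:.2f} nT")
```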
This is the basic reason that we are able to make deductions about $B$ from geomagnetic activity when averaging is done on annual timescales. There is some information that could be extracted at higher time resolution, but Figure 9 shows that even for T = 27 days scatter will be introduced because this T is not sufficient to average out as much of the IMF orientation factor and at yet smaller T this scatter would render the results completely meaningless. In this review we restrict our attention to using T = 1 yr, for which the orientation factor is almost completely averaged out and for which correlations between the better coupling functions and geomagnetic activity of 0.97 can be obtained (as shown by Figure 8). Part (c) of Figure 10 shows the distribution of annual values of the ratio $B/[B \sin^4(\theta/2)]$. The mean value of this distribution is 3.251 and the standard deviation is 0.369. This distribution will be used in Section 9.4 in a quantitative analysis of the uncertainties in reconstructions. In addition, in order to derive the open solar flux, the modulus of the radial component of the IMF, away from the Sun ($B_r$, which is the same as $-B_x$ in the GSM frame) is usually used (see Section 7.3). $|B_r|$ can be obtained from $B$, again because of the effect of averaging. Because in one year roughly as much "Toward" IMF ($B_r < 0$) flux will be seen as "Away" ($B_r > 0$) flux, $\langle B_r \rangle$ will tend to zero when averaged over T = 1 yr and so $|B_r|$, or some equivalent which does not cancel Toward and Away flux (see Section 7.5), is needed. The orientation angles of the IMF, both the clock angle $\theta$ and the garden-hose angle $\gamma = \tan^{-1}(B_y/B_x)$, vary considerably on short time scales.
Figure 9 shows that the long-term average of $\theta$ is 90° and that of $\gamma$ is given by Parker spiral theory (Parker, 1958, 1963), which predicts Equation (5) for near-Earth interplanetary space, where $R_1$ is the mean Earth-Sun distance ($R_1$ = 1 Astronomical Unit, AU), and $\omega$ is the angular rotation velocity of the solar atmosphere with respect to the fixed stars. Equation (5) shows that the ratio $|B_r|/B$ can be predicted for a given $V_{SW}$. The left-hand panel of Figure 11 shows a scatter plot of the values of $|B_r|/B$ predicted by Equation (5) against the observed values for hourly means (T = 1 h). It can be seen that the large variations of $\theta$ and $\gamma$ on hourly timescales mean that there is no relationship between the observed and predicted values. On the other hand, the right-hand plot shows the scatter plot for annual means. (Note that the modulus of $B_r$ has here been taken of means over 1 day, i.e., $\langle |B_r|_{1\,\mathrm{d}} \rangle_{1\,\mathrm{yr}}$ is used.) For this timescale there is a linear relationship between the observed and predicted values. The observed values are lower than the predicted ones because of the use of T = 1 day in taking the modulus, which means there is some cancellation of Toward and Away field: this issue is discussed further in Section 7.5. Figure 11 demonstrates that Parker spiral theory can be used to predict $|B_r|$ from $B$ if the solar wind speed $V_{SW}$ is known and an annual averaging timescale is used.

Figure 11: Scatter plots of the predicted ratio $|B_r|/B$ from Parker spiral theory, where $B$ is the IMF magnitude and $|B_r|$ is the modulus of its radial component, as a function of the observed value of that ratio. The left-hand plot is for hourly observations, the right-hand plot for 1-year averages. In the case of the annual means, the modulus of daily means of observations, i.e., $\langle |B_r|_{1\,\mathrm{d}} \rangle_{1\,\mathrm{yr}}$, is used.
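The Parker spiral prediction discussed above can be evaluated numerically. The sketch below uses the ideal spiral in the solar equatorial plane, for which the tangent of the garden-hose angle is the rotation speed at the observer's distance divided by the solar wind speed. The exact form and sign conventions of the review's own equation are not reproduced here; the 25.38-day sidereal rotation period and the function names are assumptions made for illustration.

```python
import math

AU_M = 1.495978707e11                            # mean Earth-Sun distance (m)
OMEGA = 2.0 * math.pi / (25.38 * 86400.0)        # assumed sidereal solar rotation rate (rad/s)

def garden_hose_angle(v_sw_km_s, r_m=AU_M, omega=OMEGA):
    """Ideal Parker-spiral angle between the field and the radial direction (radians)."""
    return math.atan(omega * r_m / (v_sw_km_s * 1.0e3))

def radial_fraction(v_sw_km_s, r_m=AU_M, omega=OMEGA):
    """Predicted |B_r| / B for the ideal spiral, equal to the cosine of the spiral angle."""
    return math.cos(garden_hose_angle(v_sw_km_s, r_m, omega))

angle_deg_400 = math.degrees(garden_hose_angle(400.0))
fraction_400 = radial_fraction(400.0)
fraction_800 = radial_fraction(800.0)
```

For a typical slow solar wind speed the spiral angle at Earth is close to 45°, and faster wind gives a more radial field, so the predicted radial fraction rises with solar wind speed.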
6 Differences Between Range and Hourly Mean Geomagnetic Data and the Effect of Solar Wind Speed

Figure 3 shows that there is a consistent difference between interdiurnal variation indices and the range indices. The interdiurnal variation indices, such as , (1 ), and all correlate best with the IMF field strength on annual averaging timescales whereas mid-latitude range indices ( , , , -and its northern and southern hemisphere components and ) all correlate best with a coupling function close to 2 . We should not expect these two classes of indices to behave in the same way. Consider a "steady convection event" (see, for example, Lockwood et al., 2009a, and references therein) lasting, for example, 24 hours in which DP2 is enhanced but DP1 is not because of the lack of substorms. There has been debate about whether or not the ring current is enhanced during such events (Pulkkinen, 2007), the outcome of which appears to be that although ring current enhancements are weaker because of the lack of substorms, they are still present (for example Zhou et al., 2003). Given that interdiurnal variability indices appear to be particularly sensitive to the ring current and/or the DP2 system, steady convection events will influence these indices much more than range indices (which respond most strongly to the DP1 currents). On the other hand, because the substorm cycle of energy storage in the magnetospheric tail (associated with DP2) and its explosive release (associated with DP1) generally takes place within 3-hour intervals, we would expect to see strong signatures of substorms in the range indices, but weaker ones in interdiurnal range indices. This line of argument suggests that the ratio of sensitivities to DP1 and DP2/ may be greater for range indices than for interdiurnal variability indices.
In this section, we will discuss evidence that this is the case and show that it causes them to have different responses to variations in the interplanetary medium, such that the optimum coupling functions are not the same.
The paper by Finch et al. (2008) provides a very important insight. These authors devised the "sigma-H" indices for each station, based on hourly mean data (see Section 2.2.6). The series of ⟨ 28 ⟩ 1 yr values for each station were correlated with three interplanetary parameters: $B$, $V_{SW}$, and $B V_{SW}^2$. The zero-lag correlation coefficients are plotted in Figure 12 as a function of the modulus of the station's invariant geomagnetic latitude, |Λ|. The top panel of Figure 12 shows that the correlation coefficient for $B$ is very high (∼ 0.9, significant at the 2σ level) except at auroral latitudes (between the two vertical dashed lines) where it falls to a minimum of about 0.6. On the other hand, the correlation with $V_{SW}$ (middle panel) is low outside the auroral oval (generally below about 0.4), but rises to a peak of about 0.85 within the oval. The bottom panel shows that the correlation with $B V_{SW}^2$ is high, being ≈ 0.8 outside the oval and ≈ 0.9 within it. Thus the influence of $V_{SW}$ arises in the auroral oval, making $B V_{SW}^2$ correlate best there, but elsewhere $B$ alone provides the highest correlation. Figure 13 shows the MLT at which the peak correlations within the auroral oval occur. The primary source of the correlation with $V_{SW}$ occurs in the midnight MLT sector. This is when and where the westward electrojet of the DP1 system is most likely to be detected (Tomita et al., 2011). The maximum correlation with $B$ can occur at any time except in the midnight sector. Therefore, it is clear that the correlation with the solar wind velocity at yearly time scales is linked to the storage-release system of the magnetotail and the westward auroral electrojet of the DP1 current system. At all other locations the correlation is better with $B$ and appears to be more associated with the directly-driven DP2 currents and/or with the ring current.
This finding makes good sense, physically. The westward auroral electrojet of the DP1 current system is part of the "substorm current wedge" (see Figure 14), in which the dawn-to-dusk current in the near-Earth edge of the magnetospheric cross-tail current sheet is diverted during substorm expansion through the midnight-sector auroral oval (McPherron et al., 1973). In this schematic, (a) and (b) are for a substorm growth phase, and (c) and (d) are for a substorm expansion phase. The undisturbed IMF is draped over the nose of the magnetosphere by the slowing effect of the bow shock. The IMF in this schematic points due south in the GSM frame, which favours magnetic reconnection in the dayside magnetopause, driven by the large magnetic shear across the dayside magnetopause current sheet. This generates open magnetospheric flux that is appended to the tail by the solar wind flow, causing rises in the tail lobe field $B_{TL}$ and in the current in the near-Earth cross-tail current sheet during the growth phase. This rise comes to an end at the onset of the expansion phase, when the Earthward edge of the cross-tail current is disrupted in the area shown in grey in (c) and (d). This current is diverted down post-midnight field lines, along the westward electrojet in the ionosphere and back up pre-midnight field lines. This path is shown in mauve in (c) and is called the "substorm current wedge", which gives the magnetic disturbances classed as DP1. Within the wedge the field dipolarises as shown in (d) (the dashed line being the stretched field line before onset and the white arrow the sunward convection surge of the frozen-in plasma). Part (d) also shows open flux now being destroyed by reconnection at a "near-Earth neutral line". The DP2 currents that dominate during the growth phase are driven by the reconnection in the dayside magnetopause and so depend strongly on the southward IMF. In comparison, the cross-tail current diverted into the auroral electrojet (to give the DP1 disturbances) is set by $B_{TL}$, which depends on both the total open magnetospheric flux (and, hence, also on the IMF) and the solar wind dynamic pressure (∝ $V_{SW}^2$), which squeezes the tail where the current disruption takes place.
Figure 14 (caption fragment): In all panels magnetic reconnection is taking place in the dayside magnetopause. The current in the cross-tail current sheet is disrupted in the grey areas in (c) and (d). The mauve currents in (c) are the "substorm current wedge", within which the magnetospheric field lines "dipolarise" at onset (the dashed line in (d) shows the stretched field line before onset and the arrow the associated sunward convection surge of the frozen-in plasma). The "near-Earth neutral line" is also shown in (d).
This is because the cross-tail current disruption occurs close to the Earth where the magnetotail is still flaring (i.e., its radius is increasing with distance away from Earth), which means that enhanced dynamic pressure caused by enhanced solar wind velocity $V_{SW}$ can squeeze the tail at this location and so increase the field in both tail lobes, giving a higher magnetic shear across the cross-tail current sheet for a given amount of open magnetospheric flux in the tail lobes. Indeed, that there must be this $V_{SW}^2$ dependence in substorm-related phenomena can be seen by considering what happens further down the tail, at greater negative $X$ coordinates. Here the tail reaches its maximum, asymptotic radius and is no longer flaring, so the magnetopause becomes parallel to the solar wind flow and the dynamic pressure has no influence. The magnetopause location is here set by pressure balance between the magnetic pressure that dominates in the tail lobes, $B_{TL}^2/(2\mu_0)$, and the static pressure of the solar wind, dominated by the thermal pressure of the particles ($N_{SW} k_B T_{SW}$, where $N_{SW}$ is the solar wind number density, $k_B$ is Boltzmann's constant, and $T_{SW}$ is the solar wind temperature). Substorms occur because of the growth of open flux in the tail. Because, in the far tail, the lobe field $B_{TL}$ is set by pressure balance with the static pressure in the interplanetary medium, adding more open flux causes the far tail radius to increase, but $B_{TL}$ and the cross-tail current remain constant. Only closer to the Earth, where the tail is flaring, does the solar wind dynamic pressure act to constrain the tail radius so that the accumulation of open magnetospheric flux there causes a rise in $B_{TL}$ and in the cross-tail current. Hence, substorm phenomena such as the current disruption and the formation of the near-Earth neutral line must occur relatively close to the Earth and must have a dependence on solar wind dynamic pressure (as inferred from observations by Karlsson et al., 2000).
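The far-tail pressure balance invoked above, $B_{TL}^2/(2\mu_0) = N_{SW} k_B T_{SW}$, can be evaluated directly. The sketch below uses the single-species thermal pressure as written in the text; the density and temperature are assumed round numbers chosen only for illustration, and the function name is hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi      # permeability of free space, H/m
K_B = 1.380649e-23         # Boltzmann constant, J/K

def far_tail_lobe_field(n_sw, t_sw):
    """Lobe field (tesla) whose magnetic pressure balances the solar wind
    thermal pressure n_sw * k_B * t_sw (n_sw in m^-3, t_sw in K)."""
    static_pressure = n_sw * K_B * t_sw          # Pa
    return math.sqrt(2.0 * MU_0 * static_pressure)

# Assumed, illustrative solar wind values (5 cm^-3 and 10^5 K)
b_tl_nT = far_tail_lobe_field(n_sw=5e6, t_sw=1e5) * 1e9

# Doubling the density raises the balancing field by sqrt(2)
ratio = far_tail_lobe_field(1e7, 1e5) / far_tail_lobe_field(5e6, 1e5)
```

Note that the solar wind speed does not appear anywhere in this balance, which is the point of the argument: a $V_{SW}^2$ dependence can only enter where the tail is still flaring.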
The conclusion from the above considerations is that the current available to be diverted when the current wedge forms is higher when $V_{SW}$ is high. The net result is that the DP1 currents have a dependence on $V_{SW}^2$. Recalling the conclusion, discussed earlier in this section, that range indices are likely to be influenced by the DP1 currents to a much greater extent than interdiurnal variation indices, we would expect the optimum coupling function to have a dependence on $V_{SW}^2$ for range indices that is not seen for interdiurnal variation indices such as IDV and IDV(1d). That this is indeed the case was first noted and exploited by Svalgaard et al. (2003) and has subsequently been used by Svalgaard and Cliver (2007a), Rouillard et al. (2007), and Lockwood et al. (2009d). The difference is demonstrated for one pair of indices by Figure 15. Lockwood et al. (2009d) investigated the correlations of both indices with a general coupling function formed from the product of powers of $B$ and $V_{SW}$, and the two exponents giving peak correlation were quantified using the Nelder-Mead simplex search method (Nelder and Mead, 1965). Results were almost identical if the minimum r.m.s. fit residual was searched for. In both cases, the exponent of $B$ was extremely close to unity (as used in Figure 3) and the exponent of $V_{SW}$ was found to be 2 for one index and 0.3 for the other. Thus, as predicted above, the two indices differ markedly in their dependence on $V_{SW}$. Figure 16 underlines that the difference in the $V_{SW}$ exponent for the two cases is significant. If the exponent were to be the same in the two cases, its optimum value would be about 1.6, for which the combined significance peaks at 13%, so the difference between the exponents in the two cases is significant at the 87% level. Plots equivalent to Figure 15 and Figure 16 reveal that this difference is even more significant for the other combinations of indices considered.
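The exponent search described above can be illustrated with synthetic data. This is a sketch under stated assumptions, not the analysis of Lockwood et al. (2009d): a simple grid scan over the $V_{SW}$ exponent is swapped in for the Nelder-Mead simplex search, the exponent of $B$ is fixed at unity, and the two "indices" are generated here from random $B$ and $V_{SW}$ values with exponents 2 and 0.3 plus 1% noise. All names are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_exponent(index, B, V, n_grid):
    """Return (n, r) for the exponent n maximising corr(index, B * V**n)."""
    best_n, best_r = None, -2.0
    for n in n_grid:
        r = pearson_r(index, [b * v ** n for b, v in zip(B, V)])
        if r > best_r:
            best_n, best_r = n, r
    return best_n, best_r

rng = random.Random(0)                               # fixed seed: repeatable
B = [rng.uniform(4.0, 9.0) for _ in range(40)]       # synthetic annual B, nT
V = [rng.uniform(350.0, 550.0) for _ in range(40)]   # synthetic annual V_SW, km/s

# Synthetic indices: one range-like (B*V^2), one interdiurnal-like (B*V^0.3)
range_like = [b * v ** 2.0 * (1 + rng.gauss(0, 0.01)) for b, v in zip(B, V)]
idv_like = [b * v ** 0.3 * (1 + rng.gauss(0, 0.01)) for b, v in zip(B, V)]

n_grid = [i / 10 for i in range(-10, 41)]            # -1.0 to 4.0 in steps of 0.1
n_range, r_range = best_exponent(range_like, B, V, n_grid)
n_idv, r_idv = best_exponent(idv_like, B, V, n_grid)
```

A simplex search over both exponents would replace the grid scan in a fuller treatment; the grid is used here only because it needs no external library and makes the shape of the correlation peak easy to inspect.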
This difference between the dependence of the two geomagnetic indices on $V_{SW}$ is extremely useful. Using the best-fit regressions such as those shown in Figure 15, the variations of both $B$ and $V_{SW}$ can be derived.
Figure 15 (caption fragment): The best-fit regression lines are derived using the Bayesian least-squares regression fit procedure described by Rouillard et al. (2007). The correlation with $BV_{SW}^2$ is $r = 0.97$ and, using the autoregressive AR-1 red noise model, this correlation is found to be significant at greater than the $10^{-5}$ level.
The open solar flux, $F_S$, is the magnetic flux leaving the top of the solar atmosphere and entering the heliosphere. A notional surface, the "coronal source surface", is envisaged at the top of the corona, which is everywhere perpendicular to the field, and $F_S$ is the flux threading that surface and so is also called the "coronal source flux". This review (unless otherwise stated) considers the "signed" open flux, which means that only the flux of one polarity (inward or outward) is quantified. If we assume that Maxwell's equations hold, such that there are not significant numbers of magnetic monopoles inside the coronal source surface, then the inward and outward fluxes are equal and the "unsigned" open flux is simply twice the signed value. The source surface is usually taken to be a heliocentric sphere of radius $2.5 R_\odot$, where $R_\odot$ is a mean solar radius (Arge and Pizzo, 2000).

The potential field source surface (PFSS) method
Prior to the Ulysses mission, the open solar flux could only be evaluated from solar magnetograms using "potential field source surface" (PFSS) modelling (Schatten et al., 1969; Altschuler and Newkirk, 1969; Schatten, 1999). In this method, photospheric magnetic fields observed by a magnetograph are mapped through the solar corona to the source surface with a number of assumptions. This involves solving Laplace's equation within an annular volume above the photosphere in terms of a spherical harmonic expansion, the coefficients of which are derived from Carrington maps of the photospheric magnetic field (i.e., maps assembled over an entire solar rotation from magnetograms recorded by magnetographs on Earth's surface or on board a spacecraft in orbit around the L1 Lagrange point). Two major assumptions are that there are no temporal variations within the 27 days taken to build up the map and that there are no currents in the corona (these are neglected so as to allow unique solutions in closed form). To eliminate the possibility that such simple harmonic expansions would result in all of the magnetic field lines returning to the Sun within a small heliospheric distance, the coronal field was required to become radial at the outer boundary, the source surface.
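One consequence of the source-surface boundary condition can be sketched numerically. For a current-free field whose potential vanishes at the source surface (so that the field is radial there), each spherical harmonic of degree $l$ has a closed-form radial dependence. The expression coded below is that standard textbook result, stated here as an assumption since it is not given in the text; the function name and the choice of degrees are hypothetical.

```python
def pfss_radial_ratio(l, r, r_ss=2.5):
    """Ratio B_r(r) / B_r(photosphere) for one harmonic of degree l.

    Radii are in units of the solar radius. The potential is taken to be
    proportional to (1/r)**(l+1) - r**l / r_ss**(2*l+1), which is zero at
    r = r_ss, so the field is purely radial at the source surface.
    """
    if not 1.0 <= r <= r_ss:
        raise ValueError("r must lie between the photosphere and the source surface")
    inner = (l + 1) * (1.0 / r) ** (l + 2)                    # decaying term
    outer = l * r ** (l - 1) * (1.0 / r_ss) ** (2 * l + 1)    # term set by the boundary
    norm = (l + 1) + l * (1.0 / r_ss) ** (2 * l + 1)          # value at r = 1
    return (inner + outer) / norm

# Fraction of each harmonic's photospheric radial field surviving at 2.5 R_sun
ratios_at_source_surface = {l: pfss_radial_ratio(l, 2.5) for l in (1, 2, 5, 9)}
```

The ratios fall steeply with $l$, which illustrates why the field at the source surface, and hence the open flux estimate, is dominated by the lowest-order harmonics of the photospheric map.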
Despite its many assumptions and obvious limitations, PFSS has been very successful in the study of a wide range of solar and heliospheric phenomena, including: coronal structure as seen during eclipses (e.g., Smith and Schatten, 1970), end-to-end modelling of Earth-impacting coronal mass ejections (CMEs, e.g., Luhmann et al., 2004), coronal null points and CME release (e.g., Cook et al., 2009), interplanetary magnetic fields (e.g., Burlaga et al., 1978), heliospheric current sheet structure (e.g., Hoeksema et al., 1982), waves in the corona (e.g., Uchida et al., 1973), solar wind acceleration (e.g., Neugebauer et al., 1998; Marsch, 1999), stellar coronal fields (e.g., Jardine et al., 2002), coronal hole and fast solar wind stream evolution (e.g., Wang and Sheeley Jr, 1990), co-rotating interaction regions and associated cosmic ray modulation (e.g., Rouillard et al., 2007), solar wind speed prediction (e.g., Arge et al., 2002), solar wind density structure (e.g., Rouillard et al., 2010), pseudostreamers, and quantifying the open solar flux (discussed below). The method has also generated results that compare well with images that reveal field line structure in the corona and with the results of MHD modelling (see the Living Review by Mackay and Yeates, 2012).

Ulysses observations
The Ulysses spacecraft is the first to carry out a comprehensive survey of the magnetic field in the heliosphere outside the ecliptic plane: its orbit covers a wide range of heliocentric distances, $r$, and an almost full range of heliographic latitudes, Λ. This mission has generated a vitally important result in that it found that the average radial component of the heliospheric field was independent of Λ. This result was first found to apply as the satellite passed from the ecliptic plane to over the southern solar pole (Balogh et al., 1995). Subsequently, the result was confirmed by the pole-to-pole "fast latitude scans" during the perihelion passes of the spacecraft (Smith et al., 2001). As pointed out by Smith (2011, 2013), this "Ulysses result" was initially derived by averaging the radial field over the inferred Toward and Away sectors of the heliospheric field; however, Lockwood et al. (1999b) and Lockwood et al. (2009b), among others, have shown that the invariance with Λ was also found if the modulus of the radial field was employed over fixed averaging intervals. The differences between, and complementarity of, these two approaches are discussed in Section 7.5. Figure 17 illustrates the Ulysses result by showing data from the third perihelion pass of Ulysses and compares them to simultaneous data from the ACE spacecraft. The top panel clearly defines the two large polar coronal holes seen by Ulysses (where $V_{SW}$ is large), separated by a single streamer belt around perihelion (minimum $r$, see the black line in the bottom panel) where $V_{SW}$ is lower. This latitudinal structure is characteristic of the sunspot minimum conditions prevailing during this pass, in which Ulysses passed from 80° in the south to 80° in the north (black line, panel f). Note that perihelion (minimum $r$, panel g) is within the streamer belt at $r \approx 1.4$ AU $> r_1 = 1$ AU, so when both craft are in the streamer belt, Ulysses is at greater $r$ than ACE by about 0.4 AU.
Panel b emphasises that the variability of $V_{SW}$ is greater within the streamer belt. Panel c shows that the dominant field orientation changes from away to toward as the streamer belt is crossed, and panel d shows the modulus of the radial field seen by the two craft. Note that in this plot, the influence of the time constant T (on which the modulus is taken) has been eliminated by using a range of T down to 1 second and fitting the results so that the asymptotic value at T → 0 can be calculated (see Figure 2 of Lockwood et al., 2009b): this means that there is no cancellation of toward and away field in these values. The data have been normalised to $r = r_1 = 1$ AU using an $r$-squared dependence (i.e., the Ulysses values have been multiplied by $(r/r_1)^2$). It can be seen from Figure 17(d) that Ulysses observed almost the same range-normalised radial field as seen simultaneously by the ACE craft: there is a weak downward trend in both, owing to the pass taking place while the solar cycle was still declining slightly.
The first and third Ulysses perihelion passes were at sunspot minimum, but the second was close to sunspot maximum. That the Ulysses result applied in all three (Lockwood et al., 2009b) is an important demonstration of its generality. It is explained by the low plasma beta of the solar wind just after leaving the coronal source surface, i.e., the total magnetoplasma pressure is dominated by the magnetic pressure. This results in slightly non-radial flow close to the Sun, which smoothes out differences in the tangential pressure and, as this is dominated by $|B_r|^2/(2\mu_0)$, this renders the magnitude of the radial magnetic field, $|B_r|$, constant in latitude (Suess et al., 1998). It means that the signed (of one radial field polarity) open solar flux threading a heliocentric sphere of radius $r$, $F(r)$, can be computed from the 27-day mean of the modulus of the radial field as $F(r) = 2\pi r^2 \langle |B_r| \rangle_{27\,\mathrm{d}}$ (Equation 6). Averaging over 27 days ensures that any variations with Carrington longitude are averaged out.
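Because $|B_r|$ is independent of latitude, the signed flux through a heliocentric sphere is half the sphere's area multiplied by the mean $|B_r|$. The sketch below applies that relation, together with the $r$-squared normalisation to 1 AU used for the Ulysses data. The input field values are assumed round numbers for illustration only, and the function names are hypothetical.

```python
import math

AU = 1.495978707e11   # astronomical unit, m

def signed_open_flux(abs_br_nT, r_au=1.0):
    """Signed flux (Wb) through a heliocentric sphere of radius r_au (AU),
    assuming |B_r| is independent of latitude: F = 2 * pi * r^2 * <|B_r|>."""
    r = r_au * AU
    return 2.0 * math.pi * r ** 2 * abs_br_nT * 1e-9

def normalise_to_1au(abs_br_nT, r_au):
    """Scale |B_r| measured at r_au to 1 AU using an r-squared dependence."""
    return abs_br_nT * r_au ** 2

# Assumed example: a 27-day mean |B_r| of 3 nT observed at 1 AU
flux_1au = signed_open_flux(3.0)

# Assumed example: 1.5 nT observed at 1.4 AU, expressed as a 1 AU value
br_at_1au = normalise_to_1au(1.5, 1.4)
```

With these inputs the flux is a few times $10^{14}$ Wb. Scaling the field by $r^2$ and then evaluating the flux at 1 AU gives the same result as evaluating it at the observation distance, which is why the normalised radial fields from two spacecraft can be compared directly.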

Excess flux
In Figure 17 there is a slight difference between the derived flux threading the sphere on which ACE sits and that threading the sphere on which Ulysses sits, and this is shown by the grey area in panel (e). Lockwood et al. (2009b,c) have termed this difference the "excess flux", $\Delta F$, defined for general $r$ by Equation (7). Note that in Figure 17, the timescale on which the modulus is taken is T → 0. From comparison of Figures 17(e) and 17(a), it can be seen that $\Delta F$ is larger in the streamer belt than within the polar coronal holes. Owens et al. (2008a) surveyed Carrington-rotation means of magnetic field data recorded in the heliosphere (from 14 different spacecraft in different orbits) and compared them to the data taken simultaneously in near-Earth space. In their survey the modulus of the radial field $B_r$ was taken of averages over T = 1 h intervals (i.e., $|B_r|_{1\,\mathrm{h}}$). The $\Delta F$ values, computed using Equation (7), showed no variation with heliographic latitude, Λ (as expected from the Ulysses result), but did increase with heliocentric distance, $r$, as shown in Figure 18. This rise means that there are two components of the flux values computed using Equation (6): the coronal source flux, $F_S$ (which is the value of the flux $F(r)$ for $r = 2.5 R_\odot$), and a component which arises in the heliosphere (and grows with $r$). Note that Figure 18 shows that this rise is present at $r < r_1$ as well as $r > r_1$ (where $r_1 = 1$ AU), such that $\Delta F$ is negative at $r < r_1$ for the definition given in Equation (7). This increase can arise from a number of physical phenomena, including Alfvén wave growth, transients such as CMEs, kinematic effects of solar wind flow structure on the frozen-in field, and the outward propagating structures, such as plasmoids and folded flux tubes, generated by near-Sun reconnection of flux.
All these effects can give Toward and Away field structure in the heliosphere within a single coronal source sector (i.e., a region of constant unipolar radial field polarity at the source surface), and this heliospherically-imposed toward-and-away structure at general $r$ has a range of characteristic spatial scales such that it will only partially be averaged out by taking the modulus of a mean over a given timescale T (Lockwood et al., 2006a). Thus, the excess flux varies with T and, although most Alfvén waves are averaged out at relatively small T, the other phenomena are not. The difficulty is that using too large a T results in toward-and-away field that does map back to the coronal source structure also being cancelled. (Taken to its extreme, use of T = 27 days would result in a flux estimate of ≈ 0, as toward and away sector flux cancel.) There is no T at which the heliospheric effects are completely averaged out but none of the source sector structure is averaged out, and so using a fixed T to eliminate this effect must be a form of compromise. Lockwood et al. (2009c) devised a method to use the measurements of the tangential field and flow to map from a general $r$ to $r_1 = 1$ AU and thereby evaluate how much radial field structure has been amplified by longitudinal flow structure in the heliosphere. Such effects are known to take place: for example, prolonged intervals of a near-radial heliospheric field (Jones et al., 1998) have been explained by Riley and Gosling (2007) in terms of this effect. Because this was based on the frozen-in flux theorem and simple kinematics, Lockwood et al. (2009c) termed this the "kinematic correction". It is a simple, single correction that has to account for the variety of effects discussed above; however, because for all of these phenomena the frozen-in theorem applies, it does have the potential to make allowance for them.
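The trade-off in the choice of averaging timescale T can be demonstrated with a toy series. The sketch below is an illustration only: the hourly radial field is synthetic, with two source sectors of opposite polarity over 27 days and one hour of reversed ("folded") field every 12 hours, and the function name is hypothetical.

```python
def mean_abs_of_block_means(br, block):
    """Average B_r over consecutive blocks of `block` samples, take the
    modulus of each block mean, then average those moduli."""
    n_blocks = len(br) // block
    means = [sum(br[i * block:(i + 1) * block]) / block for i in range(n_blocks)]
    return sum(abs(m) for m in means) / n_blocks

# Synthetic hourly B_r (nT) over 27 days: an away sector then a toward sector,
# each containing short intervals of locally reversed field
hours = 27 * 24
br = []
for h in range(hours):
    sector = 1.0 if h < hours // 2 else -1.0    # source sector polarity
    fold = -1.0 if h % 12 == 0 else 1.0         # 1 h of reversed field every 12 h
    br.append(3.0 * sector * fold)

# Modulus taken after averaging over T = 1 h, 1 day, and the full 27 days
results = {T: mean_abs_of_block_means(br, T) for T in (1, 24, hours)}
```

With T = 1 h the reversed intervals are counted as if they were extra flux, with T = 1 day they largely cancel within each sector, and with T = 27 days the two source sectors cancel each other completely, so the estimate falls to zero.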
Short-period Alfvén waves, and any other small-scale polarity structure on timescales shorter than 1 h, were averaged out by using T = 1 h. The lines in Figure 18 give the distribution of the variations in excess flux predicted using the kinematic correction for each Carrington rotation from the mean variability in solar wind speed: it can be seen that these fit the distribution of observed values very well. Figure 19 compares the raw data shown in Figure 17 (for T → 0, top panel) with results obtained using T = 1 day (middle panel) and using T = 1 h with the kinematic correction of Lockwood et al. (2009b) (bottom panel). In all cases the thick black line is the Ulysses data, normalised to $r = r_1 = 1$ AU by multiplying by $(r/r_1)^2$, and the thin line is from the simultaneous ACE data in the ecliptic plane. The middle panel shows that using an averaging timescale T = 1 day before taking the modulus has reduced all the ACE data, which is expected because ACE is within the streamer belt and observed both Toward and Away field, which cancel to some degree when T = 1 day is used. The Ulysses values are also reduced in the same way, particularly while it is within the streamer belt, but also to a lesser extent while the spacecraft is within the polar coronal holes: hence, although good agreement is now obtained when both craft are within the streamer belt, this is not the case when Ulysses is in the polar coronal holes. The third panel shows the effect of applying the kinematic correction computed from the solar wind speed variability using the equations of Lockwood et al. (2009b). With this correction, the Ulysses data no longer show enhanced values where the solar wind speed variability is enhanced, and good agreement is obtained between the Ulysses and ACE data throughout the pass.
As noted above, the open solar flux $F_S$ is the value of the flux $F(r)$ threading a heliocentric sphere of radius $r$, for $r = 2.5 R_\odot$. Inserting this into Equation (7) and re-arranging, we get Equation (8), which allows us to use near-Earth IMF data to compute the open solar flux; in it, $\Delta F_S$ is the excess flux between the coronal source surface and near-Earth space (defined by Equation (7) for $r = 2.5 R_\odot$). Lockwood and Owens (2009) surveyed all the Ulysses data and showed that using T = 1 h and the kinematic correction to evaluate $\Delta F_S$ reduced the overall error in $F_S$, computed using Equation (8), to just ±2.5%. As mentioned above, the PFSS method has also been used to compute open solar flux. In particular, Wang and Sheeley Jr (1995, 2003a) and Wang et al. (2000) used the Ulysses result to compare the value derived from near-Earth in-situ observations with PFSS-derived values and obtained good agreement. (Note that these authors actually compared $\langle |B_r| \rangle$ rather than $F_S$.) However, two caveats to this comparison should be noted. First, the magnetogram data required re-processing using a latitude-dependent instrument saturation factor (Wang and Sheeley Jr, 1995), which has long been the subject of some debate (Ulrich, 1992; Riley et al., 2013). Secondly, these authors used an averaging timescale of T = 1 day on the in-situ data before taking the modulus. This means, effectively, that they used Equation (8) with T = 1 day, for which $\Delta F_S \approx 0$. The middle panel of Figure 19 shows that this works reasonably well within the streamer belt. Lockwood et al. (2006b) showed explicitly the effect of the value of T on this comparison and that T = 1 day is the optimum value on average. However, although it makes $\Delta F_S$ go to zero on average, it is an approximation. Effectively, it is chosen as a compromise that averages out much of the heliospherically-generated component in $\langle |B_r| \rangle$, without averaging out too much of the source sector structure. Figure 20 shows that the difference between the PFSS-derived values of open solar flux by Wang and Sheeley Jr (1995, 2003a) and the mean value of $F(r_1)$ at $r_1 = 1$ AU (for T = 1 h) is consistent with the rise in $F(r)$ values with $r$ found using a wide variety of spacecraft in the heliosphere, as found by Owens et al. (2008a) (also using T = 1 h) and shown in Figure 18. Hence the PFSS data also strongly support the kinematic correction. The degree to which the PFSS data and the near-Earth data are consistent (if the kinematic correction is deployed) is underlined by the variation of derived annual means shown in Figure 21. The agreement is very good and better than using the T = 1 day, $\Delta F_S = 0$ approximation.
Figure 18 (caption fragment): The lines show the distribution of predicted excess flux from the "kinematic correction" calculated for each sample (Lockwood et al., 2009c), which are exceeded a given fraction of the time; the fractions are 0.1 (black line), 0.25, 0.5, 0.75, and 0.9 (light gray line). Image reproduced by permission from Lockwood et al. (2009c), copyright by AGU.
Figure 19: Application of the "kinematic correction" to the third Ulysses perihelion pass shown in Figure 17. All three panels show 27-day running means of various estimates of the absolute radial field (normalized to $r = r_1 = 1$ AU) from measurements by Ulysses (thick black lines) and ACE (thin black lines). Ulysses data from the top panel are shown by the area shaded grey in all three panels to facilitate comparisons. Panel (a) is for the asymptotic limit of the timescale T → 0.

7.5 The use of the modulus of the radial field
Smith (2011, 2013) argues that the use of the modulus of the radial field causes the excess flux and that the use of the kinematic correction is unnecessary. The first point is a somewhat semantic one but does have some validity. However, the second point does not follow from the first because it assumes that there is a problem-free and error-free alternative, which is not true. In fact the word "causes" is somewhat misleading: what taking the modulus (after averaging over an interval T) really does is cancel opposite polarity field within the interval T. The alternative advocated by Smith (2011, 2013) is that the radial field be averaged over the durations of toward and away source field sectors. This is indeed, in principle, absolutely correct; however, it pre-supposes that in implementation one knows where the source sector boundaries lie. We here call this method "the variable-T method", because it is identical to employing the modulus, except that the interval durations T used are not fixed but are varied to cover the source sector durations. It is important to stress here that it is the sector boundaries at the coronal source surface, not at the observing spacecraft, that matter: it has been shown that if the polarity changes at the spacecraft are used to define the sectors then the result is mathematically identical to that obtained by taking the modulus.
Therefore, to implement the variable-T method, decisions are required about where the source sector boundaries are, where they map to on the satellite orbit, and what is a genuine source sector rather than heliospherically-imposed polarity structure. It is not adequate or acceptable to sidestep this issue by saying that source sector boundary crossings can be readily identified. For example, using electrons with energies > 2 keV to determine the connectivity between near-Earth space and the source surface, Kahler et al. (1996) found examples of opposite polarity field within well-defined source sectors and also found that in all cases they showed bidirectional electron streaming, which they associated with CMEs on a wide range of spatial scales. For CMEs within an inferred source sector, the flux would be cancelled out by using the variable-T method, but would thread the source surface, and hence CMEs are one source of potential error. The net heat flux from the electron data shows regions where the heat flux is towards the Sun. These unambiguously reveal "folded flux" (in which source toward/away field is folded so that it points away from/toward the Sun at greater $r$). Folded flux revealed by the suprathermal electron flows is often found in the vicinity of sector boundaries (Kahler et al., 1998; Crooker et al., 2004) along with seemingly plasmoidal structures (Foullon et al., 2011) (which may, in some cases, actually be folded flux that has latitudinal structure). In addition, it has recently been shown that folded flux is also associated with pseudostreamers embedded within IMF polarity sectors. As a result of folded flux, Kahler and Lin (1995) deduced that some sector boundaries did not show local field reversals at the spacecraft and some field reversals were seen away from the true sector boundaries.
Thus defining where the source sector crossing point is from the associated polarity change in the field at the spacecraft is not straightforward and would contribute an unknown error to the derived open solar flux.
Thus, for the derivations of open solar flux from interplanetary craft to be repeatable, a full catalogue of assumed source sector crossings would be needed for the variable-T method, and the error introduced into open solar flux estimates by inaccuracies in that catalogue would not be known. On the other hand, using the modulus with kinematic correction (or the easier-to-implement T = 1 day method) is a repeatable algorithm that does not depend on a catalogue of sector crossings. Another advantage of the modulus approach is that it makes the issue of heliospherically-produced radial field structure explicit and does not hide it in the choice of the intervals to average over in the variable-T method. It has been argued that the variable-T and modulus approaches are actually complementary, each with its own strengths and weaknesses, and that it is neither correct nor helpful to advocate the use of one over the other: rather, it is important to fully understand the strengths, limitations and applicability of both. What is most interesting is that the two approaches generate similar answers in several important respects (for example, both give the Ulysses result of the latitudinal constancy of the radial field). In addition, both methods average along a sector of the spacecraft orbit and so both are subject to errors (that are different in the two cases) associated with latitudinal variations in folded flux tubes.
It is here worth recording that there is another approach discussed in the literature which allows values of open solar flux from PFSS to be matched to the values derived from in-situ spacecraft measurements. Erdős and Balogh (2012) analysed the field in a reference frame aligned to the Parker spiral direction, as computed from the frozen-in flux theorem using the observed solar wind flow speed (see Equation 5). Like the fixed-T (modulus) methods, it has the advantage of being repeatable without implementing a catalogue of source sector boundary crossings. However, it should be noted that disconnected and folded flux is often well-aligned with the Parker spiral (Crooker et al., 2004; Foullon et al., 2011) and that the authors used an averaging interval of T = 6 h to match the open solar flux estimates derived from the satellite to those from the PFSS values, and hence there is an element of fixed-T averaging in this method also.

Regression Techniques
Because the reconstructions all rely on finding relationships between modern geomagnetic activity data and simultaneous measurements of near-Earth space, regression techniques are needed to enable extrapolation to before the space age. The discussion in the literature between Svalgaard and Cliver (2005), Lockwood et al. (2006a), and Svalgaard and Cliver (2006) highlights the many pitfalls in this area. This discussion was on the use of the IDV index. Svalgaard and Cliver (2005) (SC05) used IDV to conclude that the IMF $B$ increased by only 25% between the 1900s and the 1950s and that this was in contrast to the more than doubling of $B$ which they argued was inherent in the results of Lockwood et al. (1999a). Lockwood et al. (2006a) (LEA06) pointed out that some of this difference was due to the fact that Lockwood et al. (1999a) actually reported a doubling in the open solar flux, $F_S$, not $B$ (as shown later by Figure 29, $F_S$ is not proportional to $B$). However, there were several other factors, all of which worked in the same direction and so combined to make the estimate of the drift by SC05 exceptionally low. SC05 employed a simple ordinary linear least squares (OLS) regression which yielded residuals that showed heteroscedasticity, some nonlinearity, and a systematic bias, and which did not have a Gaussian distribution, thereby violating central assumptions of least-squares regression and showing that the derived fit is unreliable. The regression results of SC05 were strongly influenced by outliers, which applied great leverage to their regression fit. More reliable regressions were obtained by LEA06 using least median squares (LMS) regression and, better still, using Bayesian statistics (the BLS procedure employed by Rouillard et al., 2007, hereafter REA07).
In addition, SC05 attempted to fill in the data gaps in the IMF data using a 27-day recurrence technique, despite the relatively low autocorrelation functions of the IMF at 27 day lags, and LEA06 show that this also caused a slight underestimation of the long-term drift (piecewise removal of the data during IMF data gaps is much more reliable). Lastly SC05 under-estimated the long-term change in their own results.
The initial response by Svalgaard and Cliver (2006) did not accept these arguments but, as shown in the following sections, a subsequent reconstruction by Svalgaard and Cliver (2010) is in very good agreement with the Lockwood et al. (1999a) reconstruction. In fact, the change between the Svalgaard and Cliver (2010) and SC05 reconstructions of the IMF was almost exactly what was called for by the residuals analysis of LEA06. This change was caused by the availability of just four additional annual mean datapoints near the long and low minimum between cycles 23 and 24, for which the IMF was low. The fact that a change was needed in response to the addition of just a few more datapoints confirms that the original SC05 fit was not robust.
Because the potential pitfalls in regression techniques can have such a major effect on the reconstructions, it is worth exploring the relative merits of the various linear regression procedures used in this context. Figure 22 stresses how much they can differ, showing the scatter plot and the various regressions between annual means of the index and the IMF. SC05 used OLS but the slope they derived is slightly lower than LEA06's implementation of OLS because of their different treatment of data gaps. OLS gives the lowest slope, whereas BLS gives the largest.
The details of the regression procedures (with appropriate references for the statistical techniques) and discussion of their relative merits and pitfalls are given in the paper by LEA06. The advantage of the LMS procedure is that it is not as influenced by outliers that can change the slope of the fit dramatically if they have a high value of the Cook-D leverage factor. The MAA (Major Axis Analysis) procedure is inappropriate in the context of these reconstructions and the BLS procedure is as employed by Rouillard et al. (2007) (REA07). The tests described below show that BLS performed best.
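The sensitivity of OLS to a single high-leverage outlier, and the relative robustness of a least-median-of-squares criterion, can be illustrated with a minimal sketch. This is not the LEA06 implementation: the data are synthetic (not real index or IMF values), the function names are hypothetical, and the LMS search used here is a simple exhaustive approximation that tries the line through every pair of points.

```python
from itertools import combinations
from statistics import mean, median


def ols_fit(x, y):
    """Ordinary least squares slope and intercept for y = slope*x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def lms_fit(x, y):
    """Approximate least median of squares: try the line through every pair
    of points and keep the one with the smallest median squared residual."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        crit = median((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
        if best is None or crit < best[0]:
            best = (crit, slope, intercept)
    return best[1], best[2]


# Synthetic data: nine points on y = 2x + 1 plus one high-leverage outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0]
y = [2.0 * xi + 1.0 for xi in x[:-1]] + [5.0]

ols_slope, ols_icpt = ols_fit(x, y)
lms_slope, lms_icpt = lms_fit(x, y)
print(f"OLS slope {ols_slope:.3f}, LMS slope {lms_slope:.3f}")
```

In this toy case the single outlier drags the OLS slope far below the slope of the underlying relation, whereas the median-based criterion ignores it.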
Notably, the OLS procedure used by SC05 gives the lowest slope in Figure 22, and so would give the lowest long-term drift in the reconstruction of open solar flux. There are a number of ways of evaluating the quality of a regression fit. One of the most important is to check that the fit residuals are randomly and normally distributed: the fit used by SC05 is analysed in Figures 23, 24, and 25.

Figure 23: Analysis by Svalgaard and Cliver (2006) of the fit residuals in the fit of the IMF against the index by SC05. The fit residuals are plotted as a function of the fitted value, which is the correct test for homoscedasticity. Image reproduced by permission from Svalgaard and Cliver (2006), copyright by AGU.
Fits should be homoscedastic, i.e., the residuals should not show a trend in their spread. In their reply, Svalgaard and Cliver (2006) quite rightly state that this should be tested by plotting fit residuals against the fitted values. Figure 23 shows this residual plot which they claim shows the fit is homoscedastic because the mean of the residuals does not change with the fitted value. However, homoscedasticity requires the spread of fit residuals (not their mean value) does not change with the fitted value, and Figure 23 shows the spread does increase with increasing fitted values. Hence the fit is heteroscedastic rather than homoscedastic. In addition, the plot shows a marked tendency towards an inverted-U form which is characteristic of some nonlinearity.
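The distinction between the mean and the spread of the residuals can be made concrete with a short sketch. The data below are synthetic, with a scatter that deliberately grows with the fitted value, and the helper names are hypothetical. Splitting the residuals into halves by fitted value is only one crude way to look for a trend in spread; it is not the test used in the papers discussed.

```python
from statistics import mean, pstdev


def ols_fit(x, y):
    """Ordinary least squares slope and intercept for y = slope*x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def spread_by_fitted(fitted, residuals):
    """Return (mean, standard deviation) of the residuals for the lower and
    upper halves of the fitted values."""
    order = sorted(range(len(fitted)), key=lambda i: fitted[i])
    half = len(order) // 2
    low = [residuals[i] for i in order[:half]]
    high = [residuals[i] for i in order[half:]]
    return (mean(low), pstdev(low)), (mean(high), pstdev(high))


# Synthetic data: scatter alternates in sign and grows in proportion to x.
x = [float(i) for i in range(1, 41)]
y = [1.0 + 0.5 * xi + 0.1 * xi * (1 if i % 2 else -1) for i, xi in enumerate(x, 1)]

slope, icpt = ols_fit(x, y)
fitted = [slope * xi + icpt for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

(mean_low, sd_low), (mean_high, sd_high) = spread_by_fitted(fitted, residuals)
print(f"low half:  mean {mean_low:+.3f}, spread {sd_low:.3f}")
print(f"high half: mean {mean_high:+.3f}, spread {sd_high:.3f}")
```

Here the mean residual stays close to zero in both halves while the spread more than doubles, which is the signature of heteroscedasticity described above.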

Figure 24 (panel titles): Ordinary least squares (OLS); Bayesian least squares (BLS).

A second test used by LEA06 was quantile-quantile (QQ) plots to test if the residuals were normally distributed, as is assumed by all least squares regressions. The standardized residuals are placed in order by size and plotted against the quantiles for a standard normal distribution. The deviations from the straight line of slope 1 reveal departures from a normal distribution. Figure 24 shows that the OLS fit gave considerably larger deviations from a Gaussian distribution of residuals than did the BLS fit and so the BLS method is giving the more valid least-squares regression.
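The construction of a QQ plot described above can be sketched as follows. The plotting positions (k - 0.5)/n are an assumption (several conventions exist), the input residuals are synthetic, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(residuals):
    """Return (theoretical, sample) quantile pairs for a normal QQ plot."""
    n = len(residuals)
    mu, sd = mean(residuals), pstdev(residuals)
    # Standardize and sort the residuals.
    sample = sorted((r - mu) / sd for r in residuals)
    std_normal = NormalDist()
    # Assumed plotting positions: (k - 0.5) / n for k = 1..n.
    theory = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theory, sample))


def max_qq_deviation(residuals):
    """Largest departure of the QQ points from the line of slope 1."""
    return max(abs(s - t) for t, s in qq_points(residuals))


# Synthetic residuals: one set that is normal by construction, and the same
# set with its largest value replaced by a strong outlier.
n = 30
normal_like = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
with_outlier = normal_like[:-1] + [8.0]

dev_normal = max_qq_deviation(normal_like)
dev_outlier = max_qq_deviation(with_outlier)
print(f"max deviation: normal-like {dev_normal:.3f}, with outlier {dev_outlier:.3f}")
```

Points lying close to the line of slope 1 give a small maximum deviation; a heavy tail or an outlier shows up as a large departure at the ends of the plot.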
Because they found the SC05 fit was heteroscedastic, potentially nonlinear, and failed the QQ test for normally distributed residuals, LEA06 tested for a trend in the residuals by plotting the fit residuals as a function of the observed values. The results are shown in Figure 25 for three regression procedures. The BLS regression meets the requirement that there is no trend (and LEA06 show the LMS regression does as well) but the MAA and OLS fail this bias test. They underestimate the trend in the data because the fitted value is consistently an overestimate of the observed value when the latter is small and consistently an underestimate when it is large. Thus, reconstructions based on this OLS fit will self-evidently underestimate the true range of variation in the IMF.

Figure 25: The fit residuals (observed minus fitted value) are shown as a function of the observed value, where the fitted value is derived from the index. The dashed line is the linear regression fit to the points to highlight trends. The OLS and MAA fit residuals show strong trends and so should not be used because the fitted value is consistently an overestimate of the observed value when the latter is small and consistently an underestimate when it is large. Hence, these fits seriously underestimate the real trend in the data. In contrast, the BLS regression is free of such a trend.
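For OLS specifically, the bias test has a simple algebraic background: because OLS residuals are uncorrelated with the fitted values, the slope of a regression of the residuals on the observed values equals 1 - r^2, where r is the correlation between predictor and observation. The following sketch checks this numerically on synthetic data (the variable names and values are illustrative only, and it says nothing about the other regression procedures).

```python
from statistics import mean


def ols_fit(x, y):
    """Ordinary least squares slope and intercept for y = slope*x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(x, y):
    """Linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Synthetic "index" and "observed" series with a repeating scatter pattern.
noise = [0.8, -0.5, 0.3, -0.9, 0.6, -0.3]
index = [float(i) for i in range(1, 31)]
observed = [4.0 + 0.3 * xi + 3.0 * noise[i % 6] for i, xi in enumerate(index)]

slope, icpt = ols_fit(index, observed)
fitted = [slope * xi + icpt for xi in index]
residuals = [o - f for o, f in zip(observed, fitted)]

# Trend of the residuals (observed minus fitted) against the observed values.
trend, _ = ols_fit(observed, residuals)
r = pearson_r(index, observed)
print(f"residual trend {trend:.4f}, 1 - r^2 = {1 - r * r:.4f}")
```

The positive trend means the fit overestimates small observed values and underestimates large ones, and the effect shrinks only as the correlation approaches unity.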

Living Reviews in Solar Physics
The better the correlation between two parameters, the more similar will be the results of the various regression procedures and, hence, these tests would become increasingly less important. Given that small changes in the fitted slope will make very large differences to the maximum and minimum values seen in a reconstruction, it is very important that these tests are carried out to ensure that the optimum regression procedure has been used and any one regression fit is valid.
The previous sections give the physical principles that allow us to reconstruct annual means of heliospheric parameters from geomagnetic activity. By using a combination of geomagnetic activity indices that have differing sensitivities to the substorm phenomenon (for example, one range index and one based on interdiurnal variability can be used), both the IMF magnitude and the solar wind speed can be derived. From Parker spiral theory the modulus of the radial field can then be deduced and, using the Ulysses result, this means that the open solar flux can be reconstructed too. The first paper to exploit this possibility was Lockwood et al. (1999a), who reconstructed the open solar flux and found considerable variation, with 11-year running means around 1985 more than twice those found around 1900. A wide variety of different procedures have subsequently been used. The most obvious differences between them are the basis geomagnetic data employed. However, there are other differences. In this section we look at the resulting reconstructions of the near-Earth IMF, the solar wind speed, and the open solar flux.

Because the method of Svalgaard and Cliver (2010) (SC10) is relatively straightforward, making use of a single correlation, it has the advantage that it is much easier to study the propagation of uncertainties, and the grey area shows the error in the SC10 reconstruction, as evaluated by Lockwood and Owens (2011) from the regression error estimates by Svalgaard and Cliver (2010). These estimates are made using the standard equations for uncertainty in the slope and intercept of a linear regression, which are approximate (Richter, 1995) and do not allow for experimental uncertainties in the data used. A more rigorous examination of uncertainty is presented in Section 9.4.2. The orange line (LEA99) was not published in the paper by Lockwood et al. (1999a), which focussed on open solar flux. However, Lockwood and Owens (2011) have extended the procedure of Lockwood et al. (1999a) to evaluate the IMF, the results of which are shown here. In this procedure, only one range index was used, and the effect of the solar wind speed on this range index was allowed for using its recurrence index (based on the autocorrelation of the index at 27-day lag). This works because in the streamer belt, annual means of the solar wind speed are enhanced by the intersection with fast solar wind streams emanating from isolated low-latitude coronal holes or from low-latitude extensions to polar coronal holes (Sheeley Jr et al., 1976; Wang and Sheeley Jr, 1990). These fast streams interact with the slow solar wind ahead of them and, because the coronal holes generally persist for several solar rotations, cause co-rotating interaction regions which give recurrent geomagnetic disturbances, so increasing the recurrence index. As a result, recurrence indices (e.g., Sargent, 1986) are correlated with the average solar wind speed.
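The idea of a recurrence measure based on the 27-day autocorrelation can be sketched as below. This is an illustration only: the exact definition of the recurrence index used in the papers cited is not reproduced here, the two daily series are synthetic, and the function name is hypothetical.

```python
import math
import random
from statistics import mean


def autocorrelation(series, lag):
    """Pearson correlation between the series and itself shifted by `lag`."""
    a, b = series[:-lag], series[lag:]
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


# One year of synthetic daily values: a recurrent series with a 27-day
# period, and a non-recurrent series of uncorrelated noise.
days = range(365)
recurrent = [10.0 + 5.0 * math.sin(2.0 * math.pi * d / 27.0) for d in days]
rng = random.Random(0)
irregular = [10.0 + rng.gauss(0.0, 5.0) for _ in days]

rec_recurrent = autocorrelation(recurrent, 27)
rec_irregular = autocorrelation(irregular, 27)
print(f"27-day autocorrelation: recurrent {rec_recurrent:.3f}, "
      f"irregular {rec_irregular:.3f}")
```

A year dominated by long-lived corotating streams would resemble the first series and give a value near unity, whereas activity driven by transient events would give a value near zero.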

Results for the near-Earth IMF
It can be seen that, despite the wide variety of implementations and the variety of different geomagnetic data used, the reconstructions of the IMF agree remarkably well and all agree well with the observed values from the space age. As expected, the differences get somewhat larger as one goes back in time. The year 1901 appears either to contain some error in one or more of the data series that is propagated into some of the reconstructions, or to have exposed a limitation in one of the procedures. The paper by Rouillard et al. (2007) offers some insight into this as they applied both ordinary least-squares regression (OLS) and Bayesian least-squares regression (BLS) techniques to two different pairings of geomagnetic indices. Their Figure 5 reveals that a shallow minimum is found for 1901 using one of these pairings with OLS. The same pairing gives a somewhat deeper minimum if BLS is used, and the variation shown in Figure 26 is that found using the other pairing and, again, is somewhat deeper if BLS rather than OLS is used. The figure shows that using alone gave a lower value but Figure 33 shows the (1 ) index with a non-linear fit and full error analysis (Lockwood et al., 2013b) gives a similar shallower 1901 minimum to that from an OLS fit to (Svalgaard and Cliver, 2010). In itself the larger spread of estimates for this one year (1901) is not significant, other than that it does highlight that uncertainties are larger when the number of available stations is lower.

Figure 26: Reconstructions of the near-Earth IMF discussed in the text. LEA09 (red) is by Lockwood et al. (2009d); REA07 (blue) is by Rouillard et al. (2007) and uses the same pair of indices as LEA09; SC10 (thin black line with uncertainty shown by the surrounding grey area) is by Svalgaard and Cliver (2010). Solid circles show annual means of IMF observations from the OMNI2 database. Note that Lockwood et al. (1999a) did not reconstruct the IMF and the LEA99 variation shown was generated by Lockwood and Owens (2011), who adapted the Lockwood et al. (1999a) procedure to predict it. Image reproduced by permission from Lockwood and Owens (2011), copyright by AGU.

Results for the near-Earth solar wind speed
The SC10 and LEA99 papers do not reconstruct the solar wind speed variation but Svalgaard et al. (2003), Svalgaard and Cliver (2007a) and REA07 did. Furthermore, REA07 used a number of different procedures to check how robust their reconstructions are: their results for the solar wind speed are shown in Figure 27. Specifically, they employed two different regression procedures: ordinary least squares (OLS) and Bayesian least squares (BLS). They also used two different combinations of geomagnetic indices which, as illustrated by Figures 3 and 16, have differing dependencies on the solar wind speed.
The derived variations in annual means of the solar wind speed are very similar in all four cases, as is the Svalgaard and Cliver (2007a) reconstruction (not shown). All show a weak upward trend on average during the past century, but this trend is not as strong as that in the IMF. This agrees with recent inferences from Wang and Sheeley Jr (2012) that the lower IMF in low-activity cycles would be accompanied by lower solar wind number density, but not significantly lower solar wind speed. Peaks in the solar wind speed are seen in the declining phase of the solar cycles. In the space age there has been a marked tendency for these to be larger for even-numbered cycles than odd-numbered ones (Hapgood, 1993) but this is not seen in the reconstruction for before the space age. This agrees with the inference from the 27-day recurrence in the index (see the lower panel of Figure 1 of Lockwood et al., 1999a). As yet we have no explanation for this difference between even- and odd-numbered cycles, nor why it appears to be intermittent on centennial timescales.

Figure 27: Reconstructions of annual means of the near-Earth solar wind speed from geomagnetic activity data. The upper panel is for ordinary least squares (OLS) regression, the lower panel for Bayesian least squares regression (BLS). Plots to the left are the time series, those to the right give the distribution of residuals of the fit to interplanetary in-situ observations. Blue and red lines are derived using one pair of indices whereas green and black use the other. Solid circles show annual means of solar wind speed observations from the OMNI2 database. Image reproduced by permission from Rouillard et al. (2007), copyright by AGU.

Figure 28 shows the open solar flux reconstructions corresponding to the IMF reconstructions shown in Figure 26. Note that Svalgaard and Cliver (2010) did not compute the open solar flux because the main focus of their paper was the IMF, but their reconstruction (and the uncertainty band around it) has here been converted into open solar flux using the polynomial fit in Figure 29 (see below). Again the agreement is generally good, but larger differences do exist than for the IMF reconstructions. In particular, the original reconstruction by Lockwood et al. (1999a) (the orange line in Figure 28) was derived using the index and hourly mean data of the IMF, with no kinematic correction to the hourly radial field magnitudes. It can be seen that this gives larger values than the best open solar flux values from in-situ data, which do employ the kinematic correction (the black dots in Figure 28). The green line in the figure shows the results of applying the LEA99 procedure to the index and using kinematically corrected IMF values. Comparing the green and orange lines, it can be seen that applying these corrections has lowered the open solar flux estimates at all times, but the effect is greatest for the modern data (after 1957). Before 1957 (which is when the move of the northern hemisphere station from Abinger to Hartland generates the major difference between the two indices) the difference between the two is not as great.
As in Figure 26, the red and blue lines are from Lockwood et al. (2009d) (LEA09) and Rouillard et al. (2007) (REA07). It can be seen that the LEA99 procedure, when applied to the same data as used by LEA09 and REA07, generates very similar results, despite being based on a single index and its 27-day recurrence index, whereas LEA09 and REA07 are based on combining two indices. Given that it is based on their IMF reconstruction, it is not surprising that the SC10 reconstruction is, as for the IMF, slightly larger than the others in the earliest years; nevertheless, agreement is remarkably close overall.

Figure 28: Reconstructions of the open solar flux. See text for details. The variations are as they appear in the publications except that attributed to SC10, which is their variation in the IMF, converted to open solar flux using the polynomial fit shown in Figure 29. The green line is derived from the LEA99 procedure, applied to the index and to interplanetary data with the kinematic correction applied to the hourly radial field magnitudes. The solid circles are the values derived from interplanetary observations using the kinematic correction, described by Lockwood et al. (2009c).

Figure 29 shows the variation of open solar flux with the near-Earth IMF used to convert the SC10 data. This is a scatter plot of the data from the LEA09 reconstruction for 1905-2009 (black dots) and from the in-situ spacecraft data (open triangles). The black line is a polynomial fit (given by equation 8 of Lockwood and Owens, 2011), constrained to pass through the origin because if the open solar flux ever fell to zero, the near-Earth IMF would necessarily also fall to zero. This fit varies considerably from the best-fit linear regression, shown by the dot-dash line. The form of the polynomial fit is readily understood in terms of the competition between two effects (Lockwood et al., 2009d). The first is what would be seen for uniform solar wind flow (over a 27-day period), as predicted by Parker spiral theory.
Sections 9.1 and 9.2 show that as the average IMF rose over the past 150 years, the average solar wind speed also rose slightly. This causes the spiral field to unwind such that the ratio of the radial field magnitude to the IMF rises and, hence, the ratio of open solar flux to IMF also rises as the IMF rises. This is consistent with the sense of the curvature in non-linear behavior seen in the data and the polynomial fit at IMF values below about 6 nT. However, above 6 nT, the ratio falls slightly as the IMF continues to increase. This is consistent with an increased kinematic effect due to increased longitudinal structure in the solar wind at higher solar activity, which will increase the IMF at 1 AU for a given open solar flux. There is a point that has caused some confusion and needs clarifying here. The non-linearity of open solar flux and near-Earth IMF means that the radial component of the near-Earth IMF is not linearly related to the IMF magnitude. The original reconstruction of open solar flux by Lockwood et al. (1999a) used not only a linear relationship but proportionality between the two (by assuming that on annual mean timescales the gardenhose angle was constant) and so approximated the data in Figure 29 with a proportional relationship. However, this approximation was used to derive an analytic form that was then fitted to the data, and Lockwood and Owens (2011) have shown that, although this influences the fit coefficients, it does not greatly alter the derived open solar flux. In other words, proportionality was a reasonable approximation to make in this context. However, Figure 29 shows that proportionality is an approximation and cannot be relied upon in general.

Figure 29: Scatter plot of the LEA09 reconstruction and of the in-situ spacecraft data (the latter derived using Equation (8) with T = 1 h and the kinematic correction). The black line is a polynomial fit to both datasets, constrained to pass through the origin, with an uncertainty given by the grey area. The dot-dash line is a linear regression fit. The vertical solid line labelled SEA MM shows the value of the IMF during the Maunder minimum derived using cosmogenic isotopes by Steinhilber et al. (2010) and the vertical dashed lines bound the uncertainty in that estimate. Image reproduced by permission from Lockwood and Owens (2011), copyright by AGU.
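The first of the two effects, the unwinding of the Parker spiral with increasing solar wind speed, can be illustrated with a minimal sketch for an ideal spiral in the solar equatorial plane. The rotation rate, the function names and the example speeds are assumptions made for illustration. The conversion to open solar flux assumes that the radial field magnitude is independent of latitude and neglects the kinematic correction discussed in the text.

```python
import math

OMEGA_SUN = 2.865e-6  # rad/s, assumed sidereal equatorial rotation rate
AU_KM = 1.496e8       # km, heliocentric distance of the Earth


def radial_fraction(v_sw_kms, r_km=AU_KM, omega=OMEGA_SUN):
    """Ratio of radial field magnitude to total field for an ideal Parker
    spiral in the solar equatorial plane."""
    tan_psi = omega * r_km / v_sw_kms  # tangent of the gardenhose angle
    return 1.0 / math.sqrt(1.0 + tan_psi ** 2)


def open_flux_wb(b_r_nt, r_km=AU_KM):
    """Signed open flux 2*pi*r^2*|B_r| in Wb, assuming |B_r| is independent
    of latitude and applying no kinematic correction."""
    r_m = r_km * 1.0e3
    return 2.0 * math.pi * r_m ** 2 * abs(b_r_nt) * 1.0e-9


slow = radial_fraction(350.0)
typical = radial_fraction(400.0)
fast = radial_fraction(500.0)
flux = open_flux_wb(3.0)
print(f"|Br|/B: {slow:.3f} (350 km/s), {typical:.3f} (400 km/s), {fast:.3f} (500 km/s)")
print(f"open flux for |Br| = 3 nT: {flux:.2e} Wb")
```

The radial fraction rises with solar wind speed, which is the sense of the curvature attributed above to the unwinding of the spiral.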

Results for the open solar flux
Thus, there is considerable agreement between the various reconstructions of both the open solar flux and the near-Earth interplanetary field. The main difference is that the SC10 reconstruction gives slightly but persistently higher values in the early years, but we should expect agreement to be less good at these times as the number of stations available, and the long-term stability of their instrumentation, is necessarily lower for the early data. SC10 extend their sequence back to 1835, just three years after the establishment of the first geomagnetic observatory: discussion of the validity of this extension is presented in Section 9.4.2. The LEA99 reconstruction extends back to 1868 because they used only the range index, and this is the date at which Mayaud began his analysis of range at two antipodal stations. This sequence has been extended back to 1844 using data from a single station (Helsinki) by Nevanlinna and Kataja (1993) and Nevanlinna (2004); Lockwood (2003) used this to extend the open solar flux back to this date. LEA09 are more conservative in that they used an index which only uses datasets and composites that extend into the era of space measurements, and they argued that the hourly mean (or hourly spot value) data that meet this criterion are too few and of insufficient accuracy before 1902 (for example giving the uncertainty in 1901). There is considerable (if not complete) agreement after 1901 and so this gives more than 100 years of reconstruction that can be used to train and evaluate models of the long-term variation in the IMF and open solar flux, and these are discussed in Section 11. These models are based on the longest series of as-it-happened observations available to us, which is of sunspot number (see the Living Review by Hathaway, 2010). The data sequence was initially compiled by Hoyt and Schatten (1998).
A number of possible adjustments to this sequence have been proposed recently, based on newly discovered historic observations. Reviewing these, after a consensus view has been reached, will be an important update to this and other living reviews. However, one adjustment, by Vaquero et al. (2011), is already included in Figure 30 as this makes the decline into Maunder minimum conditions more consistent with cosmogenic isotope data. The upper dashed line in the top panel shows the "floor" in annual means of the near-Earth IMF of 4.6 nT postulated by Svalgaard and Cliver (2007b) (SC07). In this context, the author believes it is important to make a clear distinction between a genuine floor value (set by mechanisms which prevent the value of a given parameter from going any lower) and the minimum value detected since a certain date. The point is that, without firm and quantitative understanding of the postulated mechanisms, one can never be sure that lower values have not been seen only because the required conditions have not prevailed within the period for which one has data. Whilst it is almost certainly true that there will always be some flux emergence (which means that there would always be some open flux and a non-zero near-Earth IMF) there is, as yet, no known physical reason that would allow one to quantify a minimum floor value. The estimate of 4.6 nT by Svalgaard and Cliver (2007a) was based on the fact that their reconstruction (SC07) did not go below this value. In fact, in the recent low solar minimum, annual values of the IMF fell to 3.9 nT in 2009 and so Svalgaard and Cliver (2010) (SC10) revised their floor value down to 4.0 nT (which is the lowest value for calendar years), which is also shown in Figure 30.
Subsequently, Cliver and Ling (2011) have generated a new estimate of a floor IMF value of about 2.8 nT in annual means, based on more sophisticated empirical arguments, but the physical origin of any such quantified limit remains unknown. The middle panel of Figure 30 shows the open solar flux reconstructions and the floor values have been mapped from the upper panel using the polynomial fit shown in Figure 29. One point to note is that a linear fit to the data shown in Figure 29 does not set a floor value at the intercept. The reason is that this intercept is at zero open solar flux and an IMF of about 2 nT. No source for the near-Earth IMF, other than the coronal source flux, has ever been suggested and, hence, if the open solar flux is zero then the near-Earth IMF must be zero also. Hence, if a linear fit to Figure 29 is argued to be evidence for a floor value, then an explanation is needed of where the 2 nT or so comes from, as on that argument it cannot be from the Sun. Much more realistic is that it does come from the Sun and that the relationship between open solar flux and near-Earth IMF is not linear. In Section 10, the non-linear fit in Figure 29 is used to estimate the open solar flux during the Maunder minimum from cosmogenic isotope data.
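The way in which a straight-line fit to a curved relation manufactures a spurious intercept can be seen in a short sketch. The numbers are synthetic and chosen only for illustration, and a quadratic through the origin is assumed here as the simplest stand-in for the polynomial of Lockwood and Owens (2011), whose actual form is not reproduced. The horizontal axis plays the role of the IMF and the vertical axis that of the open solar flux, in arbitrary units.

```python
from statistics import mean


def ols_fit(x, y):
    """Ordinary least squares slope and intercept for y = slope*x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_through_origin(x, y):
    """Least-squares fit of y = a*x + b*x**2 (no constant term), solved from
    the 2x2 normal equations by Cramer's rule."""
    s2 = sum(xi ** 2 for xi in x)
    s3 = sum(xi ** 3 for xi in x)
    s4 = sum(xi ** 4 for xi in x)
    t1 = sum(xi * yi for xi, yi in zip(x, y))
    t2 = sum(xi ** 2 * yi for xi, yi in zip(x, y))
    det = s2 * s4 - s3 ** 2
    a = (t1 * s4 - t2 * s3) / det
    b = (s2 * t2 - s3 * t1) / det
    return a, b


# Synthetic curved relation that passes through the origin, sampled only
# over a limited range of x (as the observations are).
x = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y = [0.3 * xi + 0.05 * xi ** 2 for xi in x]

slope, icpt = ols_fit(x, y)
x_at_zero_y = -icpt / slope          # where the straight line crosses y = 0
a, b = fit_through_origin(x, y)
print(f"linear fit reaches y = 0 at x = {x_at_zero_y:.2f}")
print(f"origin-constrained fit: y = {a:.3f} x + {b:.3f} x^2")
```

The straight line reaches zero on the vertical axis at a finite positive value of the horizontal variable even though the underlying relation passes through the origin, so such an intercept need not indicate a floor.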
All the reconstructions in Figure 30 show general variations with sunspot number, not just over the solar cycle but on centennial scales as well. The key difference between the sunspot variation and those derived for the IMF and the open solar flux is that sunspot activity indices return to a value close to zero every minimum (not exactly zero: there is a small long-term drift in the minimum values that mirrors those in the maxima and in the 11-year running means). In contrast, both the IMF and the open solar flux show variability in the cycle minimum values which almost matches that in the solar maximum values. The realisation that the Sun does not return to the same base-level state at each solar cycle minimum, even though it is (almost) clear of spots then, is an important change in our understanding of long-term solar variability. In using the two reconstructions (of open solar flux and of IMF), two points should be remembered. (1) The open solar flux has the advantage of being a global value that applies to the whole heliosphere whereas the IMF is a local value that applies only near the Earth (so, for example, it varies as the solar wind speed increases/decreases, making the Parker spiral unwind/tighten, respectively). (2) On the other hand, mapping from the near-Earth measurements back to the coronal source surface causes, as discussed in Section 7, its own complications and uncertainties. Hence, the IMF has the advantage of being much more straightforward observationally.
As noted above, the major differences between the reconstructions are before 1880, when SC10 is slightly, but consistently, higher than the other reconstructions, thereby giving less long-term trend and a higher floor value (at least over the period since 1835). Note, however, that the other reconstructions do still (just) agree within the computed uncertainty in SC10. Considerable effort is being expended deploying more datasets to try to resolve this discrepancy. However, I urge some caution here. Some of the early data are of higher quality and better long-term stability than others, and so great care must be taken to ensure bad data are not used to corrupt good data. The author's personal view is that it may well be better to look at the present and the future to evaluate the reconstructions. As predicted as early as 2005 from the observed polar fields, cycle 24 is proving to be a very weak cycle (e.g., Lockwood et al., 2012) and it is instructive to look at the latest 12-month means available at the time of writing (31 March 2013). These are shown by the black dots in the three panels of Figure 30. The IMF and open solar flux values are taken from the IMF observations by the ACE spacecraft, and the sunspot value is taken from the daily means of the International Sunspot Number (compiled by SIDC, Belgium), linearly regressed against the plotted sunspot sequence for the years when both are available. All the current indications are that this value is close to the maximum value, which means that the current cycle (number 24) is similar in magnitude to cycle number 14 (which peaked around 1908). Hence, it is illuminating to compare the current observed values of the IMF and open solar flux with the reconstructions for close to the peak of cycle 14. The best agreement is with LEA09 (in red) and SC10 is already significantly higher at this time, a trend that continues as one goes back in time.
Thus, the recent long and low minimum between solar cycles 23 and 24 (Russell et al., 2010; Lockwood, 2010) and the weakness of cycle 24 thus far are likely to discriminate between the reconstructions much more effectively than the implementation of many corrections of the pre-1900 data. The evolution of cycle 24 will be monitored and updated in Section 12. The sunspot numbers seen already in cycle 24 are still considerably larger than were seen during the Dalton minimum (marked DM in the bottom panel of Figure 30) and, of course, the Maunder minimum (MM). It therefore seems highly unlikely indeed that the IMF and open solar flux did not dip under any minimum values in data recorded after 1835.

Analysis of uncertainty
The homogeneous construction of the index composite by Lockwood et al. (2013a) allows a detailed analysis of uncertainties in the reconstructions that are based on it. In evaluating these uncertainties we need to allow for errors in both the interplanetary data and in the geomagnetic index, their effect on the regression fits and the subsequent effect on the reconstructions. Lockwood et al. (2013b) have carried out a comprehensive evaluation of errors in the reconstruction of the IMF from this composite. The largest error in the interplanetary data is associated with the fact that the geomagnetic index responds to a coupling function involving $\sin^4(\theta/2)$, where $\theta$ is the IMF clock angle (or some equivalent coupling function that quantifies the southward IMF component in GSM), but we are attempting to reconstruct the IMF magnitude. As discussed in Section 5, the average of the ratio of the two tends to a constant on annual time scales, but part (c) of Figure 10 demonstrates that there is an error associated with employing this average that is of order 10%. There is also a much smaller measurement error, which has been estimated from comparisons of measurements of the IMF from different spacecraft to be of order 0.2 nT. Figure 31 presents an analysis of the errors in the composite. Because the multi-station index is compiled from over 50 stations in modern times, it is reasonable to assume that most of the differences between the composite and the appropriately scaled multi-station index are due to errors in the composite; hence the distribution of the residuals of the fit of one onto the other gives us an uncertainty estimate for the composite. This distribution for the space age is shown in Figure 31 and has a standard deviation of 0.459 nT. Lockwood et al. (2013b) use a Monte-Carlo method to carry out a non-linear regression fit between the composite and the IMF and evaluate the uncertainties. The points shown in Figure 32 are annual means (with piecewise removal of data during datagaps and, hence, the parameters are denoted with a prime) which are fitted with a polynomial of form given in Equation (9).
In each fit, the values of the fit coefficients that yield the minimum r.m.s. difference between the observed and predicted IMF values are determined using the Nelder-Mead search method (Nelder and Mead, 1965). This fit was carried out 100 000 times, each time with each point perturbed individually by randomly selected errors in both the IMF and the index values, such that the errors in the IMF follow the normal distribution shown in part (c) of Figure 10 and the errors in the index follow the normal distribution shown in Figure 31. An additional error, drawn at random from a normal distribution of standard deviation 0.2 nT, is added to the IMF to allow for IMF measurement uncertainties. For the full range of potential index values, the median, 95-percentile, and 5-percentile were evaluated from the 100 000 fits and taken to be the best fit (the blue line in Figure 32) and the $2\sigma$ uncertainty limits (which bound the grey area in Figure 32), respectively. The correlation between the observed IMF and the best-fit IMF from the index is 0.947. The maximum possible correlation is set by the correlation between the IMF and the $\sin^4(\theta/2)$ coupling function, which is 0.957. Hence, of the unexplained variation of $100(1 - 0.947^2) = 10.3\%$, $100(1 - 0.957^2) = 8.4\%$ is caused by the variation in the IMF orientation factor.
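The logic of the Monte-Carlo procedure (perturb every point in both variables, refit, and take percentiles of the ensemble of fits) can be sketched as follows. Several simplifications are assumed and should not be read as the method of Lockwood et al. (2013b): a straight-line OLS fit stands in for the polynomial fitted by Nelder-Mead search, only 2000 trials are run, the data are synthetic, and the function names are hypothetical. The error magnitudes (0.459 nT in the index, about 10% and 0.2 nT in the IMF) follow the values quoted in the text.

```python
import random
from statistics import mean, median, quantiles


def ols_fit(x, y):
    """Ordinary least squares slope and intercept for y = slope*x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def monte_carlo_band(x, y, grid, n_trials=2000, sigma_x=0.459,
                     frac_y=0.10, sigma_meas=0.2, seed=1):
    """Return (5th percentile, median, 95th percentile) of the fitted value
    at each grid point, from fits to randomly perturbed copies of the data."""
    rng = random.Random(seed)
    predictions = [[] for _ in grid]
    for _ in range(n_trials):
        # Perturb the index values and the IMF values independently.
        xp = [xi + rng.gauss(0.0, sigma_x) for xi in x]
        yp = [yi * (1.0 + rng.gauss(0.0, frac_y)) + rng.gauss(0.0, sigma_meas)
              for yi in y]
        slope, icpt = ols_fit(xp, yp)
        for k, g in enumerate(grid):
            predictions[k].append(slope * g + icpt)
    band = []
    for p in predictions:
        cuts = quantiles(p, n=20)  # 19 cut points: 5%, 10%, ..., 95%
        band.append((cuts[0], median(p), cuts[-1]))
    return band


# Synthetic "annual means": 30 points on a straight line.
index = [5.0 + 0.3 * i for i in range(30)]
imf = [1.0 + 0.5 * xi for xi in index]
grid = [2.0, 6.0, 10.0, 14.0]

band = monte_carlo_band(index, imf, grid)
for g, (lo, mid, hi) in zip(grid, band):
    print(f"index {g:5.1f}: {lo:.2f} < {mid:.2f} < {hi:.2f}")
```

As in the text, the band is widest where the grid extends beyond the range of the data, because nothing constrains the fit there.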
The uncertainty band is wide at low values of the index as there are no data to constrain the fit there. The procedure does produce quasi-linear fits (for which the fitted exponent is close to unity), but these are rare in the ensemble and so are close to, or beyond, the $2\sigma$ level. These linear fits produce a non-zero intercept in the IMF when the index falls to zero. This would mean that geomagnetic activity falls to zero when the IMF falls below about 3 nT (in annual means) and there is no known reason why this would occur. In contrast, the best non-linear fits give an intercept in the index if the IMF fell to zero: this does make sense as it means that there is a base level of geomagnetic activity driven by solar wind buffeting and phenomena such as Kelvin-Helmholtz waves on the boundary, on top of which reconnection-driven effects are added.
Because it delineates the $2\sigma$ points, 90% of the observed data points should lie within the grey band in Figure 32 if the error estimates are correct (with 5% above the band and 5% below it). In fact, this is true for only 22 out of the 30 data points (73%). However, there is an additional factor which has not yet been allowed for, which is a factor in the fit to the space-age data but which would not be a factor in reconstructing the IMF from the geomagnetic index. Although data gaps have been allowed for by piecewise removal of data, they still have an effect because there are (semi)annual and UT variations in the geomagnetic activity response to a given set of interplanetary conditions due to the effects of Earth's dipole tilt. If we have full data coverage, these variations are not a factor as they are averaged out in annual means. However, datagaps mean they will have an effect, depending on the UT and time-of-year at which those datagaps occur. To simulate this, the ratio of the annual mean IMF computed with gaps to that computed without gaps was evaluated for the continuous interplanetary data after 1995, but with data gaps synthetically introduced at random in such a way as to reproduce the observed distribution of gap durations in the OMNI2 dataset. Repeating this many times over allows statistical evaluation of the uncertainty in the annual mean IMF caused by gaps in the IMF data, as a function of the total data coverage. Using the observed coverage, uncertainties can be assigned to annual means and these are shown by the error bars in Figure 32.

Figure 32: The blue line is the median of 100 000 best polynomial fits and the grey area defines the $2\sigma$ uncertainty band, derived using a Monte-Carlo technique, allowing for the distributions of uncertainties introduced by the IMF orientation factor $\sin^4(\theta/2)$ and the experimental uncertainties in both the IMF and the index. The linear correlation coefficient of fitted and observed IMF is 0.947. The error bars on datapoints allow for the effect of the datagaps.
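The synthetic-gap experiment can be sketched in a few lines. This is a simplified illustration: the daily series is artificial (a semiannual plus a 27-day variation), the gaps have a fixed length rather than reproducing the OMNI2 distribution of gap durations, and the function names are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def gappy_mean(series, coverage, gap_len, rng):
    """Mean of the series after removing randomly placed gaps until only the
    requested fraction of the data remains."""
    n = len(series)
    keep = [True] * n
    target_missing = int(round((1.0 - coverage) * n))
    missing = 0
    while missing < target_missing:
        start = rng.randrange(n)
        for d in range(start, min(start + gap_len, n)):
            if keep[d] and missing < target_missing:
                keep[d] = False
                missing += 1
    return mean(v for v, k in zip(series, keep) if k)


def ratio_spread(series, coverage, n_trials=300, gap_len=6, seed=0):
    """Standard deviation of the ratio (gappy annual mean / full annual mean)
    over many random realisations of the gaps."""
    rng = random.Random(seed)
    full = mean(series)
    ratios = [gappy_mean(series, coverage, gap_len, rng) / full
              for _ in range(n_trials)]
    return pstdev(ratios)


# Artificial year of daily values with semiannual and 27-day variations.
series = [6.0 + 2.0 * math.sin(4.0 * math.pi * d / 365.0)
          + 1.0 * math.sin(2.0 * math.pi * d / 27.0) for d in range(365)]

spread_high_coverage = ratio_spread(series, coverage=0.95)
spread_low_coverage = ratio_spread(series, coverage=0.50)
print(f"spread of ratio: {spread_high_coverage:.4f} at 95% coverage, "
      f"{spread_low_coverage:.4f} at 50% coverage")
```

The spread of the ratio grows as coverage falls, which is why the error bars assigned to annual means depend on the data coverage in each year.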
Allowing for these error bars, 27 of the 30 data points (90%) are consistent with the grey band, which meets the 2σ design criterion. Figure 33 shows the reconstruction and its uncertainty from this fit. The tacit assumption is that the relationship between the IDV(1d) index and B found in the space age (as shown in Figure 32) applies at all other times. This is where the homogeneous construction of IDV(1d) is so important, as it gives the greatest possible confidence that this is true. The best-fit reconstruction of B using the polynomial fit is the black line. The grey area surrounding this black line is the associated uncertainty band, derived using the grey band in Figure 32. In addition, the uncertainties introduced into IDV(1d) by the intercalibration of the stations are allowed for. This is achieved by applying the upper 2σ fit to the upper limit of IDV(1d) and the lower 2σ fit to the lower limit of IDV(1d). As discussed above, these confidence limits are defined to be at the 95% level. The red line shows the best reconstruction using a linear fit: it is very similar to the results of the polynomial fit for the observed range of IDV(1d). The green line shows the reconstruction of Svalgaard and Cliver (2010), including the early extension using Bartels' index. The blue dots show the annual means of the IMF data. It can be seen that agreement between the two reconstructions is exceptionally good between 1880 and the present day (including 1901).
Figure 33: Reconstruction of the IMF from IDV(1d) (Lockwood et al., 2013a) using the polynomial fit (black line). The grey area surrounding this black line is the uncertainty band associated with using this polynomial fit, derived using a Monte-Carlo technique (see Figure 32 and text for details), and also includes the uncertainty caused by the inter-calibration of the IDV(1d) stations. The red line shows the best reconstruction using a linear fit. The green line shows the reconstruction of Svalgaard and Cliver (2010). Blue dots show the annual means of the observed IMF (from Lockwood et al., 2013b).
This is despite the fact that different geomagnetic indices and different fit procedures were used by Lockwood et al. (2013b) and Svalgaard and Cliver (2010). Therefore, there is a real and strong consensus about the IMF reconstruction after 1880. However, before 1880 there are some differences. Before 1872, the Svalgaard and Cliver (2010) reconstruction uses Bartels' index, about which Bartels himself expressed some reservations. On the other hand, the Lockwood et al. (2013b) reconstruction is based on data from the Helsinki observatory (Nevanlinna, 2004), which have passed a number of self-consistency checks (Lockwood et al., 2013a) and are very well correlated with corresponding data from Russian observatories operating at the same time (Nevanlinna and Häkkinen, 2010).
Living Reviews in Solar Physics http://www.livingreviews.org/lrsp-2013-4
10 Comparison with Cosmogenic Isotopes
Galactic cosmic rays hitting Earth's atmosphere generate radionuclides by spallation (Beer et al., 2012). Some of these cosmogenic isotopes are stored in terrestrial reservoirs (notably 10Be in ice sheets and 14C in tree trunks), into which dateable cores can be drilled. Because the flux of cosmic rays is modulated by the heliospheric field (Parker, 1965; Potgieter, 1998, 2013), the abundances of these isotopes give unique information on the long-term variability of the Sun (O'Brien, 1979; Stuiver and Quay, 1980; O'Brien et al., 1991; Beer, 2000; Muscheler et al., 2007; McCracken and Beer, 2007; McCracken, 2007; Solanki et al., 2004) once the effects of the secular variation in the geomagnetic field (which also shields Earth's atmosphere from cosmic rays) have been accounted for (Bhattacharyya and Mitra, 1997; Masarik and Beer, 1999). Comprehensive reviews of the methods and the underpinning science are presented in Usoskin (2013) and Beer et al. (2012). It is useful to employ both 10Be and 14C because the deposition into their respective reservoirs is completely different, and checking for close agreement between the inferred production rates can eliminate the possibility of signals in the record caused by changes in Earth's climate (Bard et al., 1997). Gleeson and Axford (1968) showed, with some approximations, that cosmic rays behave as if they were modulated by an electric field that shields them away from the inner heliosphere. This led to the concept of the solar modulation potential, which is now thought of as a parameter (in units of MV) that describes the heliospheric modification of the local interstellar spectrum (LIS) of galactic cosmic rays at the Earth (Caballero-Lopez and Moraal, 2004; Usoskin et al., 2005).
Note that the modulation potential increases with increased levels of solar activity, such that the fluxes of cosmic rays at Earth fall. An excellent review of the long-term variability of the Sun and heliosphere, as derived from cosmogenic isotopes, is given in the Living Review by Usoskin (2013) and so that material will not be repeated here. However, cosmogenic isotopes do provide an independent way of testing (and extending back in time) the reconstructions presented here, and so a brief comparison is worthwhile. Lockwood (2001) noted that the open solar flux reconstruction of Lockwood et al. (1999a) overlapped with cosmogenic isotope records in a way that modern data on cosmic rays from neutron monitors (Simpson, 2000) did not. A good anticorrelation was found by Lockwood (2001, 2003) on both solar cycle and centennial timescales, with the upward drift in open solar flux reflected in the downward drift in cosmogenic isotope abundances in terrestrial reservoirs, and also in the drift in results from early ionisation chambers (Forbush, 1958; Neher et al., 1953; McCracken and McDonald, 2001). This trend can also be detected in the cosmogenic 44Ti isotope found in meteorites (Bonino et al., 1995; Taricco et al., 2006; Usoskin et al., 2006), which is significant as it finally removes any possibility that the trend is associated with a climate change influence on deposition into terrestrial reservoirs. Usoskin et al. (2006) use the 44Ti isotope data to give strong support to models of the evolution of heliospheric fields based on sunspot number (first introduced by Solanki et al., 2000), as discussed in Section 11. This work showed that the well-known Hale cycle variation in cosmic ray fluxes detected using neutron monitors (with alternately peaked and then plateau-like maxima at sunspot minimum) was also well matched by the inverse of the open solar flux variation (see, in particular, Figure 2 of Rouillard and Lockwood, 2004). The anticorrelation with near-Earth IMF had been noted by Cane et al.
(1999) and Belov (2000). Furthermore, Thomas et al. (2013) have shown that this feature is also present in the reconstructions from geomagnetic activity. This raises an interesting question, which remains largely unresolved, as to the relative influences of cosmic ray drifts in the heliosphere and of the open solar flux on the modulation of cosmic rays arriving at Earth, both on decadal and centennial time scales. That the open solar flux is a factor is not a surprise, as cosmic rays are scattered off irregularities in the heliospheric field and those irregularities are known to scale in amplitude with the average field value; as shown for near-Earth space by Figure 29, that field scales with the open solar flux. The drift theory is very well established (e.g., Jokipii et al., 1977; Jokipii, 1991; McDonald et al., 1993) and has had some notable successes; for example, the antiphase Hale cycle seen in electrons (Evenson, 1998) and positrons (Clem and Evenson, 2002) and their latitudinal variations (Heber et al., 1999). If it is assumed that these drift effects contribute to the Hale cycle but not the secular drift, their effect can be averaged out by taking means over the Hale cycle. Using ice core records of the abundance of the 10Be cosmogenic isotope and a simple theory of cosmic ray shielding, Steinhilber et al. (2010) (SEA10) have reconstructed 25-year means of the IMF over the last 9300 years. The results since the Maunder minimum are shown in Figure 34 and compared with 11-year running means of B from the reconstructions discussed in Section 9.1.
Figure 34: Reconstructions of the IMF since the Maunder minimum. Those also presented in Figure 26 are shown using the same colour scheme as in that figure. Also shown are the SC07 and SC10 floor estimates for annual means, and the Dalton and Maunder sunspot minima are labelled DM and MM. Image reproduced by permission from Lockwood and Owens (2011), copyright by AGU.
The general agreement between the geomagnetic and cosmogenic isotope reconstructions is extremely good, although there are obvious differences and there may be some timing errors which may turn out to be attributable to dating problems with the ice cores. The agreement is very good after 1900 but less good before then. Between 1850 and 1875 the SC10 reconstruction agrees well with the average level of the SEA10 reconstruction, although showing oscillations that are not found in the SEA10 data. However, SC10 yields higher values of B in the intervals 1875-1905 and 1835-1850. The 25-year means of the SEA10 reconstruction of B remain above the SC10 postulated floor level for annual means, even in the Dalton minimum (DM). However, this is not true of the Maunder minimum (MM), where they fell well below it. Even in 25-year means the SEA10 estimate fell to 1.80 ± 0.59 nT by the end of the Maunder minimum, which is still lower than the downward revision of the floor estimate to 2.8 nT by Cliver and Ling (2011). Extending the sequence over 9300 years, SEA10 find 14 grand solar minima in which the reconstructed B fell to even lower values in 25-year means. The SEA10 value of B (and its uncertainty) at the end of the Maunder minimum is marked by the white dot in Figure 29, and using the polynomial fit shown there it corresponds to an open solar flux of (0.48 ± 0.29) × 10^14 Wb.
The open flux continuity model discussed in Section 11, which was derived to explain and fit the open solar flux reconstructions from geomagnetic activity data, has been used to estimate the variation of sunspot numbers from the cosmogenic isotope data for the last millennium (Usoskin et al., 2003; Solanki et al., 2004). These studies found that the recent grand maximum contained unusually high sunspot numbers in the past 11 000 years, a conclusion that generated some debate (Raisbeck and Yiou, 2004; Usoskin et al., 2004; Muscheler et al., 2005; Solanki et al., 2005). Using the composite of cosmogenic isotope data compiled by Steinhilber et al. (2008), Abreu et al.
(2008) found that the recent grand solar maximum may not have been the largest in the sequence, but it was the longest in duration.

11 Models of Open Solar Flux Variation
The extension of the coronal field into the heliosphere and, hence, the modelling of the open solar flux variation, has been recently covered in another Living Review by Owens and Forsyth (2013). Therefore, as in the last section, only a brief review will be given here, to stress the extent to which the reconstructions are both feeding into the models and providing tests of them.
A number of theoretical concepts for the evolution of the heliospheric magnetic field have been proposed. Fisk (1999) argued that the Sun's open flux tends to be conserved, with "interchange reconnection" (see Crooker et al., 2002) between open and closed solar fields resulting in an effective diffusion of open flux across the solar surface without, necessarily, any net change in the total open flux. In this case, the heliospheric field evolves with simple rotation of regions of positive and negative polarity separated by a single, large-scale heliospheric current sheet (Fisk and Schwadron, 2001; Jones et al., 2003). That this "Fisk circulation" can conserve the open flux does not mean there are not other processes that act simultaneously to cause it to grow and decay. It has been argued that emerging midlatitude bipoles cause closed coronal loops to rise and first destroy pre-existing open flux in the polar coronal hole (a remnant from the previous solar cycle) and then build up a new polar coronal hole (of the opposite polarity), and so reverse the polar field of the Sun (Babcock, 1961; Wang and Sheeley Jr, 2003b), which fits well with the migration of photospheric fields seen in magnetograph data (see the Living Review by Sheeley Jr, 2005). The evolution of the heliospheric magnetic field could also be facilitated by transient events (Low, 2001): specifically, Crooker (2006, 2007) and  investigated the role of the magnetic flux contained in coronal mass ejections (CMEs) in the observed variation in flux seen by craft in the heliosphere. These different concepts are not mutually exclusive in many respects (see review by . One complicating factor in this debate has been semantics: "open flux" in this review, and in many previous papers, is taken to be the same as coronal source flux; that is, the magnetic flux that leaves the solar atmosphere and enters the heliosphere by threading the coronal source surface at r = 2.5 R⊙.
As discussed in this review, it is a measurable quantity because of PFSS modelling (within the assumptions of that technique) and because the Ulysses result allows the use of in situ magnetic field data. This is quite different from another definition of open flux, which requires that it has only one footpoint still (effectively) attached to the Sun (e.g., Schwadron et al., 2008). Flux which appears to be in this category can sometimes be inferred from in situ point measurements, for example, from heat flux or unidirectional strahl electron distribution functions (although scattering by heliospheric structure into other populations such as halo often makes this far from unambiguous) (Larson et al., 1997; Fitzenreiter et al., 1998; Owens et al., 2008b). Even if this could be done reliably, there is no way to quantify the total of such flux at any one time from such in situ point measurements. This is because there is no equivalent of the Ulysses result to generalize in situ point measurements into a global quantity. Lockwood et al. (2009c) have reviewed how various phenomena (coronal mass ejections, interchange reconnection, disconnection reconnection) influence both of these definitions of open flux.
The long-term change in the open flux (meaning coronal source flux, F_S) deduced from geomagnetic activity has been reproduced by a number of numerical models of flux continuity and transport during the solar magnetic cycle, given the variation in photospheric emergence rate indicated by sunspot numbers (Solanki et al., 2000; Schrijver et al., 2002; Lean et al., 2002; Mackay and Lockwood, 2002; Wang and Sheeley Jr, 2003b; Wang et al., 2005). The key fundamental principle was established by Solanki et al. (2000), namely the continuity of total open solar flux:
$$\frac{dF_S}{dt} = S - L \qquad (10)$$
where S is the source term (the emergence rate of open flux) and L is the loss rate. This makes no attempt to describe the distribution of open flux over the Sun and how it changes, although that will undoubtedly influence the loss rate greatly and could influence the source term as well. In order to extend the modelling back to the Maunder minimum, the group sunspot number has generally been used in some form to quantify the source term. An initial concern was that even a model as simple as Equation (10) may have too many free variables to be meaningful. However, it should be noted that the model was "trained", and the coefficients defined, using the LEA99 open solar flux reconstruction that extended up to 1995. Therefore, although the first perihelion pass of Ulysses (between day 280 of 1994 and day 235 of 1995) was not independent data, it is significant that the model reproduced the open solar flux detected during the second perihelion pass (between day 353 of 2000 and day 301 of 2001, i.e., roughly half a solar cycle later) (Lockwood, 2003) and the third perihelion pass (almost a full solar cycle later). The model, therefore, has real predictive capability. Solanki et al. (2002) extended the modelling by adding more classifications of closed solar flux, each governed by its own continuity equation, to define the emergence rate. This has been very useful in allowing centennial reconstructions of total and spectral solar irradiance which are, ultimately, constrained by the open solar flux reconstructions from geomagnetic activity.
Various forms have been used for the loss rate L. Solanki et al. (2000, 2002) and Vieira and Solanki (2010) used a constant fractional loss, i.e., L = F_S/τ, where τ is the loss time constant. In addition, a constant absolute loss rate has been used by Connick et al. (2011) and a fractional loss rate that varies over the solar cycle by Owens et al. (2011a). In particular, working from the assumption that the source term was set by the CME emergence rate,  showed that the loss rate needed was cyclic over the solar cycle and was very well correlated with the tilt of the heliospheric current sheet, as predicted theoretically by Sheeley Jr and Wang (2001) and Owens et al. (2011a), and consistent with the observations that streamer belt disconnection events, as seen in coronagraph images, tend to occur where the current sheet is tilted (Wang et al., 1999b,a; Sheeley Jr and Wang, 2001). Owens and  noted that during the long low minimum between cycles 23 and 24, CME flux emergence continued, and they postulated that this was a base-level emergence rate that would have continued at all times, including during the Maunder minimum. Using this they derived the modelled variation of open solar flux shown by the grey area in Figure 35. Note that the plot shows the unsigned open flux and so is twice the signed open solar flux. The modelled signed open solar flux at the end of the Maunder minimum oscillated around a mean of about 0.7 × 10^14 Wb. This is a bit larger than, but still consistent within uncertainties with, the estimate from the SEA10 reconstruction and Figure 29 of (0.48 ± 0.29) × 10^14 Wb. The geomagnetic reconstruction has also been extended back by cross-correlating decadal means with the corresponding decadal means of the heliospheric modulation potentials derived from cosmogenic isotope abundances. This has here been done for both linear and third-order fits (solid and dashed lines, respectively) and for both 10Be and 14C cosmogenic isotope records (red and blue, respectively).
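The continuity model with a constant fractional loss can be illustrated with a short numerical sketch. It is only a toy version under stated assumptions: the source term is taken to be directly proportional to sunspot number (the text says only that sunspot number is used "in some form"), and the names `integrate_open_flux`, `k_source` and `tau_years`, along with any parameter values, are hypothetical rather than fitted coefficients from the cited models.

```python
import numpy as np

def integrate_open_flux(sunspot, f0, k_source, tau_years, dt_years=1.0):
    """Forward-Euler integration of dF/dt = S - F/tau.

    sunspot   : sequence of sunspot numbers, one per time step
    f0        : initial open flux (arbitrary units)
    k_source  : assumed proportionality between S and sunspot number
    tau_years : loss time constant for the constant fractional loss
    """
    flux = np.empty(len(sunspot) + 1)
    flux[0] = f0
    for i, r in enumerate(sunspot):
        source = k_source * r          # S, assumed linear in sunspot number
        loss = flux[i] / tau_years     # L = F / tau
        flux[i + 1] = flux[i] + dt_years * (source - loss)
    return flux
```

For a constant sunspot number R the solution relaxes to the steady state F = k_source × R × tau_years, and with R = 0 the flux decays on the time constant tau_years, which is why the choice of loss term controls how far the open flux can fall during a grand minimum.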
The model is close to all these empirical extrapolations. Using the polynomial fit shown in Figure 29, the average annual mean open solar flux of 0.7 × 10^14 Wb for the end of the Maunder minimum modelled by  yields an IMF of B = 2.5 nT. This is quite similar to the floor estimate of Cliver and Ling (2011) of 2.8 nT. Hence, the base-level CME emergence rate postulated by  comes close to offering a potential explanation of at least the smaller floor estimates. As shown by Figure 35, this postulate matches cosmogenic isotope data quite well; however, it remains only a postulate. Note, in addition, that the cosmogenic isotopes tell us that the Maunder minimum is not the lowest level of solar activity reached in the last 9300 years, and so it remains possible that this minimum still does not set a genuine floor limit to the IMF.
Interestingly, the modelled open solar flux shows cyclic variations during the Maunder minimum. The long time constants of exchange of carbon with the two great terrestrial reservoirs (the biomass and the oceans) mean that solar cycle variations cannot be seen in 14C data, but the same is not true for 10Be (Beer et al., 1990). One puzzling observation had been that 10Be continued to show decadal-scale oscillations during the Maunder minimum when no evidence for a magnetic cycle can be found in sunspot data (Beer et al., 1998). Even more puzzling was that, whereas the 10Be abundances at other times are, as expected, in antiphase with sunspot numbers, at the start and end of the Maunder minimum the 10Be oscillations are in phase with the (small) sunspot cycles (Usoskin et al., 2001).  show how the modelling presented in Figure
"It is not important to predict the future, but it is important to be prepared for it" -Pericles, Athenian orator, statesman and general (c. 495 - 429 BC)
I have some difficulties with this quote as I don't believe you can prepare for something if you do not know what it is. Better is:
"It is not important to know the future, but to shape it" -Antoine de Saint Exupéry, French writer and aviator (1900 - 1944)
which I can see can be valid in many areas of life such as sport, warfare, economics, and politics. However, it is invalid in solar-terrestrial physics. What undoubtedly does apply is:
"Prediction is very hard, especially when it is about the future" -Niels Bohr, Danish Physicist (1885 - 1962)
But perhaps the wisest of all is a variant on Bohr's quote:
"Never make predictions, especially about the future" -Lawrence ('Yogi') Berra, American Baseball Player, coach and author (1925 -)
Ignoring the obvious wisdom of Berra's advice, this last section uses the knowledge outlined in the previous sections to look into the probable future of solar activity.
A great many papers have looked at predicting sunspot numbers, particularly those expected at the start of a new cycle, and the methods employed have been presented in the Living Review by Petrovay (2010). Because of the difficulty in making such sunspot number predictions for just the cycle ahead, the degree to which there is some predictability in several solar activity indices on longer timescales has not been exploited. Lockwood et al. (2011b) have used the predictability measure based on autocorrelation functions devised by Hong and Billings (1999) to show that, although the predictability of sunspot numbers is indeed relatively low, it is greater over longer lags for the open solar flux and the solar modulation potential. Lockwood et al. (2011b) find that this predictability is great enough over sufficiently long lags to allow forecasting of the onset of a grand solar minimum such as the Maunder minimum.
The SEA10 reconstruction of the near-Earth heliospheric field discussed in Section 10 was derived from the homogenised composite of the heliospheric modulation parameter from 10Be abundance sequences in various ice cores compiled by Steinhilber et al. (2008). Figure 36 shows the full composite in 25-year means, φ_25, which covers 9300 years.
The present value of φ_25 is near 600 MV, which Figure 36 reveals to be high compared to the average value for this interval. The φ_25 = 600 MV level is shown by the horizontal orange line and intervals when φ_25 exceeded this are shaded red. Lockwood (2010) and Barnard et al. (2011) use this as a threshold value to define a "Grand Solar Maximum" (GSM), as this definition means the GSM that has persisted through the space age has recently come to an end. It can be seen that such GSMs are relatively rare: indeed, there are just 24 of these prior to the recent one if we adopt this definition (an average repeat period of about 390 years, but there is a large spread about this mean value as their occurrence is far from regular and these GSMs are notably more common in the second half of this composite). The times when they end (when φ_25 falls back below the 600 MV level) are shown by the vertical lines. The Maunder minimum (MM) is also marked and the lowest φ_25 within it, φ_MM, is marked by the horizontal green line. It can be seen that the Maunder minimum is not the lowest in the sequence and, although they are more evenly spread than the GSMs, the minima are also more common in the second half of the interval. There are 12 grand minima that are at least as deep (in terms of the minimum φ_25 within them) as the Maunder minimum (average repeat period 780 years) and 30 that are at least as deep as the Dalton minimum (average repeat period 310 years). Abreu et al. (2008) noted that at the end of the data composite of 25-year averages (the last data point being in 1994), the Sun was still within the recent GSM, but that this maximum had lasted longer than any other in the past 9300 years and φ_25 was currently declining, consistent with the decline commencing in 1985 noted by Lockwood (2003) and discussed in the context of global climate change by Lockwood and Fröhlich (2007). Abreu et al.
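The average repeat periods quoted above are consistent with simply dividing the 9300-year length of the composite by the number of events in each class. As a quick check (the symbols used for the mean repeat periods are introduced here only for illustration):

```latex
\begin{align*}
\bar{T}_{\mathrm{GSM}} &= \frac{9300\ \mathrm{yr}}{24} \approx 388\ \mathrm{yr} \approx 390\ \mathrm{yr},\\
\bar{T}_{\mathrm{MM}}  &= \frac{9300\ \mathrm{yr}}{12} = 775\ \mathrm{yr} \approx 780\ \mathrm{yr},\\
\bar{T}_{\mathrm{DM}}  &= \frac{9300\ \mathrm{yr}}{30} = 310\ \mathrm{yr}.
\end{align*}
```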
(2008) concluded that the GSM was likely to come to an end in the near future, a conclusion supported by the reconstructed and directly-observed open solar flux and IMF (Lockwood et al., 2009d). The discussion below shows that recent neutron monitor data reveal that the recent GSM actually did end in 2001 in the 25-year means for the adopted definition.
As yet we have no predictive models of the solar dynamo that can model this time series (see the Living Review by Charbonneau, 2010) and so the only ways to use these data to predict the future remain almost exclusively empirical. Two methods have been deployed.  have recently used two types of spectral techniques whereas Lockwood (2010) and Barnard et al. (2011) have made "analogue forecasts". The results of these three methods are remarkably similar and lead to the same general conclusions. This section exemplifies those conclusions using analogue forecasts, which are based on studying how φ_25 has behaved in the past following a situation analogous to the present day. The situation is defined to be analogous if φ_25 falls to 600 MV having previously exceeded this value: further restrictions could be applied, but because the recent φ_25 values have been so high, even this only gives 24 previous analogues in 9300 years. The times t = t_o when φ_25 falls back down to 600 MV (the vertical lines in Figure 36) have been "composited" (also called a "superposed-epoch" or "Chree" analysis) such that they are all at time zero in Figure 37, and the black lines show the variations around this time for the 24 previous GSMs. Also shown in Figure 37 are 22-year running means, φ_22, of the annual heliospheric modulation potential reconstruction for the past century, compiled by Usoskin et al. (2002) from modern neutron monitor data, ionisation chamber records and cosmogenic isotope records (red line), and of the modulation potential from the Oulu neutron monitor counts, from a third-order polynomial fit to the red line for the interval when they overlap (blue line). The horizontal dashed line is the value of φ_MM shown in Figure 36. The blue line confirms that φ_22 has recently dropped below the 600 MV level and so, by this definition, the recent GSM has come to an end.
Figure 37 also implies that the recent decrease in solar activity (meaning the long, low minimum between cycles 23 and 24 and the weak cycle 24 thus far) has given a more rapid decline on exiting a GSM than any seen in the past 9300 years. Lockwood (2010) noted that in two cases of the 24, φ_25 fell below φ_MM within 40 years (t − t_o < 40 yr) and so concluded there was a 2/24 ≈ 8% chance of a Maunder-like minimum in the next 40 years. We can make estimates of the probability of any level being exceeded (P[≥φ_25] = 1 − P[<φ_25]) into the future by counting the fraction of the composited φ_25 values that exceed a certain value at each epoch time (t − t_o). By interpolation this can be used to predict the evolution of φ_25 at a given level of probability. Lockwood et al. (2011a) and Barnard et al. (2011) used empirical regressions and theoretical relations between φ_25 and 25-year means of other parameters (either observed or reconstructed) and then, for each, evaluated the fractional deviation of annual mean values from the 25-year means as a function of the solar cycle phase. Hence, for a given φ_25 and solar cycle phase, an annual value could be computed. Using the φ_25 at a certain probability and assuming all future cycles are 11 years in duration (to prescribe the phase), annual values at a given probability level could be predicted into the future. The same procedure was applied to the predicted 25-year means of other parameters. The results for the various probability levels are shown by the coloured lines in Figure 38, which also gives the past variations which are either observed (in black) or reconstructed (in mauve).
Figure 38: Observed past and predicted future variations of several parameters, including the interplanetary magnetic field strength at Earth, cosmic ray counts by the Oulu neutron monitor, and the geomagnetic index. The black lines are monthly averages of observations. The mauve line in the second panel is the LEA09 reconstruction of annual means from geomagnetic data by Lockwood et al. (2009d), and that in the third panel is from the reconstruction by Usoskin et al. (2002). The red-to-blue lines show predicted variations of annual means at various probabilities, made from the 9300-year cosmogenic isotope composite of Steinhilber et al. (2008) using the procedure developed by Lockwood et al. (2011a) and Barnard et al. (2011). In the top panel the blue to red lines show the values for which there is a probability of [0.05 : 0.1 : 0.95] that the parameter will be lower than the value shown. Corresponding predictions are given in the other panels. (Note that in the third panel the probability of exceeding the value is shown, as neutron monitor counts rise as solar activity falls.) Image adapted from Lockwood et al. (2012).
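The compositing and probability-counting steps of an analogue forecast can be sketched in a few lines of Python. This is a schematic illustration only: the function names are hypothetical, the series is assumed to be evenly sampled, and none of the regressions to other parameters or the solar-cycle phase corrections described above are included.

```python
import numpy as np

def downward_crossings(series, threshold):
    """Indices of the first sample at or below the threshold after a
    sample above it (the epoch-zero times of each analogue)."""
    above = series > threshold
    return np.flatnonzero(above[:-1] & ~above[1:]) + 1

def composite(series, events, n_after):
    """Superposed-epoch stack: one row per analogue, holding the
    n_after samples that start at each event index."""
    rows = [series[i:i + n_after] for i in events
            if i + n_after <= series.size]
    return np.array(rows)

def prob_below(stack, level):
    """Fraction of analogues lying below `level` at each epoch time."""
    return (stack < level).mean(axis=0)

def analogues_reaching(stack, level):
    """Number of analogues that fall below `level` within the window."""
    return int((stack.min(axis=1) < level).sum())
```

Dividing the output of `analogues_reaching` by the number of rows in the stack gives the kind of estimate quoted above (2 of 24 analogues, or about 8%), while `prob_below` evaluated over a range of levels gives the probability as a function of epoch time.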
The plot shows that there is only a 5% probability that solar activity cycles will remain as large as or exceed recent cycles but that, at the other extreme, there is a 5% probability that they will fall to Maunder minimum levels within just under 40 years. The most likely scenario is between the yellow and green lines, which places the next grand minimum some time after 2060. An interesting question becomes how cycle 24 is evolving in relation to these predictions. To answer that question, one first has to establish where in cycle 24 we currently are. This is here done using the method described by Owens et al. (2011b), who noted from the Greenwich/USAF sunspot data (Hathaway, 2010) that the variation of sunspot latitude with solar cycle phase was very similar, independent of the amplitude of the cycle. For most past cycles the mean latitude of the spots has been roughly equal in the northern and southern solar hemispheres, but a complication is that cycle 24 is proving exceptional in that the southern hemisphere is lagging considerably behind the northern. As a result, the conclusion from the northern hemisphere mean sunspot latitude is that cycle 24 has passed its peak, but from the southern hemisphere it is that the peak is imminent (as of 20 April 2013). Figure 39 takes an independent look at the evolution of cycle 24 by analysing the solar polar magnetic fields. The timing of the polar field reversal, relative to sunspot maximum, was first observed during solar cycle 19 (SC19) by Babcock (1959) using data from the Hale Solar Laboratory (HSL) magnetograph. He noted that the average field emerging from the south solar pole reversed polarity between March and July 1957 and that the field at the north pole reversed in November 1958. The 12-month running mean of monthly sunspot number peaked in March 1958, midway between these two reversals.
Figure 39 employs the continuous data on the solar polar field available from the Wilcox Solar Observatory (WSO). As noted by Babcock during SC19, the two poles do not reverse at exactly the same date, and the raw data are also complicated by a strong annual periodicity introduced by the annual variation in Earth's heliographic latitude. Because of these two effects, the average polar field reversals are most readily seen by taking the difference between the north and south fields, (B_N − B_S). In order to give the variations of this difference the same appearance in each cycle, thereby allowing easy comparisons, the upper panel of Figure 39 shows (B_N − B_S) multiplied by a polarity factor that is +1 for odd-numbered cycles and −1 for even ones. The variation of this signed difference with solar cycle phase (determined using the average of the absolute values of the northern and southern mean sunspot latitudes) is plotted in the top panel for the WSO measurements, which are made every 10 days. The area shaded grey is between the earliest (lowest phase) reversal, which was seen during cycle 23 (green line), and the latest possible reversal date, which was the brief return to (B_N − B_S) = 0 during cycle 22 (blue line). (However, notice that the best estimate of the reversal for cycle 22 was at considerably lower phase.) The lower panel shows −B_fN and B_fS, where B_fN and B_fS are the northern and southern polar field variations after they have been passed through a 20 nHz low-pass filter to smooth them and remove the annual variation. The vertical lines give the phases of the corresponding cycle peaks in 12-point running means of monthly sunspot numbers. Red, blue, and green are used to denote cycles SC21, SC22, and SC23, and black is for SC24. The figure shows that the polar fields during SC24 thus far have been weaker than they were in the corresponding phase of the previous three cycles.
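The two processing steps described above (sign-flipping the polar field difference by cycle parity and low-pass filtering at 20 nHz) can be sketched as follows. The filter type is not stated in the text, so the brick-wall FFT filter used here is an assumption, as are the function names. With 10-day sampling the Nyquist frequency is about 579 nHz, and the annual variation lies near 32 nHz, so a 20 nHz cutoff removes it.

```python
import numpy as np

TEN_DAYS_S = 10 * 86400.0  # sampling interval of the 10-day measurements

def polarity_sign(cycle_number):
    """+1 for odd-numbered cycles, -1 for even-numbered cycles."""
    return 1 if cycle_number % 2 else -1

def signed_polar_difference(b_north, b_south, cycle_number):
    """(B_N - B_S) multiplied by the cycle-parity sign."""
    return polarity_sign(cycle_number) * (np.asarray(b_north) - np.asarray(b_south))

def lowpass_fft(values, dt_seconds=TEN_DAYS_S, cutoff_hz=20e-9):
    """Zero all Fourier components above cutoff_hz (brick-wall filter,
    an assumed stand-in for the unspecified 20 nHz low-pass filter)."""
    values = np.asarray(values, dtype=float)
    spectrum = np.fft.rfft(values)
    freqs = np.fft.rfftfreq(values.size, d=dt_seconds)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=values.size)
```

A brick-wall filter is the simplest choice for illustration but can ring near sharp features and at the ends of the record; a tapered filter would behave more gently on real polar field data.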
It is noticeable that for the odd-numbered cycles the reversal of the poles is within roughly a month of the smoothed sunspot number peak. However, for the even-numbered cycles the polarity reversal took place considerably after the cycle peak.
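The comparison above relies on locating the peak of the smoothed sunspot number. A minimal sketch of a 12-point running mean and its peak is given below; the function names and the synthetic monthly values are hypothetical, and because a 12-point window has an even length, its centre falls between two months (an assumption about the centring, which the text does not specify).

```python
def running_mean(values, window=12):
    """Boxcar running mean; returns (start_index, mean) for each full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [(0, total / window)]
    for start in range(1, len(values) - window + 1):
        # Slide the window by one sample: add the new value, drop the old one.
        total += values[start + window - 1] - values[start - 1]
        means.append((start, total / window))
    return means


def peak_of_running_mean(values, window=12):
    """Return (centre_position, peak_mean); the centre is in units of samples."""
    start, peak = max(running_mean(values, window), key=lambda item: item[1])
    # A window covering samples start..start+window-1 is centred here.
    return start + (window - 1) / 2.0, peak


# Hypothetical monthly values: a triangular "cycle" peaking at month 30,
# plus a single large spike at month 10 that dominates the raw maximum.
monthly = [100.0 - 3.0 * abs(m - 30) for m in range(61)]
monthly[10] += 70.0

raw_peak_month = monthly.index(max(monthly))
centre, peak = peak_of_running_mean(monthly)
```

Here the raw maximum is set by the isolated spike, whereas the peak of the running mean falls close to the true cycle maximum, which is why smoothed values are used when timing the cycle peak against the polar field reversal.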
Note that the predictions made by Lockwood (2010) and Barnard et al. (2011) are probabilistic rather than deterministic (or categorical) in nature. Many areas of geophysics, including weather and flood forecasting (Krzysztofowicz, 1998; Bartholmes et al., 2009), have concluded that probabilistic forecasts are more powerful in decision-making, that they are usually scientifically more honest, and that for applications they enable risk-based assessments. Tests of forecast skill have been developed, but probabilistic forecasts are not yet in widespread use in solar, heliospheric, and solar-terrestrial science, which has remained more deterministic in approach. Their increasing use would be a natural part of the development of solar-terrestrial physics into applications-based "space weather".

Figure 40, like Figure 39, will be updated as the cycle progresses. Panels (a), (c), and (d) show monthly means (in grey) and 12-month running means (in black) of the international sunspot number, the Oulu neutron monitor cosmic ray counts, and the observed IMF, respectively. These are compared to the predictions shown in Figure 38, presented using the same colour scheme to give the probability of the parameter being lower than the value shown. Panel (b) shows the evolution of the mean sunspot latitude in the northern (in blue) and southern (in red) hemispheres. The current date is shown by the vertical black dashed line (after which the dashed lines show linear extrapolations based on the prior data for cycle 24). The circles on each line mark the latitudes at which peak sunspot area in that hemisphere was observed during previous cycles.

Figure 39: The top panel shows the signed difference between the two polar fields, (N − S) multiplied by a sign factor, as a function of solar cycle phase, as determined from mean sunspot latitudes using the method described by Owens et al. (2011b), where N and S are the average fields seen over the north and south solar poles, respectively, and the sign factor is +1 for odd-numbered cycles and −1 for even ones. The reversals all occur within the grey band and the phases of the peak sunspot number in 12-month running means are given by the vertical lines. The lower panel shows −fN (solid lines) and fS (dashed lines) as a function of phase, where fN and fS are the N and S data that have been passed through a 20 nHz low-pass filter. In both panels, red, blue, green, and black denote solar cycles SC21, SC22, SC23, and SC24, respectively. This plot will be updated regularly as the cycle progresses: this version was generated on 20 April 2013.

Living Reviews in Solar Physics
Figure 40: The evolution of solar cycle 24 thus far. Observed monthly means (in grey) and 12-point running means of monthly data (in black) of (a) sunspot number; (c) Oulu neutron monitor counts; and (d) the observed near-Earth IMF strength. In each of these plots the coloured lines are the predicted levels at various probabilities between 95% (in red) and 5% (in mauve), as derived by Barnard et al. (2011). Each line is labelled with a percentage: in the case of the cosmic ray counts the value shown has that percentage chance of being exceeded, and in the case of the sunspot number and the IMF it has a chance of 100% minus that percentage of being exceeded. Panel (b) shows the monthly mean latitudes of sunspot groups. The red line is for the southern solar hemisphere and the blue for the northern; solid lines are observed values, whereas dashed lines are a linear extrapolation of the cycle 24 behaviour into the future. The circles show the mean latitude for that hemisphere at which peak sunspot number was seen in cycles 12–23: the number of open circles to the right of the dashed line is the number of those cycles that had not yet reached their peak sunspot number in that hemisphere at the latest mean latitude of spots shown for that hemisphere. This plot will be updated regularly as the cycle progresses: this version was generated on 20 April 2013.
12.1 Solar cycle 24 update 1: 20 April 2013

At present, all the data shown in Figure 40 (sunspot number, Oulu neutron monitor cosmic ray flux, and near-Earth IMF) are following the predicted blue lines. These are for low (5–10%) probabilities of the sunspot number and the IMF being lower, and of the cosmic ray counts being higher, than the values shown. Thus, the decline in solar activity is very much at the more rapid end of the range predicted from the analogue forecasts, which is consistent with an earlier onset of the next grand minimum. However, it must be stressed that there is no dynamo science in these analogue forecasts and, without a full understanding of the physics of the long-term changes described in this review, we can have no confidence that an observed trend will continue.
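The probability levels quoted here can be read as empirical quantiles of a set of analogue outcomes. The sketch below is a hedged illustration only: it does not reproduce the procedure of Barnard et al. (2011), and the ensemble values, the observed value and the function names are all hypothetical.

```python
def fraction_below(ensemble, value):
    """Empirical probability that an ensemble member lies below `value`."""
    return sum(1 for member in ensemble if member < value) / len(ensemble)


def level_for_percent(ensemble, percent):
    """Nearest-rank quantile: smallest member with at least `percent`% of
    the ensemble at or below it (integer arithmetic avoids rounding issues)."""
    ordered = sorted(ensemble)
    rank = max(1, -(-percent * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]


# Hypothetical ensemble: a parameter (e.g. sunspot number) at one future
# date, taken from 20 analogue intervals. These are NOT real forecast values.
ensemble = [10 * i for i in range(1, 21)]

low_level = level_for_percent(ensemble, 5)     # level with 5% chance of being lower
high_level = level_for_percent(ensemble, 95)   # level with 95% chance of being lower

# A hypothetical observation sitting at the low-probability end of the range.
observed = 25
p_lower = fraction_below(ensemble, observed)
```

For a parameter such as the cosmic ray count, where the text quotes the probability of being higher rather than lower, the complementary fraction would be used instead.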
All the indicators are that cycle 24 is close to, or has passed, its peak. The northern polar field has flipped and the northern hemisphere spots have migrated equatorward to a latitude below that for sunspot maximum in all but 3 of the 12 cycles for which we have data on spot latitudes. In the southern hemisphere the polar field flip appears imminent but has yet to occur, and the spots have migrated equatorward to a latitude below that for sunspot maximum in 6 of the 12 cycles. Thus, from average sunspot latitudes, the northern solar hemisphere indicates a probability of 75% that the sunspot maximum of cycle 24 has already occurred at this date and the southern hemisphere gives a probability of 50%.
12.2 Solar Cycle 24 Update 2: 1 August 2013

Figure 41 is an updated version of Figure 39 and the upper panel shows a significant development in that the polarity of the difference between the northern and southern solar polar fields has reversed. If this is the final reversal (it did flip briefly earlier in the cycle), it means that it is at a slightly greater phase of the cycle than we have seen before, but is only slightly later than during cycle 22. The filtered data in the bottom panel (that are effectively extrapolations for the most recent data) indicate that although the northern polar field has flipped, the southern has still yet to do so. The corresponding update to Figure 40 is shown in Figure 42. Part (a) shows that in the period since the previous update, the monthly sunspot number has shown an increase but the 12-month running means are still considerably below the 95-percentile prediction (blue line). Similarly, part (d) shows that the IMF also remains well below this line in 12-month running means. Part (c) shows that there have been some Forbush decreases that have lowered the monthly mean cosmic ray count (mirroring the rise in sunspot number) but the 12-month running mean remains between the 90 and 95 percentiles. The average sunspot latitudes have decreased such that for the northern hemisphere only 1 of the 12 previous cycles has a lower average latitude at cycle maximum and the corresponding number for the southern hemisphere has fallen to 5. Thus, the average sunspot latitudes in the northern and southern hemispheres give probabilities of 92% and 58%, respectively, that solar maximum has already been reached by this date.
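The probabilities quoted in the two updates follow from simple counting: of the 12 past cycles with latitude data, the fraction whose sunspot maximum occurred at a higher mean latitude than the present one is taken as the probability that the maximum has already passed. A minimal sketch of that arithmetic is given below; the function name is hypothetical and the counts are those quoted in the text.

```python
def probability_peak_passed(n_cycles, n_not_yet_peaked):
    """Fraction of past cycles that had already peaked at the current latitude."""
    if not 0 <= n_not_yet_peaked <= n_cycles:
        raise ValueError("n_not_yet_peaked must lie between 0 and n_cycles")
    return (n_cycles - n_not_yet_peaked) / n_cycles


# Number of the 12 past cycles whose maximum occurred at a LOWER mean
# latitude than the current one (i.e. they had not yet peaked), per update.
not_yet_peaked = {
    "20 April 2013": {"north": 3, "south": 6},
    "1 August 2013": {"north": 1, "south": 5},
}

percentages = {
    date: {
        hemisphere: round(100 * probability_peak_passed(12, count))
        for hemisphere, count in counts.items()
    }
    for date, counts in not_yet_peaked.items()
}
```

This treats each past cycle as equally likely to be an analogue of cycle 24, which is the same assumption implicit in the percentages quoted above.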
The purpose of this review is to detail the development of reconstructions of solar and heliospheric magnetic fields from geomagnetic activity data and to make some initial comparisons with the results from cosmogenic isotopes and models based on sunspot number observations. One important aspect of this work on centennial-scale solar variability is that it fills the timescale gap between decadal-scale variations (almost 5 full cycles are now covered by in-situ space measurements) and millennia (covered by cosmogenic isotope data) and so allows modern space-age understanding to be applied to the cosmogenic isotope data in a more precise, insightful and quantitative manner.
A fundamental insight that has accrued is that although the sunspot number returns to an (almost) constant base-level state at every solar minimum, this disguises the fact that the Sun does not return to the same state at each sunspot minimum: there are long-term variations in coronal and heliospheric fields which mean that one solar minimum is not the same as the next. The cycle-to-cycle variation of the Sun at solar minimum is the basis of the "precursor" method for predicting the peak of the sunspot cycle (Schatten et al., 1978), and predictions for cycle 24 made using the polar fields are proving exceptionally accurate. This long-term change influences many space-weather phenomena, such as the galactic cosmic ray flux reaching Earth, ("gradual") solar energetic particles generated ahead of solar coronal mass ejections, and solar-wind interactions that drive solar-terrestrial activity, giving phenomena such as Geomagnetically Induced Currents in power grids. Hence, we need to allow for "space climate change" as well as space weather.
Another important realisation has been that information has been lost by grouping the variations of various observation series and indices into a general, catch-all "solar activity" classification. Specifically, although the variations of the various geomagnetic activity indices have a great many similarities (for example, all reflect the decadal-scale sunspot cycle), they are measures of different parts of the current systems in near-Earth space and they have different dependencies on solar wind parameters. These differences can be exploited to derive new information. The example discussed most here is how combinations of range and interdiurnal variation geomagnetic indices allow us to separate the effects of solar wind speed and the interplanetary magnetic field and so reconstruct both back in time.
The review has covered many pitfalls of the reconstructions, from the need to know the full provenance of historic data to the limitations of statistical methods. Nevertheless, considerable consensus now exists between the various reconstructions for the interval between about 1880 and the present day. Thus, using geomagnetic activity allows us to extend the 50-year sequence of in-situ spacecraft data a further 80 years back in time with great confidence. Before 1880 there are increasing differences. There were fewer stations, and they were frequently in increasingly noisy environments and eventually had to be moved as cities expanded around them. Equipment was less accurate and more prone to calibration drifts. Thus, although there is, in theory, information available for a further 75 years back into the past, there is necessarily greater uncertainty. This is not to say that the full 185-year record cannot be recovered, but the earliest data will always have greater associated uncertainties.
The most important realisation that these reconstructions have allowed, when combined with cosmogenic isotope data, is that the modern space age has been an unusually active period for the Sun, compared to most of the last millennium. Just how unusual is a matter of on-going debate, but even this is now converging to a consensus view that it has been, until recent years at least, very unusual indeed. That being the case, we should not be surprised that average solar activity levels are now declining and that cycle 24 is weak compared to the others in the space age. The cosmogenic isotope data tell us that this decline is more likely to continue than not. This has great implications for solar physics, solar-terrestrial science and space weather. Some effects of a continuation of the current decline are well known: for example, the cosmic ray flux incident on Earth will rise. However, others are not. It is possible that although large space weather "events" may become fewer in number, the largest could become more severe in their terrestrial effects because the CME is ejected into a lower-field heliosphere, making the Alfvén Mach number of the event greater and potentially reducing Sun-to-Earth transit times. The long, low minimum between cycles 23 and 24 was only "exceptional" in the context of the space age and may give pointers to other changes that we should now expect. The author's personal view is that this offers great scientific possibilities and that modern observation techniques applied to a quieter Sun will teach us much more than a continuation of the high activity levels seen during cycles 21, 22, and 23. All the evidence is that cycle 24 has just passed its peak, and that peak is a weak one. The development of solar activity into the next minimum will be very interesting to monitor.
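The remark about the Alfvén Mach number can be illustrated with the standard definitions, in which the Alfvén speed is the field strength divided by the square root of the product of the vacuum permeability and the mass density, and the Mach number is the flow speed divided by the Alfvén speed. The sketch below is illustrative only: the speed, density and field values are hypothetical round numbers, and holding speed and density fixed while the field changes is an assumption made purely to isolate the effect of the field.

```python
import math

MU_0 = 4.0e-7 * math.pi      # vacuum permeability (H/m)
PROTON_MASS = 1.6726e-27     # kg


def alfven_speed(b_tesla, n_per_m3):
    """Alfvén speed B / sqrt(mu_0 * rho) for a pure proton plasma (m/s)."""
    return b_tesla / math.sqrt(MU_0 * n_per_m3 * PROTON_MASS)


def alfven_mach(speed_m_s, b_tesla, n_per_m3):
    """Alfvén Mach number: flow speed divided by the Alfvén speed."""
    return speed_m_s / alfven_speed(b_tesla, n_per_m3)


# Hypothetical illustrative values, not taken from the review.
speed = 1.0e6        # 1000 km/s, expressed in m/s
density = 5.0e6      # 5 protons per cm^3, expressed per m^3

mach_stronger_field = alfven_mach(speed, 7.0e-9, density)   # 7 nT
mach_weaker_field = alfven_mach(speed, 4.0e-9, density)     # 4 nT
```

With everything else held fixed, the Mach number scales inversely with the field strength, so the weaker field gives the larger value. In reality the ambient density and the relevant relative speed would also differ, so this indicates the direction of the effect only.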