1 Introduction

The presence of a magnetic field in the solar atmosphere is the most prominent manifestation of solar variability. Observations of such magnetic variability on the Sun can provide strong observational constraints on the solar dynamo theory, helping to understand the physical mechanisms underlying magnetic flux emergence and evolution. This is particularly interesting on long-term scales, as the solar cycle can offer insights into the complex dynamics of the global dynamo (e.g., Usoskin et al. 2007).

The goal of Space Climate is to describe long-term variations in solar activity and their impact on the heliosphere and the Earth’s environment. The time scale usually used to distinguish Space Climate from Space Weather is a few solar rotations (Mursula et al. 2007). Here, we focus on the multi-year time scale relationship between solar activity and near-Earth solar wind properties. To achieve this goal, we utilize the full extent of space-age solar wind observations via the OMNI database (King and Papitashvili 2005). Furthermore, we make use of a century-long dataset of the Ca II K index, a widely employed physical solar activity indicator. This emission index has been demonstrated to be a reliable proxy for the magnetic flux density at the Sun (Schrijver et al. 1989; Ortiz and Rast 2005; Chatzistergos et al. 2019b) and has also been shown to trace long-term variations in solar activity (e.g., Judge 2006; Bertello et al. 2016; Chatzistergos et al. 2019a).

The first evidence of a strong link between solar wind properties and the solar cycle dates back to the earliest satellite observations, which revealed a periodicity close to 11 years (Siscoe et al. 1978; King 1979; Neugebauer 1981). In later studies, long-term solar wind properties were mostly compared with sunspot numbers, revealing an imperfect match between the periodicities of the solar cycle and those of solar wind speed, density, pressure, and magnetic field, and further highlighting a delayed response of solar wind signals compared to sunspot numbers (Petrinec et al. 1991; Köhnlein 1996; El-Borie 2002; Katsavrias et al. 2012; Richardson and Cane 2012; Li et al. 2016, 2017; Venzmer and Bothmer 2018; Samsonov et al. 2019). A recent observational analysis of near-Earth solar wind measurements in relation to the Ca II K index was the first to study these properties over the last five solar cycles (Reda et al. 2023b). That study found a 3.2-year lag of solar wind speed with respect to the Ca II K index using both cross-correlation and mutual information analysis, and a 3.6-year lag between the magnetic proxy and solar wind dynamic pressure. The analysis of the time lag behaviour between the Ca II K index and the same solar wind parameters was further extended in Reda et al. (2023a), who studied how their pairwise relative lags vary over the last five solar cycles, finding values ranging from 6 years to 1 year.

In Reda et al. (2023a) and Reda et al. (2023b), the Space Climate scales were studied by applying a 37-month moving average to the Ca II K index and solar wind parameters, following the approach used by Köhnlein (1996). However, using a moving average as a low-pass filter can remove some relevant features at the Space Climate scales, such as the double maxima present in some solar cycles. In this study, we propose to overcome this issue using the Hilbert–Huang Transform (Huang et al. 1998), and in particular the Empirical Mode Decomposition, to filter out the intrinsic modes with mean periods below 3 years. We then carry out an analysis similar to that of the previous studies, examining the delays between the signals with a cross-correlation analysis.

Furthermore, we aim to investigate the causal link between the proxy of solar activity and the characteristics of the solar wind. This causal connection can guide us in exploring the underlying physical mechanisms responsible for this relationship, opening significant prospects for understanding the mechanisms of the solar dynamo at Space Climate scales. For this reason, we intend to use the Transfer Entropy (Schreiber 2000a), a novel approach from information theory recently applied to the analogous complex dynamics of the Earth’s magnetosphere–ionosphere system (Stumpo et al. 2020; Balasis et al. 2023; Stumpo et al. 2023). Transfer entropy can track down the information flow between variables in different directions, thus showing the causality relation between the variables. Although a causal relationship between the driver of solar variability and solar wind properties is expected, such a measure is not assured and can shed light on the parameters of the solar wind that are more influenced. Additionally, this type of analysis provides an independent measure of the solar wind’s response times to solar magnetic variability, which is not constrained by a linear analysis. In particular, in this study, we aim to reanalyse the unfiltered data, sampled monthly, to study the solar wind properties on climatological time scales.

These investigations are crucial in an era of significant expansion of human activity in space. The variations of the space radiation environment, for example, have been forecasted for the next 80 years in the perspective of long-term changes in the Space Climate (Barnard et al. 2011). We are at the beginning of a new era of human exploration of deep space, which also aims to colonize and inhabit environments unprotected by the Earth’s magnetosphere. Solar wind constitutes the main low-energy and high-flux component of charged particles in the space environment and has a significant impact on the possibility of permanently inhabiting remote locations in the solar system, such as the lunar surface in the near future and Mars in the medium term. For this reason, it is essential to understand the physical mechanisms governing the variability of the solar wind on multi-year scales.

The present article is structured as follows: Sect. 2 describes the data used; Sect. 3 explains the techniques adopted for the analysis; Sect. 4 presents the results of the analysis; finally, in Sect. 5, we summarize and discuss the results obtained.

2 Data

The data we use in the present study, to investigate the relationship between the solar magnetic activity and the near-Earth solar wind, are measurements of the Ca II K index and of the solar wind speed and dynamic pressure. In particular, we use here the monthly averages of the parameters listed above over the period 1965–2021.

The Ca II K index is a physical proxy of solar magnetic activity that accounts for the emission in the K line of Ca II at 393.4 nm. This line originates in the middle solar atmosphere (i.e., the chromosphere) and is related to the mean chromospheric emission of the Sun. The Ca II K index is one of the more commonly used activity indices and has been shown to be a good proxy for the line-of-sight (LoS) unsigned magnetic flux density throughout all phases of the cycle, not only when sunspots are present (see, e.g., Schrijver et al. 1989; Ortiz and Rast 2005; Chatzistergos et al. 2019b). Specifically, in this work, we make use of the Ca II K index composite presented and described in Bertello et al. (2016), which is freely accessible from the National Solar Observatory (NSO) website at https://solis.nso.edu/0/iss/. The Ca II K 0.1 nm emission index contains inter-calibrated measurements from three different observatories (Kodaikanal Solar Observatory, Sacramento Peak and ISS-SOLIS at NSO) and, overall, covers the period between February 1907 and October 2017. After that date, the SOLIS facility went offline and no further data from this instrument are available. However, this dataset was extended to April 2021 in Reda et al. (2023b) by making use of the Mg II index (University of Bremen composite), which has been shown to correlate strongly with the Ca II K index (see, e.g., Donnelly et al. 1994; Reda et al. 2021, 2023b).

The near-Earth solar wind data are taken from the OMNI database, which can be accessed at https://omniweb.gsfc.nasa.gov/hw.html. The OMNI dataset is a collection of various near-Earth solar wind parameters, both magnetic and plasma ones, provided at different time resolutions (King and Papitashvili 2005). It is compiled using validated data from several spacecraft, such as IMP, ISEE, ACE, Wind, and Geotail. Among the set of parameters provided by the OMNI database, the analysis we perform in this study concerns two dynamic parameters of the solar wind: speed (\(\mathrm {V_{sw}}\)) and dynamic pressure (\(\mathrm {P_{d,sw}}\)). The latter has been computed from the speed (\(\mathrm {V_{sw}}\)) and the ion density (\(\mathrm {n_{i,sw}}\)), as \(\mathrm {P_{d,sw} = 1/2\,m_{p} n_{i,sw} V_{sw}^2}\), where the proton mass \(\mathrm {m_{p}}\) is assumed as the mean ion mass. Data for these parameters are available starting from July 1965, which sets the main limit on the temporal extent of the analysis we carry out here.
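As a minimal sketch of the dynamic pressure formula above, the snippet below evaluates \(\mathrm {1/2\,m_{p} n_{i,sw} V_{sw}^2}\) and converts the result to nPa. The function name and the input units (density in cm\(^{-3}\), speed in km/s) are assumptions made for illustration; they are not stated in the text.

```python
import numpy as np

M_P = 1.67262192e-27  # proton mass in kg, used as the mean ion mass


def dynamic_pressure_npa(n_cm3, v_kms):
    """Return 1/2 * m_p * n * V^2 in nPa.

    Assumes density in cm^-3 and speed in km/s (illustrative units).
    """
    n_m3 = np.asarray(n_cm3, dtype=float) * 1e6  # cm^-3 -> m^-3
    v_ms = np.asarray(v_kms, dtype=float) * 1e3  # km/s -> m/s
    return 0.5 * M_P * n_m3 * v_ms**2 * 1e9      # Pa -> nPa


# Illustrative values: 5 ions per cm^3 moving at 400 km/s
p_example = dynamic_pressure_npa(5.0, 400.0)
```

The function accepts scalars or arrays, so it can be applied directly to whole monthly series.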

Missing monthly values of the Ca II K index, solar wind speed, and dynamic pressure were filled by simple linear interpolation between the preceding and following monthly values. This procedure allows us to investigate, without gaps, the time interval from July 1965 to April 2021, which almost fully covers solar cycles 20 to 24 together with the beginning of solar cycle 25 (Fig. 1).
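The gap-filling step can be sketched as follows, assuming that missing months are flagged as NaN in an evenly sampled monthly array (the flagging convention and the function name are hypothetical).

```python
import numpy as np


def fill_gaps_linear(values):
    """Fill NaN entries by linear interpolation between neighbouring samples."""
    y = np.asarray(values, dtype=float)
    idx = np.arange(y.size)
    good = ~np.isnan(y)
    filled = y.copy()
    # np.interp interpolates linearly between the valid samples
    filled[~good] = np.interp(idx[~good], idx[good], y[good])
    return filled


series = [1.0, np.nan, 3.0, np.nan, np.nan, 6.0]
filled = fill_gaps_linear(series)
```

Note that `np.interp` holds the first or last valid value constant for gaps at the very start or end of the record, so such edge gaps would need separate treatment.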

Fig. 1 Monthly averages of the time series used for this work: Ca II K index (green, top), solar wind speed (red, middle), and solar wind dynamic pressure (blue, bottom) (color figure online)

3 Methods

3.1 The Hilbert–Huang transform: empirical mode decomposition and Hilbert spectral analysis

The Empirical Mode Decomposition (EMD), i.e., the first step of the Hilbert–Huang Transform (HHT), was first introduced by Huang et al. (1998) as an adaptive and a posteriori decomposition method whose decomposition basis is derived via an iterative procedure, known as the sifting process, based on the local properties of the signal.

Let y(t) be a time-dependent signal. The EMD allows us to write

$$\begin{aligned} y(t) = \sum _{k=1}^N c_k(t) + r(t), \end{aligned}$$
(1)

where the set \(\{c_k(t)\}\), whose elements are named Intrinsic Mode Functions (IMFs) or empirical modes, forms the decomposition basis, while r(t) is the residue of the decomposition. The latter is a non-oscillating function of time, while an IMF is defined as a function whose numbers of extrema and zero crossings are equal (or differ at most by one) and whose mean envelope, derived from the envelopes of local maxima and minima obtained by cubic spline interpolation, has zero average (e.g., Huang et al. 1998; Huang and Wu 2008). Table 1 summarizes the main steps of the sifting process.

Table 1 The main steps of the sifting process

Huang et al. (1998) proposed the following constraint as the exit condition to stop the sifting process

$$\begin{aligned} \sigma _{n} = \sum _{j} \frac{\left[ \delta _{n}(t_j) - \delta _{n+1}(t_j)\right] ^2}{\delta _{n}(t_j)^2} < \epsilon , \end{aligned}$$
(2)

with \(\epsilon\) fixed in the range \([0.2,0.3]\). This criterion was refined by Rilling et al. (2003) through the so-called threshold method, which is based on two thresholds, \(\theta _1\) and \(\theta _2\), to guarantee globally small fluctuations (as in Huang et al. 1998) while avoiding locally large excursions (Flandrin et al. 2004).
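The sifting process and the exit condition of Eq. (2) can be illustrated with the minimal sketch below. It is not the implementation used in this work: it assumes that \(\delta _n\) denotes the candidate mode after the n-th sifting iteration, pads the envelopes with the signal endpoints as a simple boundary treatment, adds a small constant to the denominator of Eq. (2) to avoid division by zero, and uses the basic criterion of Huang et al. (1998) rather than the threshold method. All function names and default values are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def envelope_knots(h):
    """Indices of interior local maxima and minima, each padded with both endpoints."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    ends = [0, h.size - 1]
    n_extrema = maxima.size + minima.size
    return np.union1d(maxima, ends), np.union1d(minima, ends), n_extrema


def sift_once(t, h, min_extrema=6):
    """One sifting step: subtract the mean of the upper and lower spline envelopes."""
    upper_idx, lower_idx, n_extrema = envelope_knots(h)
    if n_extrema < min_extrema:
        return None  # too few oscillations: treat h as the residue
    upper = CubicSpline(t[upper_idx], h[upper_idx])(t)
    lower = CubicSpline(t[lower_idx], h[lower_idx])(t)
    return h - 0.5 * (upper + lower)


def emd(t, y, eps=0.25, max_sift=50, max_imfs=10):
    """Decompose y(t) into IMFs plus a residue, as in Eq. (1)."""
    residue = np.asarray(y, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        h = sift_once(t, residue)
        if h is None:
            break
        for _ in range(max_sift):
            h_new = sift_once(t, h)
            if h_new is None:
                break
            # Exit condition in the spirit of Eq. (2); 1e-12 is a numerical guard
            sigma = np.sum((h - h_new) ** 2 / (h ** 2 + 1e-12))
            h = h_new
            if sigma < eps:
                break
        imfs.append(h)
        residue = residue - h
    return np.array(imfs), residue


# Synthetic test signal: fast and slow oscillations plus a linear trend
t_demo = np.linspace(0.0, 10.0, 2000)
y_demo = np.sin(2 * np.pi * 3.0 * t_demo) + 0.5 * np.sin(2 * np.pi * 0.4 * t_demo) + 0.1 * t_demo
imfs_demo, residue_demo = emd(t_demo, y_demo)
```

By construction, the sum of the extracted modes and the residue reproduces the input signal, which is the completeness property of Eq. (1).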

In this way, a completely adaptive procedure is built, allowing us to derive embedded oscillations without assuming linearity and/or stationarity. The derived set of empirical modes \(\{c_k(t)\}\) satisfies the mathematical requirements of completeness, convergence, and local orthogonality by construction (Huang et al. 1998), while global orthogonality is guaranteed a posteriori, since \(\langle c_k, c_{k^\prime } \rangle = \delta _{kk^\prime }\), where \(\langle \dots \rangle\) is the scalar product between functions and \(\delta _{kk^\prime }\) is the Kronecker delta (e.g., Huang and Wu 2008).

Once the set of empirical modes has been derived, the Hilbert Transform (HT), i.e., the second step of the HHT, allows us to write each of them as a signal modulated both in amplitude and in frequency (e.g., Huang et al. 1998). Indeed, given an empirical mode \(c_k(t)\), we can define its Hilbert Transform \(\hat{c}_k(t)\) as

$$\begin{aligned} \hat{c}_k(t) = \frac{1}{\pi } \mathcal {P} \int _{-\infty }^{\infty } \frac{c_k(t')}{t - t'} dt', \end{aligned}$$
(3)

where \(\mathcal {P}\) is the Cauchy principal value. By introducing the complex signal

$$\begin{aligned} \zeta _k(t) = c_k(t) + i \, \hat{c}_k(t) = \alpha _k(t) e^{i \, \varphi _k(t)}, \end{aligned}$$
(4)

it follows

$$\begin{aligned} \alpha _k(t)= & {} \sqrt{c_k(t)^2 + \hat{c}_k(t)^2} \end{aligned}$$
(5)
$$\begin{aligned} \varphi _k(t)= & {} \tan ^{-1} \left[ \frac{\hat{c}_k(t)}{c_k(t)} \right] , \end{aligned}$$
(6)

where \(\alpha _k(t)\) and \(\varphi _k(t)\) are the instantaneous amplitude and phase of the \(k\)-th empirical mode, respectively. The instantaneous frequency is derived from the instantaneous phase as \(\omega _k(t) = \frac{1}{2 \pi }\frac{d \varphi _k(t)}{dt}\). Similarly, the mean time scale is \(\tau _k = \langle \omega _k^{-1}(t) \rangle _t\), with \(\langle \dots \rangle _t\) denoting the time average.
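Equations (4)–(6) and the definition of the mean time scale can be sketched numerically as follows. The analytic signal is built here with a one-sided FFT spectrum, which is a common discrete approximation of the Hilbert Transform and is an assumption of this example; the function names, the synthetic mode, and the monthly sampling step are likewise illustrative.

```python
import numpy as np


def analytic_signal(c):
    """Discrete analytic signal c + i*H[c], built from a one-sided FFT spectrum."""
    n = c.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(c) * weights)


def instantaneous_properties(c, dt):
    """Instantaneous amplitude, phase, and frequency of a mode sampled every dt."""
    zeta = analytic_signal(np.asarray(c, dtype=float))
    amplitude = np.abs(zeta)                          # Eq. (5)
    phase = np.unwrap(np.angle(zeta))                 # Eq. (6), quadrant-aware and unwrapped
    frequency = np.gradient(phase, dt) / (2 * np.pi)  # omega_k(t)
    return amplitude, phase, frequency


# Synthetic mode with an 11-year period, sampled monthly over 55 years
dt = 1.0 / 12.0
t = np.arange(660) * dt
mode = np.cos(2 * np.pi * t / 11.0)
amplitude, phase, frequency = instantaneous_properties(mode, dt)
mean_period = np.mean(1.0 / frequency)  # tau_k = <1/omega_k(t)>_t
```

For real, non-periodic modes the FFT-based transform suffers from edge effects, so in practice one would likely discard or taper the first and last samples before averaging.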

3.2 Transfer entropy

The notion of cause–effect is a delicate question when data from controlled experiments are not available. This is the case of complex systems in general: when dealing with a system whose complete set of dynamical variables is not known a priori and the state of the system is monitored by some indices (which work as proxies) derived empirically, correlation may be confused with causation.

Data-driven methods for studying the degree of causation have been developed in recent years. Generally, these methods are based on the notion of predictability, i.e., it is said that X drives Y if the knowledge of X’s past gives us information about Y’s future, but not vice versa. This type of causality is known as predictive causality, and it is restricted to only two variables, X and Y. Mathematically, the concept of predictive causality is expressed through conditional independence, i.e., it is reasonable to assume that X does not drive Y if

$$\begin{aligned} p(Y_t \vert \textbf{Y}_{t-1}^{(k)}; \textbf{X}_{t-\tau }^{(l)}) = p(Y_t \vert \textbf{Y}_{t-1}^{(k)}), \end{aligned}$$
(7)

where \(\textbf{X}_{t-\tau }^{(l)} = \left( X_{t-\tau },..., X_{t-\tau -l+1}\right)\), \(\textbf{Y}_{t-1}^{(k)} = \left( Y_{t-1},..., Y_{t-k}\right)\), \(p(\cdot )\) denotes the probability and \(\tau\) is a time lag. Therefore, to measure predictive causality, the idea is to test Eq. (7). One way to quantify the distance between the right-hand side (r.h.s.) and the left-hand side (l.h.s.) of Eq. (7) is to use the Kullback–Leibler divergence. In this case, testing Eq. (7) becomes equivalent to testing whether the expression

$$\begin{aligned} T_{X \rightarrow Y}^{(k,l)}(\tau ) = \sum _{Y_{t},\textbf{Y}_{t-1}^{(k)},\textbf{X}_{t-\tau }^{(l)}} p(Y_{t},\textbf{Y}_{t-1}^{(k)},\textbf{X}_{t-\tau }^{(l)})\log \frac{p(Y_{t}\vert \textbf{Y}_{t-1}^{(k)},\textbf{X}_{t-\tau }^{(l)})}{p(Y_{t}\vert \textbf{Y}_{t-1}^{(k)})} \end{aligned}$$
(8)

is different from zero (Schreiber 2000b). Equation (8) is known as transfer entropy (TE). Note that \(T_{X \rightarrow Y}^{(k,l)}(\tau =0) \ne T_{Y \rightarrow X}^{(k,l)}(\tau =0)\), i.e., the transfer entropy is asymmetric as expected from a measure of predictive causality (Schreiber 2000b).

In principle, to test causality, one needs to include in the l.h.s. of Eq. (7) all the information available in the universe at time \(t-1\), and in the r.h.s. all the information available in the universe with the exception of \(\textbf{X}_{t-\tau }^{(l)}\) (Pearl 2009). For controlled systems, all the variables influencing the measurements of Y’s state are assumed to be known and the transfer entropy becomes measurable. This naturally sets a limit to the application of such a causal inference technique. For example, when not all the relevant variables are known a priori, we can still compute Eq. (8), but it can be different from zero even though the interaction between X and Y is mediated by, e.g., a third variable Z (indirect causation; see, e.g., Bossomaier et al. 2016). Thus, predictive causality does not generally imply a true cause–effect relationship if the information on the whole set of relevant variables is not available. However, it can still be used to measure lags between variables and directional coupling (Wibral et al. 2013).

From a numerical point of view, a key question is whether or not the values found for the transfer entropy are statistically significant. In our case, the critical value of the transfer entropy \(T^{(*)}\) above which we can reject the null hypothesis is computed by generating surrogate time series that satisfy Eq. (7) and have the same statistical properties as X and Y. To achieve this, we create surrogate trials by randomly shuffling the time series \(X_{t-\tau }\). This allows us to estimate the distribution under the null hypothesis, to fix a confidence bound, and to find the critical value \(T^{(*)}\) adapted to our dataset. The values such that \(T_{X \rightarrow Y}^{(k,l)}(\tau ) > T^{(*)}\) are considered statistically significant.
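A compact sketch of how Eq. (8) and the surrogate test could be evaluated is given below. It assumes \(k = l = 1\), equal-width histogram bins, and a plug-in (frequency-count) estimate of the probabilities; the bin number, the number of surrogates, the synthetic coupled series, and all function names are illustrative assumptions rather than the settings of this work.

```python
import numpy as np


def discretise(v, bins):
    """Map values to integer labels 0..bins-1 using equal-width bins."""
    v = np.asarray(v, dtype=float)
    edges = np.linspace(v.min(), v.max(), bins + 1)
    return np.digitize(v, edges[1:-1])


def transfer_entropy(x, y, lag, bins=6):
    """Plug-in estimate of T_{X->Y}(lag) for k = l = 1, in nats (Eq. 8)."""
    xs, ys = discretise(x, bins), discretise(y, bins)
    start = max(lag, 1)
    a = ys[start:]                       # Y_t
    b = ys[start - 1:-1]                 # Y_{t-1}
    c = xs[start - lag:xs.size - lag]    # X_{t-lag}
    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (a, b, c), 1.0)
    p_abc = joint / joint.sum()
    p_ab = p_abc.sum(axis=2)             # p(Y_t, Y_{t-1})
    p_bc = p_abc.sum(axis=0)             # p(Y_{t-1}, X_{t-lag})
    p_b = p_abc.sum(axis=(0, 2))         # p(Y_{t-1})
    ia, ib, ic = np.nonzero(p_abc)
    p = p_abc[ia, ib, ic]
    ratio = p * p_b[ib] / (p_ab[ia, ib] * p_bc[ib, ic])
    return float(np.sum(p * np.log(ratio)))


def critical_value(x, y, lag, n_surrogates=200, level=99.0, bins=6, seed=0):
    """Percentile of the transfer entropy obtained after shuffling the driver X."""
    rng = np.random.default_rng(seed)
    null = [transfer_entropy(rng.permutation(x), y, lag, bins)
            for _ in range(n_surrogates)]
    return float(np.percentile(null, level))


# Synthetic example: Y follows X with a delay of 3 samples
rng = np.random.default_rng(1)
n = 2000
x_demo = rng.uniform(size=n)
y_demo = 0.2 * rng.uniform(size=n)
y_demo[3:] += 0.8 * x_demo[:-3]

te_forward = transfer_entropy(x_demo, y_demo, lag=3)
te_reverse = transfer_entropy(y_demo, x_demo, lag=3)
threshold = critical_value(x_demo, y_demo, lag=3)
```

Shuffling destroys the temporal ordering of the driver while preserving its amplitude distribution, so the surrogate values sample the estimator bias that remains when Eq. (7) holds.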

4 Results

The results of Empirical Mode Decomposition are shown in Fig. 2 for the Ca II K index, in Fig. 3 for the solar wind speed and in Fig. 4 for the solar wind dynamic pressure. The mode decomposition generates 6 IMFs for the Ca II K index, 7 IMFs for the solar wind speed, and 7 IMFs for the solar wind dynamic pressure.

Fig. 2 Empirical mode decomposition of Ca II K index. The top row shows the starting monthly means. The subsequent rows show the successive order IMFs, while the last row shows the residual signal (color figure online)

Fig. 3 Empirical mode decomposition of solar wind speed. The top row shows the starting monthly means. The subsequent rows show the successive order IMFs, while the last row shows the residual signal (color figure online)

Fig. 4 Empirical mode decomposition of solar wind dynamic pressure. The top row shows the starting monthly means. The subsequent rows show the successive order IMFs, while the last row shows the residual signal (color figure online)

The characteristic time scales (or the average period) of the IMFs obtained with the EMD are shown, for each signal, in Table 2.

Table 2 Characteristic time scales of the extracted IMFs for each signal

The Ca II K index, as expected, displays an intrinsic component (IMF 5) related to the 11-year solar activity cycle, in particular with a mean period of 12.4 years. Also, the solar wind speed and dynamic pressure display a component at solar cycle time scales. For both parameters, it is the IMF 6, corresponding to a mean period of 11.2 years for the speed and 13.9 years for the dynamic pressure. It is interesting to notice that for solar wind speed and dynamic pressure, we obtain a mode with a quasi-biennial periodicity (IMF 4 in both cases), while this is not true for Ca II K index.

To understand which IMF contributes most to the overall variability of the signal, it is possible to compute a weighted variance for each IMF as follows. The variance of each IMF (\(\mathrm {\sigma ^{2}_{IMF}}\)) is normalized to that of the total signal (\(\mathrm {\sigma ^{2}_{IMF_{tot}}}\)), obtained by summing the contribution of all the IMFs but excluding the residual term, and plotted as a function of the mean period of the IMF itself. These values are shown for the Ca II K index, the solar wind speed, and the solar wind dynamic pressure in Fig. 5. We can use this information to quantify the contribution of each IMF to the overall observed behaviour. In our analysis, we focus on the IMFs with a mean period over 1 year. In the case of the Ca II K index (left panel of Fig. 5), the greatest weighted variance is from IMF 5, the one related to the 11-year cycle. In the case of the solar wind dynamic pressure (right panel of Fig. 5), the major contribution is from IMF 6, once again the one corresponding to the solar cycle time scales. The same is true for the solar wind speed (central panel of Fig. 5), for which, above the yearly time scale, the highest contribution is from IMF 6.
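The weighted variance described above can be sketched as follows. The example assumes that the total variance is the variance of the signal rebuilt by summing all IMFs (residue excluded); for mutually orthogonal modes this coincides with the sum of the individual variances. The function name and the synthetic modes are illustrative.

```python
import numpy as np


def weighted_variances(imfs):
    """Variance of each IMF divided by the variance of the sum of all IMFs."""
    imfs = np.asarray(imfs, dtype=float)
    total_variance = np.var(imfs.sum(axis=0))  # residue is not included
    return np.var(imfs, axis=1) / total_variance


# Two synthetic orthogonal modes with variances 0.5 and 2.0
t = np.arange(1000) / 1000.0
imf_fast = np.sin(2 * np.pi * 5 * t)
imf_slow = 2.0 * np.sin(2 * np.pi * 1 * t)
weights = weighted_variances([imf_fast, imf_slow])
```

In this example the slow mode carries 80% of the variance, so it would appear as the dominant point in a plot like Fig. 5.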

Fig. 5 IMF variance weighted on the IMFs total variance as a function of the IMF mean period. The subplots are for Ca II K index (left), solar wind speed (center), and solar wind dynamic pressure (right) (color figure online)

To further investigate the results of the decomposition, it is possible to look at how the power is distributed among the IMFs, searching for possible power laws. This can be done by plotting, for each IMF, the variance (\(\mathrm {\sigma ^{2}_{IMF}}\)) multiplied by the corresponding mean period (\(\mathrm {P_{IMF}}\)), as a function of the mean period of the IMF itself (log–log scale). This quantity is the analogue of a spectral density. The results are shown for the Ca II K index, solar wind speed, and dynamic pressure in Fig. 6. In all the signals, a power law is present, characterized by increasing intensity from yearly to solar cycle scales and thus extending over at least one decade. Although the power law is clearly evident in this time range, the behaviour at very high and very low frequencies remains uncertain due to the limited number of data points available in the plots.

Fig. 6 IMF variance multiplied by the mean period as a function of the IMF mean period (log–log scale). The subplots are for Ca II K index (left), solar wind speed (center), and solar wind dynamic pressure (right) (color figure online)

Once the signals have been decomposed via EMD as shown above, it becomes possible to filter the time series of the three parameters by means of the obtained IMFs. To this end, we subtracted from the monthly time series of the Ca II K index, solar wind speed, and dynamic pressure the contribution of the IMFs with mean periods smaller than 3 years. This criterion is chosen for consistency with the 37-month filtering previously applied in Reda et al. (2023b), so that the results can be compared. In particular, for the Ca II K index, the contribution from IMFs 1–3 has been subtracted, while for both solar wind parameters, we subtract IMFs 1–4. The comparison between monthly means, 37-month averages, and IMF-filtered data for the three signals is shown in Fig. 7. As can be seen, the signals obtained by filtering out the high-frequency IMFs are consistent with the 37-month moving averages, but they follow the behaviour of the monthly means more closely. Indeed, the signals filtered by means of the IMFs retain more information, such as the double solar cycle peak visible in the Ca II K index (top panel of Fig. 7).
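The filtering step amounts to subtracting from the original series every IMF whose mean period falls below the chosen cutoff. A minimal sketch, with hypothetical function and variable names and synthetic modes standing in for the real IMFs, could read:

```python
import numpy as np


def remove_fast_imfs(signal, imfs, mean_periods_yr, cutoff_yr=3.0):
    """Subtract from the signal all IMFs whose mean period is below the cutoff."""
    imfs = np.asarray(imfs, dtype=float)
    fast = np.asarray(mean_periods_yr, dtype=float) < cutoff_yr
    return np.asarray(signal, dtype=float) - imfs[fast].sum(axis=0)


# Synthetic monthly series built from three modes (periods in years)
t = np.arange(670) / 12.0
semiannual = 0.3 * np.sin(2 * np.pi * t / 0.5)
quasi_biennial = 0.2 * np.sin(2 * np.pi * t / 2.2)
solar_cycle = np.sin(2 * np.pi * t / 11.0)
signal = semiannual + quasi_biennial + solar_cycle

filtered = remove_fast_imfs(
    signal,
    [semiannual, quasi_biennial, solar_cycle],
    mean_periods_yr=[0.5, 2.2, 11.0],
)
```

Because the residue and the slow IMFs are left untouched, the filtered series keeps the long-term trend together with the solar cycle scale variability.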

Fig. 7 Monthly averages with superimposed 37-month moving averages and IMFs filtered signals for Ca II K index (top), solar wind speed (middle), and solar wind dynamic pressure (bottom) (color figure online)

Fig. 8 Cross-correlation of the IMFs filtered signals. a Cross-correlation between Ca II K index and solar wind speed; b comparison of Ca II K index (green) with solar wind speed (red) shifted backward by 3.1 years; c cross-correlation between Ca II K index and solar wind dynamic pressure; d comparison of Ca II K index with solar wind dynamic pressure (blue) shifted backward by 3.4 years (color figure online)

Fig. 9 Scatter plot showing the relationship of Ca II K index with solar wind speed (left panel) and solar wind dynamic pressure (right panel) once shifted by the time lags found with the cross-correlation analysis. In both panels, the color map shows how the relation changes with time, while the black line shows the best linear fit to the data points. The Pearson correlation coefficients are reported on the upper left (color figure online)

Following the analysis performed in Reda et al. (2023b), once the high-frequency components of the signals have been filtered out, it is possible to investigate the time lag between them. To this end, we use a cross-correlation analysis considering only positive delays of the solar wind parameters with respect to the activity of the Sun (represented here by the Ca II K index). The results of the cross-correlation analysis are shown in Fig. 8. The maximum correlation between the Ca II K index and solar wind speed occurs at a time lag of \(3.1 \pm 0.1\) yr, in agreement with the result of \(3.2 \pm 0.1\) yr found in Reda et al. (2023b) using 37-month averaged data. The maximum correlation of the Ca II K index with solar wind dynamic pressure, instead, is found for a time lag of \(3.4 \pm 0.1\) yr. This value is in agreement, within the confidence intervals, with the values found in Reda et al. (2023b) with cross-correlation (\(3.6 \pm 0.1\) yr) and mutual information (\(3.4 \pm 0.1\) yr). These findings strengthen the analysis, as the results do not depend on the technique adopted to filter out the high-frequency components.
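The lag search can be sketched as a Pearson correlation evaluated over positive delays only. The function name, the maximum lag, and the synthetic series (a response that copies the driver 37 months later) are assumptions made for illustration.

```python
import numpy as np


def lagged_correlation(driver, response, max_lag):
    """Pearson correlation between driver(t) and response(t + lag) for lag >= 0."""
    driver = np.asarray(driver, dtype=float)
    response = np.asarray(response, dtype=float)
    lags = np.arange(0, max_lag + 1)
    r = np.empty(lags.size)
    for i, lag in enumerate(lags):
        a = driver[:driver.size - lag]  # driver values at time t
        b = response[lag:]              # response values at time t + lag
        r[i] = np.corrcoef(a, b)[0, 1]
    return lags, r


# Synthetic monthly series: the response repeats the driver 37 months later
months = np.arange(670)
driver = np.sin(2 * np.pi * months / 132.0)
response = np.sin(2 * np.pi * (months - 37) / 132.0)

lags, r = lagged_correlation(driver, response, max_lag=96)
best_lag_months = int(lags[np.argmax(r)])
best_lag_years = best_lag_months / 12.0
```

The maximum lag is kept below one full period of the synthetic cycle here, since a strictly periodic signal would otherwise produce an equally high correlation one period later.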

The scatter plots of Fig. 9 show the relation of the Ca II K index with solar wind speed and solar wind dynamic pressure, respectively, once the time lags from the cross-correlation analysis are taken into account. In both panels, the black lines show the best linear fits to the data points. The Pearson correlation coefficient is r = 0.57 for the speed and r = 0.56 for the dynamic pressure, indicating in both cases a moderate positive correlation. For the dynamic pressure, the correlation coefficient is almost equal to that found in Reda et al. (2023b) using 37-month averaged data (r = 0.57), while for the solar wind speed it is smaller than the value they found (r = 0.65).

To assess the results of the present analysis within a stronger framework, we compute the transfer entropy by directly employing the monthly averaged data, without applying any filter. This approach allows us to investigate higher order correlations between data, i.e., predictive causality as explained in the previous section. In Fig. 10, we show the information flow, as measured by the transfer entropy, from the Ca II K index to \(\mathrm {P_{d,sw}}\) (top-left panel) and vice versa (top-right panel). In both cases, the purple line shows the empirical 99% threshold arising from the analysis of 500 surrogate time series (see Sect. 3.2). The information flow from the Ca II K index to \(\mathrm {P_{d,sw}}\) exhibits a statistically significant enhancement between \(\sim\)25 and \(\sim\)50 months. The maximum of the transfer entropy is at 43 months (\(\simeq\) 3.6 years), while the midpoint of the interval (assuming a symmetric peak) is at 37.5 months (\(\simeq\) 3.1 years). Both lags are comparable with the results of the cross-correlation analysis in this work and in agreement with the findings of Reda et al. (2023b). This result highlights that the correlation found is not simply due to the synchronization of the time series peaks: there is a predictive link between the Ca II K index and the solar wind dynamic pressure, suggesting a certain degree of causation.

On the other hand, the information flow in the reversed case (top-right panel of Fig. 10) is always below the 99% threshold with the exception of a peak at 71 months (\(\simeq\) 5.9 years; comparable with the time between solar maximum and solar minimum). We interpret this finding as due to redundancies/periodicities induced by the solar cycle. To test this hypothesis quantitatively, Eq. (8) should in principle be computed using \(k>1\). However, this is not reliable with only 670 data points.

The bottom panels of Fig. 10 show the results of the transfer entropy analysis for the solar wind speed. In both directions, i.e., from the Ca II K index to \(\mathrm {V_{sw}}\) (bottom-left panel) and vice versa (bottom-right panel), there is no strong evidence of information flow. The exceedances of the threshold at lags of 37 months (\(\simeq\) 3.1 years) and 60 months (\(\simeq\) 5.0 years) from Ca II K to \(\mathrm {V_{sw}}\), and at 78 months (\(\simeq\) 6.5 years) from \(\mathrm {V_{sw}}\) to Ca II K, are here interpreted as fluctuations due to sampling effects. However, the peak at 3.1 years is consistent with the structures found in the analysis of solar wind dynamic pressure and with the results recently found in Reda et al. (2023b).

Note that, in general, the results of the transfer entropy analysis are noisy. This is because we have only 670 data points, so that the estimation of high-dimensional transition probabilities is prone to fluctuations. To mitigate this effect, the transition probabilities must be estimated with a reduced number of bins. In our case, the best trade-off between correct sampling and resolution (i.e., having filled bins) is to choose fewer than 10 bins per dimension. Aware of this technical problem, we interpret threshold exceedances by single-point structures as fluctuations that are not statistically significant.

Fig. 10 Transfer entropy from Ca II K index to solar wind dynamic pressure and speed (top- and bottom-left panels, respectively) and vice versa (top- and bottom-right panels, respectively), computed using monthly averaged data. The purple line marks the 99% significance threshold obtained from the analysis of surrogate time series data (color figure online)

5 Discussion and conclusions

Starting from the monthly averages of a physical proxy of solar activity (the Ca II K index) and of solar wind parameters, we investigate in this work their relationship on Space Climate scales. To this end, we take advantage of the Hilbert–Huang Transform. This method allows us to decompose the starting signals into several modes and to obtain for each of them the instantaneous frequency and hence the mean characteristic time scale. Looking at how the energy is distributed among the time scales, we find quite similar behaviour for all the signals between annual and solar cycle scales, characterized by increasing power. Concerning the behaviour at shorter and longer time scales, instead, it is not possible to draw conclusions here.

The advantage of the HHT is that the EMD makes it possible to filter out the noisy high-frequency components, which are not of interest for the purpose of this work. The time lags we find between the Ca II K index and both solar wind speed (3.1 years) and dynamic pressure (3.4 years), after subtracting the contribution at scales smaller than 3 years, are consistent with the results previously obtained by Reda et al. (2023b) using a 37-month smoothing on the same dataset.

However, the presence of a correlation with a time delay does not ensure a cause–effect relation, just as the presence of a mutual information peak does not establish the nature of the directional coupling. For this reason, to further investigate these results, we use the transfer entropy as a predictive causality test for higher order correlations between data. The results of the transfer entropy analysis suggest the presence of statistically significant structures from the Ca II K index to solar wind dynamic pressure, with a peak at a time lag of 3.6 years, once again in agreement with the time lag found by Reda et al. (2023b). This finding confirms that the knowledge of past values of the Ca II K index gives information about the future state of the solar wind dynamic pressure. Indeed, the cross-correlation analysis is based on the co-variation of data and is naturally stronger when peaks are synchronized, while the transfer entropy is based on (temporal) transition probabilities between states and is a dynamical and time-asymmetric concept. As reported by Wibral et al. (2013), the time delays found from cross-correlation analysis and transfer entropy analysis are not necessarily the same.

Considering the information flow from the Ca II K index to solar wind speed, the transfer entropy shows a single peak that exceeds the 99% threshold, at a time lag of 3.1 years. As in the case of the dynamic pressure, this result is consistent with that recently found by Reda et al. (2023b). However, because it is a single peak, it may also be interpreted as a fluctuation.

Our results suggest that over the last five solar cycles, there is a stronger information flow from the Ca II K index to solar wind dynamic pressure than from the Ca II K index to solar wind speed. Since the dynamic pressure depends on both the speed and the density of the solar wind, which makes it an energy-related parameter, we interpret this result as a phenomenon connected to energy transfer processes from the Sun to the heliosphere. This result could be of interest for building a predictive model in a Space Climate context.

We remark that, although the results found with the transfer entropy analysis are in agreement with the previous findings, further work is needed to assess the causal relationship. Indeed, due to technical limitations (e.g., few data points), we are not able to investigate thoroughly the statistical significance of our results. However, we emphasize that, at the moment, the application of the transfer entropy is promising and may be extremely helpful in the future to disentangle the (non-linear) causal relations between solar activity and the solar wind at Space Climate scales. Datasets with a higher time resolution, and thus with more data points, will be considered in a future analysis to verify this hypothesis and to confirm the result obtained via the transfer entropy presented in this work.