1 Introduction

The solar surface covers a mix of open and closed magnetic field structures. Predominantly “open” magnetic fields, i.e. closing far in interplanetary space, are observed within coronal-hole (CH) regions. In extreme-ultraviolet (EUV) radiation, these regions appear as dark areas having a reduced density and temperature. Along the open magnetic field, fast solar-wind streams are emanating into interplanetary space and with that, CHs are the sources of high-speed solar-wind streams (HSSs). The interaction between the fast and the surrounding slow solar-wind generates at the western part of the CH large-scale compression structures, so-called stream interaction regions (SIRs). The stream interfaces consist of a turbulent and enhanced magnetic field, and increased density that is followed by the high-speed solar-wind stream (Gosling and Pizzo, 1999). At the stream interface a forward pressure wave is generated followed by the trailing edge representing a reverse pressure wave (for more details see e.g., the review by Cranmer, Gibson, and Riley, 2017). When observed for more than one solar rotation, SIRs are called co-rotating interaction regions (CIRs). Complementary to coronal mass ejections, generating the most strong geomagnetic storms, HSSs/SIRs/CIRs cause minor to moderate geomagnetic activity (e.g., Baker et al., 1981). As CHs are long-lived structures, CIRs and their following high-speed streams may cause recurrent geomagnetic effects on Earth and other planets in our solar system with a periodicity of approximately 27 days. The total energy input of CIRs when sweeping over the Earth’s magnetosphere might be comparable to a single CME event, and therefore CIRs are of interest when studying solar-terrestrial physics (Tsurutani et al., 2006).

From statistical analyses it is found that the area of low-density CH regions is well correlated with the solar-wind speed measured at Earth (e.g., Nolte et al., 1976). This might be explained by the magnetic-field flux tube expansion, where the amount of open field that is distributed over a certain area relates to the acceleration of the solar wind (Wang and Sheeley, 1990), or it is a result of the propagation phase of the HSS from Sun to Earth and its interaction with the SIR (Hofmeister et al., 2020). After a CH passes the central region of the solar disk, on average about 4 days later (Vršnak, Temmer, and Veronig, 2007; Rotter et al., 2012), a CIR and HSS may be observed. The CH area is extracted from EUV image data using simple thresholding and the established CH area-speed relation can be used as an empirical model to forecast solar-wind speeds at 1 AU (Vršnak, Temmer, and Veronig, 2007). In the past years, the area-speed relation was revised in several studies and the so-called “empirical solar wind forecasting” model (ESWF) could be further developed (Rotter et al., 2012, 2015; Reiss et al., 2016). Currently, the ESWF exists in its version 2.0 and is an ESA space weather service provided by University of Graz within the Heliospheric Weather Expert Service Center (H-ESC).Footnote 1

Recent studies revealed that there is room for improvement. Strong relations were found between the rise/fall of the in-situ measured high-speed solar-wind streams and the width of a coronal-hole (Garton, Gallagher, and Murray, 2018a). Superposed epoch analysis has shown that the median time-velocity profile of a CIR consists of a sharp increase followed by longer recovery phase (Geyer et al., 2021). Differing propagation times of the different streams have not yet been implemented in the model. Indeed, from real-time monitoring of the ESWF service and post-event analysis, it was recognized that the modeled solar-wind speed profile is too gradual as compression and rarefaction effects are not considered by the smooth increase and decrease in the observed evolution of the coronal-hole.

ESWF 2.0 uses a constant threshold of 35% of the median disk intensity to identify the boundaries of CHs. Other algorithms are available as well for CH detection such as, e.g., CHIMERA, which uses multiple wavelength data (Garton, Gallagher, and Murray, 2018b), the coronal hole image segmentation code module (eszeg: Caplan, Downs, and Linker, 2016), which is based on a region growing algorithm, or CHRONNOS, which is an automatic CH detection algorithm using convolutional neural networks on multi-band EUV imagery (Jarolim et al., 2021). In addition, within the ESWF EUV image data processing pipeline a dynamic thresholding algorithm has been developed using the CATCH tool (Heinemann et al., 2019).

In general, the improvement of ESWF can be measured against results from statistical studies. Reiss et al. (2016) compared the uncertainties in the arrival time and speed between ESWF 2.0 and the semi-empirical Wang-Sheeley-Arge model (WSA; Wang and Sheeley, 1990; Arge and Pizzo, 2000; Arge et al., 2003) and found typical uncertainties in the range of ± 48 hours and ± 100 km s−1. From a variation of persistence solar-wind models (assuming that the stream does not change in time; see e.g., Owens et al., 2008; Temmer, Hinterreiter, and Reiss, 2018), the uncertainties are within a similar range of 80 – 100 km s−1 and 1 – 2 days off. Recent statistics assessing the performance of numerical magnetohydrodynamic (MHD) models, such as EUHFORIA by Hinterreiter et al. (2019), derived an RMSE of 125 km s−1. Other papers investigating the solar-wind model performance report speed uncertainties in the range of 100 – 150 km s−1 and for the arrival time up to 3 days (MacNeice, 2009; Jian et al., 2015). Moreover, we also know that the performance of MHD models decreases significantly for periods of high solar activity (Gressl et al., 2014). In that respect, we also will have a closer look at the ESWF performance stability of the new developments and improvements over the solar cycle.

The investigation of the solar-wind flow is essential when studying space weather, as the background solar-wind is the backbone of CME propagation models. The improvement of space weather related models is of high interest and needs basic research combined with applications (e.g., Temmer, 2021). The accurate modeling of the solar-wind flow in interplanetary space is identified as a crucial parameter for CME propagation models, e.g., in the COSPAR Roadmap (Schrijver et al., 2015). Though many studies were outlined on that subject, there was no clear improvement and, hence, still there is a requirement of a more accurate solar-wind flow model, as mentioned in the 2023 COSPAR Roadmap update under the iSWAT initiative.

2 Methods

ESWF is based on the ratio of CH areas that appear within a slice of ±7.5 degrees around the central region of the Sun (depicted in Figure 1). Using the empirical CH area – SW speed relation (Vršnak, Temmer, and Veronig, 2007), the ratio of CH areas can be related to the solar-wind speed measured at 1 AU about four days later. As the CH rotates with the Sun, the fractional area begins to rise and reaches its peak approximately when the center of mass of the CH is on the central meridian. This process then reverses as the CH moves out of the slice again. The larger the area detected within that slice, the higher the solar-wind speed forecast. The basic ESWF version is calculated by

$$ v(t) = c_{0} + c_{1} A(t-t^{*}), \ $$
(1)

where \(v\) is the solar-wind velocity for the distance at 1 AU, \(c_{0}\) and \(c_{1}\) (in units of km s−1) are the coefficients of the linear relation, \(A\) is the fractional CH area (relative CH area measured within the meridional slice), and \(t-t^{*}\) is the retarded time with the time-lag, \(t^{*}\), in days.

Figure 1
figure 1

Cartoon showing the symmetric (Gaussian-like) area-time relation for an idealized CH rotating into the measurement slice at the central part of the Sun and exiting it again. As the CH moves to the central position, the maximum area, hence, forecasted solar-wind speed at 1 AU is reached. The red line at the Western part illustrates the maximum compression, that is the stream interface region. The compression and rarefaction are not covered by the symmetric area profile as generated by ESWF 2.0.

Since the establishment of the empirical CH area – SW speed relation, different development steps were performed. ESWF 1.0 and 2.0 assume a constant time lag of four days (\(t^{*} = 4\)d). With ESWF 2.0, \(c_{0}\) and \(c_{1}\) have been recalculated to derive an adaptive method for the varying low speed component of the background solar-wind, as described in Rotter et al. (2015) and also used in Reiss et al. (2016, their Equation 2).

With these developments, the ESWF versions 1.0 and 2.0 reveal a rather symmetric profile in the solar-wind speed forecast (see Figure 1), leading to delayed peak forecasts. To improve, compression and rarefaction effects due to the interaction between slow and high-speed solar-wind streams appearing on the West part of the CH need to be considered. In addition, we aim to implement recent results derived from advanced analyses of CHs, such as: a) dynamic intensity thresholds for the CH segmentation (Heinemann et al., 2019) and b) the dependence of the HSS peak velocity to the co-latitude of the CH location (Hofmeister et al., 2018) – both useful for better covering solar cycle variations.

In the following, we present for each aspect: 1) compression and rarefaction, 2) dynamic thresholding, and 3) co-latitude, the development steps performed and summarize here the different versions of the ESWF, which will be described.

  • ESWF 1.0 – constant time lag of 4 days; constant \(c_{0}\) and \(c_{1}\); described in Vršnak, Temmer, and Veronig (2007);

  • ESWF 2.0 – constant time lag of 4 days; variable \(c_{0}\) and \(c_{1}\); described in Rotter et al. (2015), Reiss et al. (2016);

  • ESWF 3.0 – PROP; adds non-constant time lag to mimic compression effects; each plasma parcel propagates to 1 AU with the predicted velocity; variable \(c_{0}\) and \(c_{1}\);

  • ESWF 3.1 – PILE; similar to PROP adding non-constant time lag to mimic compression effects; each plasma parcel propagates to 1 AU with the mean value of the predicted velocity and a constant background velocity of 400 km s−1; variable \(c_{0}\) and \(c_{1}\);

  • ESWF 3.2 – PILE; CH area extraction with dynamic thresholding; co-latitude information; variable \(c_{0}\) and \(c_{1}\).

2.1 Compression and Rarefaction

The Gaussian profile, as illustrated in Figure 1, shows the symmetric speed profile stemming from the CH entering and exiting the slice over which the ratio of the area is calculated. This, however, is not capable of mimicking compression effects. To better match the steepening of the HSS as it encounters the slow solar-wind flow ahead, two different algorithms are developed. The propagation-algorithm (PROP) takes each data point and predicts its arrival time based on its velocity from Equation 1 and the Sun-Earth distance. These data are then shifted to the calculated arrival time. The time-lag, \(t^{*}\), is now calculated as

$$ t^{*}_{\mathrm{PROP}}= \frac{1.5 \times10^{8}~[\text{km]}}{c_{0} + c_{1} A(t)}. $$
(2)

In order to avoid plasma parcels overtaking others, the affected points are compressed using a linear scale. Using the PROP algorithm, the stream interaction is only modeled in the way that faster streams displace slower ones in the time series.

The pile-up-algorithm (PILE) works in the same way, but the arrival time is calculated slightly differently, taking into account a simple model for the stream interaction. Instead of solely basing the propagation on the velocity value of each individual data point, the PILE algorithm takes the mean value of the data point and the background solar wind, given by

$$ t^{*}_{\mathrm{PILE}}= \frac{1.5 \times10^{8}~[\text{km}]}{((c_{0} + c_{1} A(t))+400~[\text{km}\,\text{s}^{-1}])/2}. $$
(3)

The background solar-wind speed is assumed to be constant at 400 km s−1, which corresponds roughly to the mean background solar-wind speed in our time interval of interest (2012 – 2021). The assumption is that an HSS engages in an inelastic collision with the surrounding matter, conserving momentum (see also Hofmeister et al., 2022). The simple calculation of the mean implies another assumption, namely that the density distribution close to the Sun is uniform.

In Figure 2 we show the resulting synthetic velocity profiles for the different algorithms, which represent each a typical profile produced by the ESWF having a background velocity (\(c_{0}\)) of 350 km s−1. We see that indeed, in comparison to the Gaussian profile, both algorithms produce an asymmetrical profile, like we would expect from an average HSS (Geyer et al., 2021; Dumbović et al., 2022). The PROP algorithm produces a very steep rise of the curve and also shifts the peak to earlier arrival times than the PILE algorithm. Although producing a similar rise, the PILE algorithm is more gentle, being slowed down by the slow solar-wind ahead and thus mimicking the interaction region between the streams.

Figure 2
figure 2

Different algorithms for modeling the propagation of the solar-wind to 1 AU. The PROP algorithm assumes a displacement of slow solar wind parcels by the fast ones, while the PILE algorithm works with piling up streams of constant density. For more details, see text.

2.2 Dynamic Thresholding

Using image data (193 Å) from the Atmospheric Imaging Assembly (AIA: Lemen et al., 2012) on the Solar Dynamics Observatory (SDO), Heinemann et al. (2019) found that, for optimal extraction of CH areas, the threshold intensity is dependent on the phase of the solar cycle. Thus, a constant threshold intensity as a percentage of the median intensity of the solar disk is not suitable for longer periods. Whereas the ESWF 2.0 still uses 35% of the median intensity as a constant threshold of the solar disk, we here introduce the dynamic thresholding technique derived from the statistical analysis of the CATCH coronal-hole catalog (Heinemann et al., 2019):

$$ \mathrm{TH = 0.29 \times I_{m} + 11.53~[DN] \, }, $$
(4)

where \(I_{\mathrm{m}}\) is the median intensity and TH is the threshold intensity in DN (Data Number).

2.3 Co-latitude Dependence

Hofmeister et al. (2018) derived, from a data set of well-observed low-latitude CHs during the period 2010 – 2017, a relation between the maximum solar-wind speed measured on Earth and the co-latitude of the center of mass of the CH and its area. To implement that, the meridional slice (±7.5°) is cut horizontally into 12 evenly distributed segments before CH area extraction, each covering a latitudinal range of 15. The left panel of Figure 3 exemplarily shows an EUV image recorded by the AIA/SDO 193 Å filter, along with the meridional slice divided into 12 latitudinal segments. Each segment is assigned a fractional CH area with respect to the whole slice area. Then these values are given weights according to the co-latitude to Earth. Above a co-latitude of \(60^{\circ}\) (in both hemispheres), the weights become zero. At the other ranges, they are weighted with a function depending on the cosine of the co-latitude. The right panel in Figure 3 shows the weights that are applied to the fractional CH area, which is shown in the middle panel of Figure 3. The EUV image was taken on February 12, 2015, at which the B0 angle was negative with −6.7°, meaning the northern hemisphere of the Sun was tilted away from the observer. Thus, close to B0, that is for the lower latitudes of the Southern Hemisphere, the CH areas are weighted more.

Figure 3
figure 3

Illustration of CH area segmentation and weighting by co-latitude. Left: SDO/AIA EUV image at 193 Å showing the meridional slice and the 12 latitudinal segments. Middle: fractional CH area derived for each segment. Right: statistical weights to be applied to each segment.

3 Statistical Analysis

To test the various new implementations and resulting ESWF versions 3.0 – 3.2, we use a statistical sample of 294 well-observed high-speed streams over the time range 2012 – 2021. The list of HSSs was manually selected, compiled and investigated by Geyer et al. (2021) with enhancements by Koller et al. (2022). The given times were extracted from the solar-wind measurements such that each HSS is embedded in a time period of seven days, with the velocity peak appearing 3 days after the start of that time period.

For the comparison between different ESWF versions, we focus on the parameters arrival time and arrival speed of the HSS at 1 AU. We perform hit & miss statistics where we use the following criteria. As a hit we define that the modeled HSS peak speed is within a range of ± 100 km s−1 and that the arrival time of that peak is within a range of ± 48 hours. In addition, we also define a separate velocity and time hit, which gives us information whether one parameter performs better than the other. Modeled profiles that do not meet these criteria are defined as miss. These criteria are consistent with RMSE values derived from earlier studies including MHD models (Owens et al., 2008; MacNeice, 2009). As we compare a specific sample of data with verified events of HSSs, we do not cover false positives or false negatives. In that respect, we investigate the improvement of the hit statistics for the different model development steps applying the specified value ranges. Furthermore, we calculate for the arrival time and speed the mean absolute error (MAE) given by

$$ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |v_{i}^{\mathrm {{model}}}-v_{i}^{\mathrm{{ref}}}| $$

and the root mean squared error (RMSE) given by

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (v_{i}^{\mathrm {{model}}}-v_{i}^{ \mathrm{{ref}}})^{2} }. $$

Here \(n\) is the amount of data points, \(v_{i}^{\mathrm{{model}}}\) is the \(i\)th HSS peak velocity extracted from the model, and \(v_{i}^{\mathrm{{ref}}}\) is the \(i\)th HSS peak velocity from the in-situ data. For estimating the match of the entire modeled HSS profile, we derive the Pearson correlation coefficient (median value of the 294 correlation coefficients derived) and the Euclidean distance (sum of differences between each point of a time series, normalized for one week; median value for the 294 time series). The aforementioned statistical tools were chosen with the intent to be comparable to studies of different models and previous studies on the ESWF (see, e.g. Rotter et al., 2015; Reiss et al., 2016). In addition, we make use of dynamic time warping (DTW). Recently the usage of DTW in quantifying the performance of solar-wind models has been explored (Samara et al., 2022a). Here, we use DTW in a simple form, by calculating the normalized DTW cost for each HSS and then comparing the median values. DTW is especially useful for our analysis, since one of the main objectives is to properly model the shape of an HSS, and DTW can give us information about the similarity of two curves. The DTW cost matrix is defined as

$$ D_{ij}(v_{\mathrm{{Model}}}, v_{\mathrm{{Ref}}}) = d_{ij} + \mathrm {min}(D_{i-1,j-1}, D_{i,j-1}, D_{i-1,j}) $$

with \(d_{ij}\) the Euclidean distance, which is defined as

$$ d_{ij}(v_{\mathrm{{Model}}}, v_{\mathrm{{Ref}}}) = \sqrt{ (v_{\mathrm{{Model}}}(i) - v_{ \mathrm{{Ref}}}(j))^{2}+(i-j)^{2} }. $$

After constructing an array containing the cost between each two elements of the time series we find the path with the minimum Euclidean distances between adjacent elements. The sum of all Euclidean distances along the optimal path is then normalized by the amount of data points per time series. For implementation we use the python package openly provided by Giorgino (2009); see also Niennattrakul, Ruengronghirunya, and Ratanamahatana (2009) and Samara et al. (2022a).

In addition to performing these methods on the entire data set (2012 – 2021), we split the data set into two subsamples, considering periods of high (110 HSSs) and low solar activity (184 HSSs) separately. The year 2016 is used as separation time between high and low activity and reveals half the peak-to-peak difference between the Solar Cycle 24 maximum phase and the minimum towards the maximum phase of Solar Cycle 25.

4 Results

We performed the statistical analysis described in Section 3 on the output of each ESWF version (1.0, 2.0, 3.0, 3.1, 3.2). The main results are shown in Figure 4 and summarized in Table 1.

Figure 4
figure 4

Statistics for all versions of the ESWF. Red columns represent previous versions, while purple ones show the versions first introduced here. (a) The percentage of hits for each version of the ESWF. (b) The RMSE for peak velocities (left axis) and timing (right axis). (c) The MAE for peak velocities (left axis) and timing (right axis). (d) The Pearson correlation coefficient [cc] within 1 week of each HSS. (e) The median Euclidean distance within 1 week of each HSS. (f) The median value of the normalized DTW-cost within 1 week of each HSS.

Table 1 Performance of two different versions of the ESWF for the entire sample of HSSs (full) and separately for the events occurring prior to 2016 (Min) and after 2016 (Max). The hit statistics use the criterion that the modeled HSS peak velocity is within ±100 km s−1 and the HSS peak arrival time is within 48 hours of the observed ones. In addition, we give separate statistics for either a hit in terms of peak velocity or arrival (two bottom columns).

Panel (a) of Figure 4 shows the percentage of HSS that meet the criteria of a hit in the forecast in terms of the peak velocities and arrival times. Panels (b) and (c) show the RMSE and the MAE for the HSS arrival times and peak velocities, respectively. While the hit percentage increases, the errors steadily decrease as the versions improve. Panel (d) shows the correlation coefficient between model and measurements for the different ESWF versions. The only significant positive difference occurs in versions with the PILE-algorithm, whereas the ESWF 3.1 performs slightly better than 3.0. Panel (e) shows the median Euclidean distance for each version of the ESWF during a HSS. We see that the distance decreases for every consecutive version, which can be interpreted as improvements. Panel (f) shows the median normalized DTW-score for each version. For each HSS, the DTW-score/cost has been calculated and normalized by the number of data points, and the median of these values is plotted. Since the DTW cost is an inverse measure of the similarity of two time series, lower cost means higher similarity. Note that these numbers are only to be interpreted relative to each other. One DTW score alone gives no clue about the goodness of a model. We see that the similarity improves for most consecutive versions and is best for the ESWF 3.2.

To better understand the improvements as derived from the point-to-point statistics, we inspect three time series for which the HSS profiles are modeled by the ESWF 2.0 and 3.2. Figure 5 shows the in-situ measured solar-wind speed using OMNI data for periods during three different solar cycle phases, namely (a) the solar maximum phase in 2012, (b) the declining phase in 2016, and (c) the minimum phase in 2019, together with the model results (top panel ESWF 2.0 and bottom panel ESWF 3.2). Note that only the isolated HSSs have been used for the statistical analysis. The HSSs are marked by gray shaded areas. We first notice that the occurrence frequency of HSSs increases as the solar cycle is in its declining phase during 2016. For all three phases, we observe a qualitative improvement of ESWF 3.2 compared to ESWF 2.0, especially during the declining phase in panel (b) of Figure 5. During the maximum phase, in panel (a) of Figure 5, we see two distinct HSSs that previously have been detected by the model, but their profiles don’t match very well with the observed data. One of those structures is in early June, the other in early July. The first is modeled quite accurately with ESWF 3.2. The second one is slightly off, but still matches better the observed profile in comparison to ESWF 2.0. In panel (b) in Figure 5 we can identify six different HSSs during the declining phase of the solar cycle. For almost each of those HSS we see that the model has improved in both timing and peak velocity. During the solar minimum, we see in panel (c) that the forecast of larger structures slightly improved, while some smaller ones have even worsened. It is difficult to qualitatively assess the change in performance for low solar activity. Overall we can see an improvement throughout the solar cycle. Table 1 summarizes all the statistical results for the entire sample of HSSs over the whole solar cycle, as well as for the minimum and the maximum phases separately. In addition, we calculate the statistics for either a velocity hit or a time hit separately. Interestingly, we derive that the time hits perform better in comparison to the velocity hits, which on average underestimate the peak velocity.

Figure 5
figure 5

Selected time intervals during (a) the maximum, (b) the declining phase, (c) the minimum of Solar Cycle 24. We show the solar-wind speed data from OMNI (black curves) together with the ESWF 2.0 (red) and ESWF 3.2 (purple) model results. HSSs are marked by gray shaded areas.

5 Discussion

Over the past years, we made substantial progress and increased our knowledge about the evolution of CHs, their specific shapes and locations on the Sun, that are clearly affecting the in-situ measured solar-wind speed profile (e.g., Garton, Gallagher, and Murray, 2018a; Heinemann et al., 2018; Hofmeister et al., 2018, 2020, 2022; Samara et al., 2022b). The ESWF model (Vršnak, Temmer, and Veronig, 2007; Rotter et al., 2012, 2015; Reiss et al., 2016) is a simple but robust forecast algorithm, relating CH areas to the in-situ measured HSS on Earth with an average lead time of 4 days. Specifically, here we are interested in improving the ESWF model towards more reliable forecasts of the arrival time and speed of individual HSSs.

For that we newly implemented compression and rarefaction effects. We developed two algorithms, PILE and PROP. Both are simplified approaches assuming a uniform density distribution between HSSs and slow wind close to the Sun. PROP (ESWF 3.0) represents a simple displacement between the streams, which is physically incorrect (see also Hofmeister et al., 2022). PILE (ESWF 3.1) reflects a ballistic propagation of the streams with inelastic interactions for which the background solar-wind is considered as a constant (derived with ∼400 km s−1) and a time-variant speed variable (\(c_{0}\) in Equation 3) accounting for short-term variations is added. Though the hit statistics are slightly better compared to ESWF 2.0, PROP does not cover much interaction between fast and slow solar-wind and, hence, it causes similar early arrival times as ESWF 2.0 with an MAE of more than 40 hours (see Figure 4). The results for PILE reveal in the MAE and correlation coefficients clear improvements over ESWF 2.0 and therefore PILE was used as basis to further develop ESWF 3.2, where the co-latitude information and dynamic coronal-hole extraction threshold was implemented. The improvements come at a low computing cost as the algorithm in total takes less than two minutes to compute.

For the comparison between ESWF 3.2 and ESWF 2.0, we use hit & miss statistics on the HSS sample with the following criteria. A hit was considered when both, modeled and observed HSS speed maximum, occurred within a 48-hour window and the maximum speed of the modeled and observed HSS matched by ± 100 km s−1. We derive a relative increase of hits by 13.9%. Note that the amount of separate hits in timing increase more than those in velocity. In addition, we calculate for the arrival time and speed the MAE and RMSE for the HSS sample and find improvements up to 25%. In order to quantify the match of the entire modeled HSS profile within a time window of one week, we calculate the Pearson correlation coefficient, the Euclidean distance, and DTW cost. For the Pearson correlation we obtain a relative increase of 14.3%. We note here that the DTW cost is very sensitive to the boundary values and the length of the time series (here 7 days). It can only be used as a relative comparison between two time series. The decrease of the DTW cost found from ESWF 2.0 to 3.2 shows the improvement in terms of the increased similarity between modeled and observed HSS profile when using ESWF 3.2. For more details we refer to the study by Samara et al. (2022a).

Table 4 in Reiss et al. (2016) shows a comparison of two different solar-wind models, ESWF 2.0 and Wang-Sheeley-Arge (WSA: Wang and Sheeley, 1990; Arge et al., 2003), and their performance of simulating high-speed enhancements (HSEs). HSEs were defined by speed increases above a minimum threshold value for the background solar-wind. For the time range of 2011 – 2014 they found a Pearson correlation coefficient of 0.39 – 0.42 for ESWF 2.0 and 0.37 – 0.50 for WSA, depending on the definition of the threshold for HSE detection. These results are comparable to our results for the ESWF 3.2 during 2012 – 2021 (Table 1), giving a Pearson correlation coefficient of 0.31 and 0.46 for the HSS list during maximum and minimum solar activity, respectively. The better performance for low solar activity phases is derived also for numerical background solar-wind models and can be explained by the low number of coronal mass ejections propagating in interplanetary space (see e.g., Owens et al., 2008; Gressl et al., 2014).

Studies show that the HSS speed measured at Earth, seems also to be affected by the actual morphology of a coronal-hole (e.g., Bromage et al., 2000; Samara et al., 2022b). In-situ measurements close to the Sun derived with the Parker Solar Probe hint towards a mixture of open and closed field from within coronal holes (Bale et al., 2019). This supports the finding that the predominantly open magnetic field stems from flux tubes covering only a small fraction of a coronal-hole’s area (Hofmeister et al., 2017). Fine structures within the coronal-hole boundaries are not covered by ESWF. Definitely, more investigations are needed with respect to coronal-hole areas/boundaries and how these appear as an entity of a dark region.

6 Conclusion

The latest version of the empirical solar wind forecast model has the following new adaptations:

  • Dynamic Thresholding: CH area extraction using a threshold, which takes into account solar activity.

  • Co-latitude Information: segmentation of CH areas into different latitudes to account for the three-dimensionality of the solar-wind propagation.

  • PILE Algorithm: simple stream interaction model using constant density distribution and inelastic collision of streams.

Over the whole Solar Cycle 24, we see an improvement of the ESWF 3.2 over ESWF 2.0 in terms of:

  • Number of correctly forecasted HSS Peaks: 13.9% relative increase

  • Pearson correlation: 14.3% relative increase

Furthermore, we find the following mean deviations from the in-situ data:

  • Time MAE: 37.6 h

  • Velocity MAE: 91.1 km s−1

We derive that the prediction is usually about 1.5 days and 90 km s−1 off. These values can roughly be used as uncertainties of the model ESWF version 3.2.

Further improvement of the ESWF is needed to more accurately predict space weather conditions, especially considering the current performance for the modeled peak velocity. The largest improvement step in that regard has been achieved between ESWF versions 1.0 and 2.0, whereas the presented newer versions mainly improved the timing of the peaks (compression effect). Methods for further improvements include data assimilation from in-situ measurements, which was already shown by Podladchikova et al. (2018), and which is another ESA service called ESWF24. The data assimilation however, strongly reduces the lead time (for ESWF24 to 24 hours). Other options for better forecasting are always given by machine-learning algorithms. In that respect, the morphology of CHs and in-situ measured speed profiles could be related in more detail (see e.g., Samara et al., 2022b) for enhancing the forecast quality.