Towards Construction of a Solar Wind “Reanalysis” Dataset: Application to the First Perihelion Pass of Parker Solar Probe

Accurate reconstruction of global solar-wind structure is essential for connecting remote and in situ observations of solar plasma, and hence understanding formation and release of solar wind. Information can routinely be obtained from photospheric magnetograms, via coronal and solar-wind modelling, and directly from in situ observations, typically at large heliocentric distances (most commonly near 1 AU). Magnetogram-constrained modelling has the benefit of reconstructing global solar-wind structure, but with relatively large spatial and/or temporal errors. In situ observations, on the other hand, make accurate temporal measurements of solar-wind structure, but are highly localised. We here use a data assimilative (DA) approach to combine these two sources of information as a first step towards producing a solar-wind “reanalysis” dataset that optimally combines model and observation. The physics of solar wind stream interaction is used to extrapolate in heliocentric distance, while the assumption of steady-state solar-wind structure enables extrapolation in longitude. The major challenge is extrapolating in latitude. Using solar-wind speed during the interval of the first perihelion pass of Parker Solar Probe (PSP) in November 2018 as a test bed, we investigate two approaches. The first is to assume the solar wind is two-dimensional and thus has no latitudinal structure within the ±7° bounded by the heliographic equatorial and ecliptic planes. The second assumes in situ solar-wind observations are representative of some (small) latitudinal range.
We show how observations of the inner heliosphere, such as will be provided by PSP, can be exploited to constrain the latitudinal representivity of solar-wind observations to improve future solar-wind reconstruction and space-weather forecasting.


Introduction
Scientific progress in climate and meteorological science has been revolutionised by "reanalysis" datasets that merge state-of-the-art physics-based numerical models with available observations through robust data assimilation (e.g., Dee et al., 2011). Reanalysis provides a means to combine disparate observations with different uncertainty characteristics, as well as enabling interpolation between the available observations using the known physics of the system. More generally, it attempts to merge the global-but-inaccurate information of numerical models with the accurate-but-local information of observations. This approach is beginning to be adopted by space physics, particularly in the radiation belts (Aseev and Shprits, 2019; Glauert, Horne, and Meredith, 2018) where there are both strong physical constraints that can be used to reduce the dimensionality of the system and reasonable spatial sampling with in situ observations. A reanalysis dataset for the solar wind would be invaluable for connecting remote solar observations with in situ solar-wind observations. This is a key goal of the Parker Solar Probe (PSP, Fox et al., 2016) and Solar Orbiter (Müller et al., 2013) missions, which seek to understand how the solar wind is formed.
Recently, we have demonstrated some of the challenges to routine assimilation of in situ solar-wind observations with three-dimensional magnetohydrodynamic (MHD) models of the solar wind (Lang et al., 2017). In addition to the sparse spatial sampling and highly localised information provided by in situ observations, the supersonic solar-wind flow means any local improvements to the model state imposed by, e.g., near-Earth observations are quickly lost from the system, as they are never propagated to < 1 AU. In order to make a lasting improvement to the model state, it is necessary to update the solar-wind conditions at the inner boundary of the model (typically set at 20–30 solar radii, R☉). One means of achieving this is through a simplified model of solar-wind propagation which enables in situ observations to be mapped back to 30 R☉, where they can be combined with the model inner boundary conditions. At present this variational data assimilation technique has been limited to solar-wind radial speed (V_R), but can in principle be expanded to all modelled and observed physical parameters.
In order to demonstrate the potential of this methodology for creating a solar-wind reanalysis, we here reconstruct solar-wind speed during the first PSP perihelion pass in November 2018. There are a number of reasons to select this period. Firstly, as PSP's goal is to understand solar-wind acceleration and release, particularly the source of the slow solar wind, there is a great deal of scientific interest in this interval. Secondly, this interval has already received a good deal of scrutiny from the solar-wind modelling community, with both detailed time series forecasts for the specific period of interest (van der Holst et al., 2019) and more climatological forecasts for the inner heliosphere at solar minimum (Chhiber et al., 2019; Venzmer and Bothmer, 2018). In general, the available models produce very good agreement with the 1-AU observations for this interval, though there is still some scope for further improvement. Thirdly, at the time of writing, PSP solar-wind speed observations are not yet available, so the reconstructions cannot be (knowingly or otherwise) tuned to agree with them. They thus provide a truly "blind" test of the technique.
Section 2 summarises the available in situ V_R observations at 1 AU, while Section 3 outlines the global solar-wind structure inferred from MHD modelling of the observed photospheric magnetic field. We then take two approaches to data assimilation (DA) in order to combine these observations. In Section 4, we proceed by assuming the solar wind (at least bounded by the ecliptic and heliographic equatorial planes) can be considered a two-dimensional, steady-state system. In Section 5 we then relax the two-dimensional assumption and attempt to extract and combine latitudinal information from the available in situ observations and solar-wind models. As the latitudinal covariance of the solar-wind observations is not presently well constrained, we are forced to take a somewhat ad hoc approach. In Section 6 we demonstrate that well-separated solar-wind observations, such as will be provided by PSP, will potentially be able to constrain the latitudinal representivity of in situ observations and thus improve solar-wind DA for both reanalysis and space-weather forecasting. This is vital for space-weather forecasting from an L5 observing platform, which will routinely be offset from Earth's latitude by up to 7° and therefore experience different solar-wind conditions (Thomas et al., 2018).

In situ 1-AU Solar Wind Observations
In this study, we use in situ solar-wind radial speed (V_R) observations from two heliospheric positions: at the L1 point just upstream of Earth, using data from the Deep Space Climate Observatory (DSCOVR) spacecraft as part of the OMNI dataset (King and Papitashvili, 2005), and at the position of the Solar Terrestrial Relations Observatory (STEREO) Ahead spacecraft (hereafter, STA) (Kaiser, 2005), which was at a radial distance of 0.96 AU and was approximately 102° behind Earth in its orbit during Carrington rotation 2210 (27 Oct. to 23 Nov. 2018). Hourly V_R data are averaged to approximately 5-hour cadence to match the 128 equally-spaced Carrington longitudinal bins used in the MHD model described in Section 3. For convenience, in the remainder of the study we refer to the heliocentric distances of L1 and STA as 1 AU, though values are extracted from the models at the exact radial positions.
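The resampling of hourly data onto 128 longitude bins can be sketched as follows. This is a minimal illustration, not the processing code used for the study: the function names are hypothetical, the input is assumed to already carry a Carrington longitude for each sample, and the synodic Carrington period of 27.2753 days is used to show why 128 bins correspond to a cadence of roughly 5 hours.

```python
import numpy as np

CARRINGTON_PERIOD_DAYS = 27.2753  # synodic Carrington rotation period
N_BINS = 128                      # longitude bins of the MHD model grid


def bin_cadence_hours(n_bins=N_BINS):
    """Time spanned by one longitude bin for a corotating structure."""
    return CARRINGTON_PERIOD_DAYS * 24.0 / n_bins


def average_into_longitude_bins(carr_lon_deg, v_r, n_bins=N_BINS):
    """Average speed samples into equally spaced Carrington longitude bins.

    Returns the bin centres (degrees) and the mean speed per bin,
    with NaN where a bin contains no valid samples.
    """
    lon = np.asarray(carr_lon_deg, dtype=float) % 360.0
    v_r = np.asarray(v_r, dtype=float)
    width = 360.0 / n_bins
    idx = np.minimum((lon // width).astype(int), n_bins - 1)
    good = np.isfinite(v_r)  # ignore data gaps
    sums = np.bincount(idx[good], weights=v_r[good], minlength=n_bins)
    counts = np.bincount(idx[good], minlength=n_bins)
    mean = np.full(n_bins, np.nan)
    filled = counts > 0
    mean[filled] = sums[filled] / counts[filled]
    centres = (np.arange(n_bins) + 0.5) * width
    return centres, mean
```

With these assumptions, `bin_cadence_hours()` evaluates to about 5.1 hours, consistent with the "approximately 5-hour cadence" quoted above.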
As shown in Figure 1, during CR2210 the L1 point declined from approximately 5° to just below 2° heliographic latitude. STA started at approximately 4° and increased to 6.5° latitude. PSP was located at lower latitudes throughout, particularly during the closest approach to the Sun, where it reached a minimum latitude of −4°. Thus at the Carrington longitude of PSP perihelion, approximately 340°, STA and L1 were at least 6° higher in latitude.
In order to better compare the solar-wind structures at L1 and STA, it is useful to consider the observations as a function of Carrington longitude, rather than time, as shown on the right-hand side of Figure 1. Any differences between L1 and STA must be either the result of the different latitudinal positions of the observations, or time evolution of the solar-wind structure between the times the observations were made (approximately 7.5 days in this instance).
At L1, there were two clear fast streams, centred at Carrington longitudes of 150° and 230°. In the 5-hourly data, they have peak speeds of around 580–600 km s⁻¹, though the 150° stream peaks slightly higher and persists longer than the 230° stream. At STA, the 230° stream is present, though it is faster and broader than at L1, and it peaks at greater longitudes. The 140° stream is much weaker at STA than L1, and is split into two smaller streams. Inspection of the magnetic field time series suggests the short-lived fast stream on 3 Nov. 2018 (Carrington longitude 160°) is a transient structure resulting from a coronal mass ejection. By contrast, the slow solar-wind structures are in good agreement between L1 and STA. We note that the PSP perihelion pass occurred primarily at Carrington longitudes of around 320–360°, and thus at the longitudes where L1 and STA saw exclusively slow wind.

Global Solar Wind for CR2210
Global solar-wind structure is determined using the "Magnetohydrodynamics Algorithm outside a Sphere" (MAS) global coronal and heliosphere model (Linker et al., 1999; Riley et al., 2012). MAS is constrained by HMI photospheric magnetic field observations, from which the solution is computed outward to 30 R☉, while self-consistently solving the plasma and magnetic field parameters on a non-uniform grid in polar coordinates, using the MHD equations and the vector potential A (where the magnetic field, B, is given by ∇ × A, such that ∇ · ∇ × A = 0 which ensures current continuity, ∇ · J = 0, is conserved to within the model's numerical accuracy). The heliospheric version of MAS then propagates solar-wind conditions out from 30 R☉ to 1 AU. We use MAS solutions based on Carrington maps of the photospheric magnetic field and thus assume that the solar wind is completely steady state over a Carrington rotation (CR). Note that we here use the robust polytropic solution, rather than the newly developed wave-turbulence-driven model that was recently used to predict PSP perihelion conditions. We found the polytropic solution provides a better match to the L1 and STA observations and thus provides a better starting point to the DA methods described below. Figure 2 shows the MAS solution for CR2210. This is a fairly typical solar minimum solar-wind structure, with slow wind constrained to within approximately 30° of the equator and uniform fast wind at mid and high latitudes (McComas et al., 2003). The maps on the left show a zoom-in of the solar-wind structure around the equator, with the latitudes of L1, STA and PSP shown as red-dashed, blue-dashed and white-solid lines, respectively. Given the difference in L1 and STA latitudes and their general proximity to the largest gradients in solar-wind speed features, it is clear that even small spatial offsets potentially result in large errors in the modelled solar wind at a given heliospheric position.
Despite this, the bottom-right panels of Figure 2 show that in this instance, the agreement between the modelled and observed large-scale solar-wind structure is remarkably good. The mean absolute error (MAE) between MAS and 1-AU observations is 71 km s⁻¹ at STA latitude and 61 km s⁻¹ at L1 latitude (cf. Owens et al., 2008). The most notable differences are that both MAS fast streams are approximately 100 km s⁻¹ too fast at L1, the 250° stream at L1 is offset by approximately 20° in longitude (i.e., it arrives approximately 1.5 days early), and the split 100–170° streams at STA are a single fast stream in MAS. This is unsurprising given the 160° stream is a transient coronal mass ejection which would not be expected to be captured by a steady-state solar-wind solution. We also note that in general, the observed slow solar wind contains features approximately 50–100 km s⁻¹ faster than reproduced in MAS, where the solar-wind speed is essentially flat at 300 km s⁻¹ outside of the fast streams.
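For reference, the MAE metric quoted above can be computed as in the short sketch below. The function name and the handling of data gaps (skipping any longitude bin where either series is NaN) are illustrative assumptions rather than details taken from the study.

```python
import numpy as np


def mean_absolute_error(model_v, observed_v):
    """Mean absolute difference between model and observed speeds.

    Bins where either series is missing (NaN) are excluded.
    """
    model_v = np.asarray(model_v, dtype=float)
    observed_v = np.asarray(observed_v, dtype=float)
    valid = np.isfinite(model_v) & np.isfinite(observed_v)
    return float(np.mean(np.abs(model_v[valid] - observed_v[valid])))
```

Both series are assumed to be on the same Carrington longitude grid, so that element i of each array refers to the same longitude bin.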
The global MAS solution at 1 AU (bottom left panel of Figure 2) can aid in the interpretation of the differences in the observed 250° longitude fast streams at L1 and STA. We conclude that the differences in magnitude and timing are likely to be the result of time evolution, rather than latitudinal structure. At 250° Carrington longitude, the latitudes of L1 and STA are very similar. Furthermore, the equatorial coronal hole responsible for the fast stream is highly north-south aligned, meaning there is little variation with modest latitudinal translations. Thus any steady-state approach which assumes the solar-wind structure is constant over CR2210 will be unable to match both the STA and L1 observations at this longitude.
Interpretation of the 140° fast stream is more complex. The equatorial coronal hole responsible is an extension from the southern hemisphere and thus L1, being at lower latitude, goes deeper into the fast wind than STA. This may partially explain the differences in observed solar wind. However, the STA observations clearly contain a non-steady-state solar-wind structure at this longitude which may have disrupted the pre-existing and surrounding solar-wind stream structure (e.g., Luhmann et al., 1998).

The Two-dimensional Approximation
All solar-wind reconstructions considered in this study, including the full three-dimensional MHD MAS solution shown in Section 3, assume the solar wind is steady state and corotates with the Sun. This means that a time series at a fixed heliospheric location is equivalent to sampling Carrington longitude at a given radial distance and latitude. Consequently, over a complete Carrington rotation, a single spacecraft at 1 AU will provide complete longitudinal coverage.
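The equivalence between time and Carrington longitude under the corotation assumption can be illustrated with a short sketch. The function names are hypothetical, and the 27.2753-day synodic period is assumed, which strictly applies to an observer moving with Earth (STA's slightly different orbital motion is ignored here).

```python
CARRINGTON_SYNODIC_DAYS = 27.2753  # rotation period seen from Earth


def longitude_at_time(lon0_deg, dt_days, period_days=CARRINGTON_SYNODIC_DAYS):
    """Carrington longitude sampled dt_days after sampling lon0_deg.

    For a fixed observer, the sampled Carrington longitude decreases
    with time as the Sun rotates beneath it.
    """
    return (lon0_deg - 360.0 * dt_days / period_days) % 360.0


def longitude_offset_to_days(dlon_deg, period_days=CARRINGTON_SYNODIC_DAYS):
    """Corotation time corresponding to a longitude offset."""
    return dlon_deg / 360.0 * period_days
```

Under these assumptions a 20° longitude offset corresponds to about 1.5 days, matching the timing error quoted for the MAS fast stream in Section 3.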
In order to estimate solar-wind conditions at the PSP location, observations from L1 and STA must also be extrapolated in latitude and radial distance. Doing so within the physical constraints of the system necessitates the use of a model. In this section, we begin by assuming that L1, STA and PSP are all located at the heliographic equator, allowing radial extrapolation to be considered in isolation. For a given solar-wind time series (or Carrington longitude profile) at 30 R☉, there is a unique solar-wind solution at 1 AU. However, for a given observed solar-wind profile at 1 AU, as provided by the L1 and STA observations, there are many possible solutions to the solar wind in the inner heliosphere, as solar-wind stream interaction tends to reduce speed gradients with distance from the Sun. Thus inner-heliosphere information from the MAS model is essential for inward radial extrapolation from 1 AU.
We adopt the methodology of Lang and Owens (2019), which assimilates the L1 and STA V_R observations using the MAS V_R solution in conjunction with a simple two-dimensional "Upwind" representation of solar-wind stream interaction (Riley and Lionello, 2011; Owens and Riley, 2017). In brief, the MAS V_R solution at 30 R☉ is used as the initial conditions for the solar-wind reconstruction. V_R values at the helioequator are mapped out from 30 R☉ to 1 AU using the Upwind technique. Using the resulting two-dimensional solar-wind structure, the observed V_R at STA and L1 is mapped back to 30 R☉, where it is formally assimilated into the MAS solution, using error covariance matrices constructed from ensemble sampling of the MAS solution. This provides the optimum solar-wind solution between the available models and observations, taking into account the uncertainty in both.
Figure 3 L1 and STA data assimilation assuming a steady state and purely two-dimensional solar-wind structure. Top: STA (blue) and L1 (red) observations of V_R at 1 AU for CR2210 as a function of Carrington longitude. The black line shows the result of assimilating L1 and STA observations into the Upwind/MAS model assuming they are both located at the heliographic equator. Middle: A colour map of V_R as a function of Carrington longitude and radial distance. The PSP position for CR2210 is shown by the white line. Bottom: Reconstructed V_R at 30 R☉ as a function of Carrington longitude.
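The outward mapping step can be illustrated with a minimal sketch of a first-order upwind scheme for the steady, corotating momentum equation, ∂v/∂r = (Ω/v) ∂v/∂φ, in the spirit of Riley and Lionello (2011). It is not the code used for the study: the function name, the 1 R☉ radial step, the sidereal rotation period of 25.38 days, and the omission of any residual-acceleration correction are all assumptions made for illustration.

```python
import numpy as np

R_SUN_KM = 6.957e5                             # solar radius in km
OMEGA_SUN = 2.0 * np.pi / (25.38 * 86400.0)    # assumed sidereal rate, rad/s


def upwind_propagate(v_inner, r_inner_rs=30.0, r_outer_rs=215.0, dr_rs=1.0):
    """Map a Carrington-longitude speed profile (km/s) outward in radius.

    v_inner holds speeds on equally spaced longitude bins at r_inner_rs.
    Each radial step applies
        v[i+1, j] = v[i, j] + (dr * Omega / v[i, j]) * (v[i, j+1] - v[i, j]) / dphi
    with periodic boundaries in longitude.
    """
    v = np.asarray(v_inner, dtype=float).copy()
    dphi = 2.0 * np.pi / v.size
    dr_km = dr_rs * R_SUN_KM
    n_steps = int(round((r_outer_rs - r_inner_rs) / dr_rs))
    for _ in range(n_steps):
        courant = dr_km * OMEGA_SUN / (v * dphi)
        if np.any(courant > 1.0):
            raise ValueError("radial step too large for a stable upwind update")
        v_next_lon = np.roll(v, -1)   # speed in the next longitude bin, j + 1
        v = v + courant * (v_next_lon - v)
    return v
```

Because each update is a weighted average of neighbouring longitude bins, the scheme cannot create speeds outside the range of the inner-boundary profile, and sharp speed gradients are progressively smoothed with distance. This is why many inner-boundary profiles can map to similar 1-AU profiles, and why the inward mapping needs additional model information.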
This optimum solution is shown as the black line in Figure 3, while the red and blue lines show the observed V_R values at L1 and STA. (Note that the agreement with the L1 and STA observations should not be interpreted as validation of the technique as those observations have been used in the assimilation process and thus are not independent.) In general, this two-dimensional, steady-state solution lies between the STA and L1 observed V_R. This is to be expected, as L1 and STA are effectively considered to be co-located observations and any difference between the two is treated as observational error. Thus the solution is worst for the two fast streams, which have both been inferred to have time-dependent aspects.
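Why the solution falls between the two sets of observations can be seen from a scalar analogue of the assimilation step. The sketch below is a simplified illustration, not the variational scheme of Lang and Owens (2019): it combines a single model value with co-located observations using inverse-variance weights, and the function name and all numerical values are hypothetical.

```python
import numpy as np


def combine(background, sigma_b, observations, sigma_o):
    """Inverse-variance weighted estimate from a model value and observations.

    background, sigma_b : model (prior) value and its standard deviation
    observations        : co-located observed values
    sigma_o             : observation standard deviation(s)
    Returns the combined value and its standard deviation.
    """
    obs = np.asarray(observations, dtype=float)
    sig_o = np.broadcast_to(np.asarray(sigma_o, dtype=float), obs.shape)
    weights = np.concatenate(([1.0 / sigma_b**2], 1.0 / sig_o**2))
    values = np.concatenate(([float(background)], obs))
    analysis = np.sum(weights * values) / np.sum(weights)
    analysis_sigma = np.sqrt(1.0 / np.sum(weights))
    return float(analysis), float(analysis_sigma)
```

For example, a model speed of 700 km s⁻¹ with an assumed uncertainty of 100 km s⁻¹, combined with two observations of 600 and 500 km s⁻¹ that each carry an assumed uncertainty of 50 km s⁻¹, gives a value between the two observations that is pulled slightly towards the model.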
At Carrington longitudes of 300–360°, most pertinent to the PSP perihelion pass, the agreement between the two-dimensional reconstruction and the STA and L1 observations is extremely good. However, it should be noted that the latitudinal difference between PSP and L1/STA is greatest at this time, meaning the two-dimensional approximation is least valid. This is investigated further in Section 6.

A Three-dimensional Approach
We now attempt to relax the two-dimensional assumption. The challenge is that in situ solar-wind measurements are highly localised in latitude and that the MAS solution for CR2210 shows strong latitudinal gradients in V_R near the solar equator (see Figure 2). Thus it is essential that as much information as possible is extracted from the global MAS solar-wind solution obtained from observed photospheric magnetic field. Figure 4 shows the results of considering two independent two-dimensional solar-wind solutions at the latitudes of STA (left) and L1 (right). (Again, the improvement in the solution after assimilation should not be taken as verification of the technique as the observations have been used to provide the solution.) The STA and L1 V_R observations are independently mapped back to 30 R☉ using the same methodology as Section 4. At 30 R☉, DA has modified the MAS solution both at the longitudes of the fast streams, as expected, but also to the slow wind around Carrington longitudes 30–100°.
It is desirable to exploit both the global-but-uncertain information in the MAS V_R solution and local-but-accurate information in the in situ observations. Thus we seek the minimum adjustment to the MAS solution at 30 R☉ that gives maximum agreement with the two independent two-dimensional DA solutions, as they are constrained by the in situ observations.
The MAS solution is based on magnetogram observations, where viewing geometry means the polar magnetic fields are poorly constrained. Changes to the strength or configuration of the polar fields will largely have the effect of shifting the equatorial solar-wind structures slightly (e.g., Sun et al., 2011). Thus we search for the (small-amplitude) rotations in latitude and longitude of the MAS solution which produce the best agreement with the DA solutions. We perform this minimisation procedure at 30 R☉ rather than 1 AU, as it will change the solar-wind profile along a given radial cut and thus the subsequent solar-wind interaction between 30 R☉ and 1 AU. For CR2210, the minimum MAE between model and mapped observation V_R at STA and L1 latitudes is given by rotating the north pole 4° down in latitude through a Carrington longitude of 186°, and further rotating by 9° in longitude. As can be seen in the top panels of Figure 5, the global change is rather subtle, but as shown by the middle and bottom panels, agreement with the DA solutions is significantly improved at both STA and L1 latitudes. The MAE at STA latitude reduces from 74.5 to 62.9 km s⁻¹, while at L1 latitude it reduces from 73.3 to 50.1 km s⁻¹.
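A brute-force version of this rotation search might look like the sketch below. It is an illustration under stated assumptions rather than the procedure used in the study: the map is resampled by nearest-neighbour lookup, the rotation is taken to be a tilt of the pole towards a given longitude followed by a rotation about the original polar axis, the longitude grid is assumed uniform, and the function names, argument order, and grid-search strategy are all hypothetical.

```python
import numpy as np


def rotate_map(v_map, lat_deg, lon_deg, tilt_deg, tilt_lon_deg, spin_deg):
    """Rotate a (n_lat, n_lon) map by nearest-neighbour resampling.

    The pole is tilted by tilt_deg towards longitude tilt_lon_deg, and the
    map is then rotated by spin_deg about the original polar axis.
    lat_deg must be ascending; lon_deg must be uniformly spaced.
    """
    lat_deg = np.asarray(lat_deg, dtype=float)
    lon_deg = np.asarray(lon_deg, dtype=float)
    lat = np.radians(lat_deg)[:, None]
    lon = np.radians(lon_deg)[None, :]
    # Unit vectors of the output grid points.
    xyz = np.stack([np.cos(lat) * np.cos(lon),
                    np.cos(lat) * np.sin(lon),
                    np.sin(lat) + 0.0 * lon], axis=-1)
    a, t, s = np.radians([tilt_lon_deg, tilt_deg, spin_deg])
    # Cross-product matrix for the tilt axis (-sin a, cos a, 0).
    k = np.array([[0.0, 0.0, np.cos(a)],
                  [0.0, 0.0, np.sin(a)],
                  [-np.cos(a), -np.sin(a), 0.0]])
    r_tilt = np.eye(3) + np.sin(t) * k + (1.0 - np.cos(t)) * (k @ k)
    r_spin = np.array([[np.cos(s), -np.sin(s), 0.0],
                       [np.sin(s), np.cos(s), 0.0],
                       [0.0, 0.0, 1.0]])
    rot = r_spin @ r_tilt
    src = xyz @ rot  # inverse rotation applied to each output direction
    src_lat = np.degrees(np.arcsin(np.clip(src[..., 2], -1.0, 1.0)))
    src_lon = np.degrees(np.arctan2(src[..., 1], src[..., 0])) % 360.0
    i_lat = np.rint(np.interp(src_lat, lat_deg,
                              np.arange(lat_deg.size))).astype(int)
    dlon = 360.0 / lon_deg.size
    j_lon = np.rint((src_lon - lon_deg[0]) / dlon).astype(int) % lon_deg.size
    return np.asarray(v_map)[i_lat, j_lon]


def best_rotation(v_model, lat_deg, lon_deg, targets, tilts, tilt_lons, spins):
    """Grid search for the rotation minimising the MAE against targets.

    targets maps a latitude index to the longitude profile (e.g. a DA
    solution) that the rotated model should match at that latitude.
    """
    best_err, best_params = np.inf, None
    for tilt in tilts:
        for tilt_lon in tilt_lons:
            for spin in spins:
                rotated = rotate_map(v_model, lat_deg, lon_deg,
                                     tilt, tilt_lon, spin)
                err = np.mean([np.nanmean(np.abs(rotated[i] - prof))
                               for i, prof in targets.items()])
                if err < best_err:
                    best_err, best_params = err, (tilt, tilt_lon, spin)
    return best_params, float(best_err)
```

In this sketch the search is restricted to whatever small set of candidate angles is supplied, which reflects the intent that only small-amplitude rotations of the model map are considered.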
The final step is to combine the DA solutions and appropriately rotated MAS solution. The top panel of Figure 6 shows a "blend" of the rotated MAS V_R at 30 R☉ with the DA solutions, using a Gaussian weighting in latitude, centred on the latitude of in situ observation. Thus the width of the Gaussian, σ, determines the latitudinal influence of the in situ observations. The optimum value of this parameter will be the subject of future research, but for our present purpose, we consider a range of values from 1 to 15°. As V_R structures for this Carrington rotation are approximately coherent over 5–15° in latitude, this range seems appropriate for initial study. Figure 6 shows the combined MAS/Upwind/DA blended solution at 30 R☉ using σ = 5°. The 30 R☉ solution is then propagated out to greater radial distances using the Upwind model, but could in principle be used to drive the full three-dimensional MHD MAS model. In this instance, the 1-AU V_R reconstructions are very similar at the L1 and STA latitudes, as the σ value is large compared to the latitudinal separation of L1 and STA.
Figure 6 A blend of the independent DA solutions to STA and L1 observations with the rotated MAS solution. This example shows σ = 5°. The top two panels show the latitude-longitude maps of V_R at 30 R☉ and 1 AU, respectively. The bottom two panels show the observed and reconstructed V_R at L1 and STA latitudes.
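The Gaussian blending can be sketched as below. The text does not specify how the weights from the two observing latitudes are combined with the model, so the normalisation used here (each DA solution weighted by its Gaussian, the model weighted by the product of the complements) is an assumption made for illustration, as are the function and argument names.

```python
import numpy as np


def gaussian_blend(v_model, lat_deg, da_solutions, sigma_deg):
    """Blend longitude profiles from DA into a latitude-longitude model map.

    v_model      : (n_lat, n_lon) model speed map at the inner boundary
    lat_deg      : latitudes of the map rows, in degrees
    da_solutions : list of (observation latitude in degrees, profile of n_lon)
    sigma_deg    : Gaussian width controlling the latitudinal influence
    """
    v_model = np.asarray(v_model, dtype=float)
    lat = np.asarray(lat_deg, dtype=float)[:, None]
    weighted_sum = np.zeros_like(v_model)
    weight_total = np.zeros((lat.size, 1))
    model_weight = np.ones((lat.size, 1))
    for obs_lat, v_da in da_solutions:
        w = np.exp(-0.5 * ((lat - obs_lat) / sigma_deg) ** 2)
        weighted_sum += w * np.asarray(v_da, dtype=float)[None, :]
        weight_total += w
        model_weight *= 1.0 - w   # model dominates far from all observations
    return (model_weight * v_model + weighted_sum) / (model_weight + weight_total)
```

With a single DA solution this reduces to `w * v_da + (1 - w) * v_model`, so the blend equals the DA solution at the observing latitude and relaxes to the model map several σ away. For large σ the DA profiles dominate at all latitudes near the equator, which is the limit in which the blend approaches the two-dimensional approximation.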

Solar Wind Speed at PSP
We now extract the V_R values at the PSP position for the various reconstructions considered in this study. As seen in the bottom panel of Figure 7, the V_R structure is broadly similar, as the original magnetogram-constrained MAS solar-wind solution was already close to the best estimate provided by the in situ observations from L1 and STA.
The blue line in the bottom panel of Figure 7 shows the original MAS MHD value. At PSP perihelion, it predicts a V_R of approximately 270 km s⁻¹. All the Upwind DA methods raise this value to at least 300 km s⁻¹. The DA methods all also push back the peak of the fast/intermediate speed solar wind from 17 Nov. 2018 to 19 Nov. 2018. The largest difference between the various DA methods is in the possible intermediate speed stream just after perihelion on 8 Nov. 2018. This feature is the result of the isolated patch of fast wind centred at a Carrington longitude of 340° and latitude of −7° at 30 R☉ (i.e., near the PSP orbital loop in the top panel of Figure 6). For small values of σ, this patch is left unaltered and PSP is expected to encounter V_R of up to 600 km s⁻¹. As σ is increased, the slow wind observed at L1 and STA positions is extended further down in latitude, reducing the amplitude of this patch. Note that, for large values of σ, the three-dimensional solution essentially reduces to the two-dimensional approximation, as expected.

Discussion and Conclusions
In this study we have investigated the optimum method to combine magnetogram-constrained solar-wind models and in situ observations through a data assimilative (DA) framework in order to effectively extrapolate in radial distance, Carrington longitude and heliographic latitude. The period of consideration was Carrington rotation 2210, the time of PSP's first perihelion pass in November 2018.
The MAS solar-wind solution, based solely on the observed photospheric magnetic field, provides a good match to the observed large-scale solar-wind features at L1 and STA during CR2210. Looking more closely at the details, however, there are small timing/longitude errors in the fast solar-wind streams. Also, in general, the fast wind streams peak slightly too high and the slow wind is slightly too slow. At the PSP perihelion, the polytropic version of MAS used here predicts very slow solar wind of approximately 270 km s⁻¹. We note that the thermodynamic version of MAS using the same magnetogram initial conditions predicts much higher solar wind speeds at PSP perihelion of approximately 550 km s⁻¹, while the AWESoM reconstruction of the same data gives speeds of approximately 360 km s⁻¹ (van der Holst et al., 2019).
Assuming the solar wind close to the solar equator and ecliptic plane can be approximated as a purely two-dimensional steady-state structure, we can use the Lang and Owens (2019) methodology to assimilate both the L1 and STA observations with the MAS solution. At the PSP perihelion this raises the expected solar wind speeds to around 310 km s⁻¹. However, PSP is separated from the L1 and STA observations by 8–9° in latitude at this Carrington longitude, meaning the two-dimensional approximation may not be valid.
For this reason, we relaxed the two-dimensional assumption and attempted to extrapolate in latitude from the L1 and STA in situ observations. Ideally, this would be achieved by rigorously characterising the latitudinal covariance of the solar wind and using a formal DA approach. However, this is difficult with the available observations and so a more ad hoc approach was taken. Firstly, the MAS solution is rotated to provide the best match with the observationally-constrained DA solutions at 30 R☉. The DA solutions and the rotated MAS solution are then blended together. This still requires an estimate of the latitudinal range over which the in situ observations are representative. Given the large-scale solar wind stream structure for CR2210 is typically coherent over latitudinal extents of 5 to 15°, we expect a value in this range. Larger values converge towards the two-dimensional approximation, whereas smaller values produce different V_R features at PSP orbit. When the PSP perihelion observations become available, they may provide a means to constrain the optimum value of the latitudinal covariance for future solar wind data assimilation work and forecasting.
The latitudinal representivity of single-point observations may also have time-dependent aspects. Cross-latitude motions of coronal (and hence solar wind) structures may occur as a result of interchange reconnection between closed and open coronal magnetic flux at coronal hole edges (Crooker and Owens, 2011). While this is primarily expected in a driven sense, owing to shear induced by the differential rotation of the photosphere and the rigid rotation of the corona (e.g. Edmondson et al., 2010), it can also occur from random motions of photospheric footpoints of coronal fields (e.g. Rappazzo et al., 2012). Even without interchange reconnection, diffusion of heliospheric magnetic fields may be expected to provide some degree of latitudinal "blending" (Jokipii and Parker, 1968; Matthaeus et al., 1995; Laitinen et al., 2016). Thus, the "age" of the single-point observations may also need to be factored in to the representivity estimate.
Similarly, we note that due to the limited observations, we have used the steady-state approximation throughout, as this allows us to simply convert between time and Carrington longitude. For the period of study (CR2210), there is evidence of time evolution, particularly in the fast solar wind streams near the ecliptic plane. In principle, the DA methods outlined here can relax the steady-state approximation, though the limited number of in situ observations would leave the problem underconstrained. Solar wind speed information from heliospheric imagers may be useful in this regard, but at present, extracting localised V_R information is difficult (Barnard et al., 2019).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.