Effects of noise on the accuracy of plasma bulk parameters derived from velocity moments of in-situ observations

We expose and quantify the inaccuracies of plasma bulk parameters derived from the calculation of velocity moments of noisy in-situ plasma observations. First, we simulate typical solar wind proton plasma observations, obtained by a typical top-hat electrostatic analyzer instrument. We add background noise to the simulated observations and analyze them by applying standard methods to derive the plasma density, speed, and temperature. We then compare the analysis results with the parameters we use to simulate the observations, in order to quantify the inaccuracies in the calculated plasma parameters as functions of the noise level in the observations. We find that even noise levels smaller than 1% of the signal peak lead to significant inaccuracies in some plasma parameters. The plasma temperature suffers the largest inaccuracies and the plasma speed the smallest. Our results highlight the importance of removing noise from observations when calculating the moments of the constructed plasma distributions. Finally, we evaluate one simple method to remove uniform background noise automatically from measurements, which is useful for future on-board analyses.


Introduction
The velocity distribution functions (VDFs) of space plasma particles are typically constructed from in-situ plasma observations. A proper analysis of the constructed VDFs determines the bulk parameters of the plasma. One popular analysis method is the fitting of analytical functions, parameterized by the bulk parameters, to the constructed VDFs or the raw observations. In this method, the bulk parameters are determined by the best "fit" of the model function to the measurement. Several studies applied fitting algorithms to in-situ observations in order to estimate bulk parameters in several space regions, such as the Jovian magnetotail (e.g., Nicolaou et al. 2014, 2015), the Saturnian magnetosphere (Livi et al. 2014; Wilson et al. 2017), cometary plasma electrons (Broiles et al. 2016), and the solar wind (e.g., Kasper 2003; Ebert et al. 2009; Elliott et al. 2016; Nicolaou et al. 2021; Abraham et al. 2022).
Another popular method for estimating plasma bulk parameters from in-situ observations is the calculation of the VDF's velocity moments. This method estimates the average (bulk) parameters of the plasma via numerical integration of the constructed VDFs, weighted by velocity dyadic products. For instance, the zeroth-order moment gives the plasma density, while the first- and second-order moments give the plasma bulk velocity and pressure (or temperature), respectively. Compared to the fitting method, the calculation of moments is faster, demanding much less computational time. This is because, unlike the fitting method, the calculation of moments does not require the evaluation of an analytical function. Some studies have calculated the velocity moments of VDFs constructed in several environments, such as the Saturnian magnetotail (Dialynas et al. 2018), the Jovian magnetosphere (Mauk et al. 2004), and the solar wind (e.g., Marsch et al. 1982; Kasper 2003; Nicolaou et al. 2021).
Because the calculation of moments demands very little computational time, it is suitable for on-board data processing. This is extremely beneficial in missions requiring high time resolution measurements, or when the plasma parameters must be calculated and downlinked very fast (e.g., on space weather monitoring missions, Nicolaou et al. 2020a). However, if the original measurements are not downlinked, we cannot construct the plasma VDFs on-ground and evaluate the accuracy of the derived plasma parameters. Therefore, it is important to evaluate our processing methods using models of realistic plasma and measurement conditions. Background noise is one factor that can result in erroneous calculations of plasma parameters, depending on the signal-to-noise ratio. A complete evaluation of the derived moments should quantify the effects of noise, which are often overlooked.
Here, we examine the inaccuracies of calculated moments using simulations of in-situ plasma observations of solar wind protons, with a uniform background noise. The comparison between the analysis results of the simulated observations and the plasma model input allows us to expose and quantify the effects of the background noise on the final product of the analysis. In Sect. 2, we explain the method we follow to simulate proton plasma observations with background noise, obtained by a typical electrostatic analyzer design. Further, we explain how we calculate the plasma bulk parameters via the velocity moments of the velocity distribution functions constructed from the modeled measurements. In Sect. 3, we show the comparison between the output moments and the input parameters for different noise levels. This comparison quantifies the accuracy of the derived bulk parameters. In Sect. 4, we discuss our results and we evaluate a novel method to remove the uniform background noise from the measurements. Finally, Sect. 5 summarizes our study and lists our conclusions.

Methodology
First, we consider a typical top-hat electrostatic analyzer instrument, obtaining measurements of space plasma particles. Our concept instrument is similar to the design of the Solar Wind Analyser's Proton Alpha Sensor (SWA-PAS, Owen et al. 2020) on board Solar Orbiter and measures the number of particles C in specific energy steps E, elevation directions Θ, and azimuth directions Φ. As we show in Fig. 1, the elevation direction Θ is defined by the angle between the particle velocity vector and the top-hat plane (x-y plane) of the instrument, while the azimuth direction Φ is defined by the angle between the projection of the particle velocity vector on the top-hat plane and the x-axis. The field-of-view (FOV) of the instrument covers elevation directions from −22.5° to +22.5° in 9 Θ steps, while the azimuth direction is resolved in 11 azimuth sectors, covering the range from −24° to +42° (see Fig. 1). Finally, our concept instrument measures particles with energies between 200 eV and 20 keV. This energy range is covered in 96 exponentially spaced steps E (see also Owen et al. 2020 and the instrument models by Nicolaou et al. 2018, 2019).
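The sampling grid described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quoted angular ranges are taken to be the outer edges of the field of view, so that pixel centres lie half a bin inside them, and the 96 energy steps are taken to be log-uniformly spaced between the two quoted limits. All variable names are hypothetical.

```python
import numpy as np

# 96 exponentially (log-uniformly) spaced energy steps from 200 eV to 20 keV
energy_ev = np.geomspace(200.0, 20.0e3, 96)

# ASSUMPTION: the quoted angular ranges are FOV edges; pixel centres are bin midpoints
theta_edges = np.linspace(-22.5, 22.5, 9 + 1)   # 9 elevation bins, 5 deg wide
phi_edges = np.linspace(-24.0, 42.0, 11 + 1)    # 11 azimuth sectors, 6 deg wide
theta_deg = 0.5 * (theta_edges[:-1] + theta_edges[1:])
phi_deg = 0.5 * (phi_edges[:-1] + phi_edges[1:])

# Constant ratio between consecutive energy steps
step_ratio = energy_ev[1] / energy_ev[0]

print("elevation centres (deg):", theta_deg)
print("azimuth centres (deg):  ", phi_deg)
print("energy step ratio:      ", step_ratio)
```

Under these assumptions the grid contains pixels centred at Θ = 0° and Φ = 9°, the directions in which the simulated solar wind beam is placed.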
The expected number of counts in each instrument pixel is

$$C_{\mathrm{exp}}(E,\Theta,\Phi) = \frac{2}{m^{2}}\, G_f\, E^{2}\, f(E,\Theta,\Phi)\, \Delta\tau, \tag{1}$$

where $G_f$ is the instrument's geometric factor, which we consider constant for all E, Θ, and Φ. In practice, the geometric factor is determined from laboratory calibrations (e.g., Owen et al. 2020). $\Delta\tau$ is the acquisition time for each measurement (obtained at each E, Θ, Φ) and $f(E,\Theta,\Phi)$ is the VDF of the plasma particles, with the particle velocity vector expressed in E, Θ, Φ. The formula in Eq. (1) assumes that the VDF does not vary over the E, Θ, Φ pixel acceptance bandwidth (e.g., Nicolaou et al. 2019, 2020a,b, 2021). Furthermore, the VDF is assumed to be invariant with time over the full E, Θ scan of the instrument. For our simulations, we consider solar wind plasma protons with their velocities $\vec{U}$ (or energies) following a Maxwellian distribution function:

$$f(E,\Theta,\Phi) = N\left(\frac{m}{2\pi k_B T}\right)^{3/2}\exp\left[-\frac{E + E_0 - 2\sqrt{E E_0}\,\cos\omega(\Theta,\Phi)}{k_B T}\right], \tag{2}$$

where $N$ is the number density of the simulated protons with mass $m \sim 1.67 \times 10^{-27}$ kg, $T$ is their temperature, $E_0$ their bulk energy, $\omega(\Theta,\Phi)$ the angle between the particle velocity vector $\vec{U}$ and the bulk velocity vector $\vec{U}_0$, and $k_B$ is the Boltzmann constant. We simulate plasma protons with input bulk parameters $N = N_{\mathrm{in}} = 35$ cm$^{-3}$, $U_0 = U_{0,\mathrm{in}} \sim 440$ km s$^{-1}$ ($E_0 = 1$ keV) towards Θ = 0° and Φ = 9°, and $T = T_{\mathrm{in}} = 20$ eV. This set of input parameters models typical solar wind protons at ∼1 au (e.g., Barouch 1977).
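To account for the statistical measurement error, we model the signal counts in each E, Θ, Φ pixel of the instrument, assuming that each measurement $C_{\mathrm{signal}}$ appears with a probability that follows the Poisson distribution function, such that

$$P(C_{\mathrm{signal}}) = \frac{e^{-C_{\mathrm{exp}}}\, C_{\mathrm{exp}}^{\,C_{\mathrm{signal}}}}{C_{\mathrm{signal}}!}. \tag{3}$$

Then, we add a uniform background noise of a certain level $n_l$ to each measurement. The noise also follows the Poisson distribution,

$$P(C_{\mathrm{noise}}) = \frac{e^{-n_l}\, n_l^{\,C_{\mathrm{noise}}}}{C_{\mathrm{noise}}!}, \tag{4}$$

so that the final measurement in each pixel is

$$C(E,\Theta,\Phi) = C_{\mathrm{signal}}(E,\Theta,\Phi) + C_{\mathrm{noise}}(E,\Theta,\Phi). \tag{5}$$

Fig. 2 Left panels show the logarithm of the number of counts $\log_{10}(C)$ as a function of the logarithm of the sampled energy $\log_{10}(E)$ and elevation angle Θ, obtained at the azimuth sector centered at Φ = 9°. Right panels show $\log_{10}(C)$ as a function of $\log_{10}(E)$ and azimuth angle Φ, obtained at elevation bins centered at Θ = 0°. For the specific plasma parameters, the simulated counts peak at ∼1000 counts. Top panels correspond to noise level $n_l = 0$ counts, middle panels to $n_l = 1$ count, and bottom panels to $n_l = 10$ counts.

Typically, we construct the observed particle VDF $f_{\mathrm{out}}$ using the inverse of the formula in Eq. (1), such that

$$f_{\mathrm{out}}(E,\Theta,\Phi) = \frac{m^{2}\, C(E,\Theta,\Phi)}{2\, G_f\, E^{2}\, \Delta\tau}. \tag{6}$$

We then calculate the statistical moments of $f_{\mathrm{out}}$, from which we get the plasma bulk parameters (see also Nicolaou et al. 2020a). We calculate the density as:

$$N_{\mathrm{out}} = \sqrt{\frac{2}{m^{3}}}\sum_{E}\sum_{\Theta}\sum_{\Phi} f_{\mathrm{out}}\,\sqrt{E}\,\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi, \tag{7}$$

while the velocity components are

$$U_{x,\mathrm{out}} = \frac{2}{m^{2} N_{\mathrm{out}}}\sum_{E}\sum_{\Theta}\sum_{\Phi} f_{\mathrm{out}}\,E\,\cos^{2}\Theta\,\cos\Phi\,\Delta E\,\Delta\Theta\,\Delta\Phi, \tag{8}$$

$$U_{y,\mathrm{out}} = \frac{2}{m^{2} N_{\mathrm{out}}}\sum_{E}\sum_{\Theta}\sum_{\Phi} f_{\mathrm{out}}\,E\,\cos^{2}\Theta\,\sin\Phi\,\Delta E\,\Delta\Theta\,\Delta\Phi, \tag{9}$$

and

$$U_{z,\mathrm{out}} = \frac{2}{m^{2} N_{\mathrm{out}}}\sum_{E}\sum_{\Theta}\sum_{\Phi} f_{\mathrm{out}}\,E\,\cos\Theta\,\sin\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi, \tag{10}$$

respectively, and we calculate the scalar temperature as:

$$k_B T_{\mathrm{out}} = \frac{m}{3 N_{\mathrm{out}}}\sqrt{\frac{2}{m^{3}}}\sum_{E}\sum_{\Theta}\sum_{\Phi} f_{\mathrm{out}}\,w^{2}\,\sqrt{E}\,\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi, \tag{11}$$

where

$$w^{2} = \left(U_x - U_{x,\mathrm{out}}\right)^{2} + \left(U_y - U_{y,\mathrm{out}}\right)^{2} + \left(U_z - U_{z,\mathrm{out}}\right)^{2} \tag{12}$$

is the squared particle speed in the plasma rest frame, with $U_x = \sqrt{2E/m}\,\cos\Theta\cos\Phi$, $U_y = \sqrt{2E/m}\,\cos\Theta\sin\Phi$, and $U_z = \sqrt{2E/m}\,\sin\Theta$. In the above formulas, $\Delta E$, $\Delta\Theta$, and $\Delta\Phi$ are the differences between consecutive E, Θ, and Φ pixels, respectively.

Finally, we quantify the accuracy of the derived moments by investigating the ratios $N_{\mathrm{out}}/N_{\mathrm{in}}$, $U_{\mathrm{out}}/U_{\mathrm{in}}$, and $T_{\mathrm{out}}/T_{\mathrm{in}}$ for different input noise levels. In the next section we show how these ratios deviate from unity for increasing noise levels, revealing potential misestimations of the plasma bulk parameters when those are estimated from the moments of the constructed plasma VDFs.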
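The simulation and moment calculation described in this section can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the standard electrostatic-analyzer relation $C_{\mathrm{exp}} = (2/m^2)\,G_f\,E^2 f\,\Delta\tau$ and the velocity-space volume element $\sqrt{2/m^3}\,\sqrt{E}\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi$. The values of `G_F` and `DTAU` are not given in the text; they are hypothetical and chosen only so that the count peak is of the order of 1000. Pixel centres are assumed to be bin midpoints, and all function names are illustrative.

```python
import numpy as np

M_P = 1.67e-27          # proton mass (kg), as quoted in the text
EV = 1.602176634e-19    # 1 eV in joules
G_F = 2.0e-9            # ASSUMED geometric factor (m^2 sr eV/eV), not from the paper
DTAU = 1.0e-3           # ASSUMED acquisition time per pixel (s), not from the paper

# Pixel centres (assumed bin midpoints) and 3-D grids with axes (E, Theta, Phi)
E = np.geomspace(200.0, 20.0e3, 96) * EV
TH = np.deg2rad(np.linspace(-20.0, 20.0, 9))
PH = np.deg2rad(np.linspace(-21.0, 39.0, 11))
Eg, Tg, Pg = np.meshgrid(E, TH, PH, indexing="ij")
D_E = np.gradient(E)[:, None, None]        # spacing between consecutive E steps
D_TH, D_PH = np.deg2rad(5.0), np.deg2rad(6.0)

# Velocity-space volume element d^3U expressed in (E, Theta, Phi)
VOL = np.sqrt(2.0 / M_P**3) * np.sqrt(Eg) * np.cos(Tg) * D_E * D_TH * D_PH
# Cartesian velocity components of every pixel, shape (3, 96, 9, 11)
UVEC = np.sqrt(2.0 * Eg / M_P) * np.array(
    [np.cos(Tg) * np.cos(Pg), np.cos(Tg) * np.sin(Pg), np.sin(Tg)])


def expected_counts(n, kt, e0, th0, ph0):
    """Expected counts per pixel for a drifting Maxwellian (SI units)."""
    cos_w = np.cos(Tg) * np.cos(th0) * np.cos(Pg - ph0) + np.sin(Tg) * np.sin(th0)
    f = n * (M_P / (2.0 * np.pi * kt))**1.5 * np.exp(
        -(Eg + e0 - 2.0 * np.sqrt(Eg * e0) * cos_w) / kt)
    return 2.0 / M_P**2 * G_F * Eg**2 * f * DTAU


def simulate_counts(c_exp, n_l, rng):
    """Poisson-distributed signal plus uniform Poisson background of level n_l."""
    return rng.poisson(c_exp) + rng.poisson(n_l, size=c_exp.shape)


def moments(counts):
    """Density (m^-3), bulk speed (m/s) and k_B*T (J) from measured counts."""
    f_out = M_P**2 * counts / (2.0 * G_F * Eg**2 * DTAU)
    n = np.sum(f_out * VOL)
    u = np.sum(f_out * VOL * UVEC, axis=(1, 2, 3)) / n
    w2 = np.sum((UVEC - u[:, None, None, None])**2, axis=0)
    kt = M_P / (3.0 * n) * np.sum(f_out * VOL * w2)
    return n, np.linalg.norm(u), kt


N_IN, KT_IN, E0_IN = 35.0e6, 20.0 * EV, 1000.0 * EV   # 35 cm^-3, 20 eV, 1 keV
U_IN = np.sqrt(2.0 * E0_IN / M_P)
C_EXP = expected_counts(N_IN, KT_IN, E0_IN, 0.0, np.deg2rad(9.0))

rng = np.random.default_rng(0)
for n_l in (0.0, 0.1, 1.0, 10.0):
    n, u, kt = moments(simulate_counts(C_EXP, n_l, rng))
    print(f"n_l={n_l:5.1f}  N/N_in={n / N_IN:.3f}  "
          f"U/U_in={u / U_IN:.3f}  T/T_in={kt / KT_IN:.3f}")
```

Because the geometric factor and acquisition time are placeholders, the printed ratios should be read as indicative of the trend with $n_l$ only, not as a reproduction of the figures.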

Results
In the top panels of Fig. 3, we show histograms of $N_{\mathrm{out}}$, $U_{\mathrm{out}}$, and $T_{\mathrm{out}}$ for 1000 simulated measurements for the input plasma parameters and noise level listed in the figure caption ($n_l = 0.1$ counts). The bottom panels of Fig. 3 show the histograms of $N_{\mathrm{out}}/N_{\mathrm{in}}$, $U_{\mathrm{out}}/U_{\mathrm{in}}$, and $T_{\mathrm{out}}/T_{\mathrm{in}}$ for the same simulated samples. According to this result, even when the noise level is considerably smaller than the peak of the signal (∼1000 counts), the calculation of the VDF moments significantly misestimates the plasma temperature (by a factor of ∼1.37).
In Fig. 4, we plot the ratios $N_{\mathrm{out}}/N_{\mathrm{in}}$, $U_{\mathrm{out}}/U_{\mathrm{in}}$, and $T_{\mathrm{out}}/T_{\mathrm{in}}$ for different noise levels, using the same input plasma conditions in our simulations ($N_{\mathrm{in}} = 35$ cm$^{-3}$, $\vec{U}_{\mathrm{in}} \sim 440$ km s$^{-1}$ along Θ = 0° and Φ = 9°, and $T_{\mathrm{in}} = 20$ eV). The main plots show the noise level on a logarithmic scale, while the inset in each panel shows the same plot on a linear scale. First, our results show that for non-zero noise, all the plasma parameters are overestimated. This is expected, considering that the E steps of our instrument increase exponentially and the uniform background noise is misinterpreted as part of the VDF, increasing the overall VDF, especially in the high energy range. Second, we find that among the three bulk parameters we examine here, the plasma temperature suffers the largest inaccuracies, while the bulk speed is affected the least. More specifically, the temperature is overestimated by a factor of ∼18 when the noise level is ∼10 counts (1% of the signal peak). For the same $n_l$, the calculated density is overestimated by a factor of ∼2.3 and the bulk speed by a factor of ∼1.03. Interestingly, even a typical noise level, such as 0.1% of the peak ($n_l = 1$ count in our simulations), has a significant impact on the calculations of the bulk parameters of the plasma we model here.

Fig. 3 Histograms of (top) $N_{\mathrm{out}}$, $U_{\mathrm{out}}$, and $T_{\mathrm{out}}$, and (bottom) $N_{\mathrm{out}}/N_{\mathrm{in}}$, $U_{\mathrm{out}}/U_{\mathrm{in}}$, and $T_{\mathrm{out}}/T_{\mathrm{in}}$, derived from the analysis of 1000 simulated measurement samples considering plasma with $N_{\mathrm{in}} = 35$ cm$^{-3}$, $U_{\mathrm{in}} \sim 440$ km s$^{-1}$ along Θ = 0° and Φ = 9°, and $T_{\mathrm{in}} = 20$ eV, and noise level $n_l = 0.1$ counts
The misestimation of the bulk parameters depends on the input plasma parameters as well, in a rather complicated way (see also Nicolaou et al. 2018, 2019, 2020a). Here, we do not provide a detailed quantification of the accuracy of the calculated moments for a wide range of plasmas. Instead, we want to expose potential inaccuracies that can result from noise, even when the noise level seems insignificant compared to the signal. In addition, we demonstrate methodologies suitable to quantify the noise effects for any input plasma VDF, observed by any other instrument operating similarly to our concept instrument. In the next section, we discuss our results further and we evaluate a method that significantly reduces potential inaccuracies.

Discussion
As expected, background noise in in-situ measurements of plasma particles leads to erroneous calculations of the velocity moments of the constructed plasma particle VDFs. We quantify the errors of the calculations as functions of the noise level, using simulated plasma observations of a typical solar wind plasma. Figure 4 shows that there is a linear relationship between the estimated plasma density and the background noise. By combining Eqs. (6) and (7), we get $N_{\mathrm{out}}$ as a function of the measured counts:

$$N_{\mathrm{out}} = \sqrt{\frac{m}{2}}\,\frac{1}{G_f\,\Delta\tau}\sum_{E}\sum_{\Theta}\sum_{\Phi} C(E,\Theta,\Phi)\,E^{-3/2}\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi, \tag{13}$$

which is the same equation as in Nicolaou and Livadiotis (2020), expressed in terms of particle energy instead of particle speed. Since $C(E,\Theta,\Phi) = C_{\mathrm{signal}}(E,\Theta,\Phi) + C_{\mathrm{noise}}(E,\Theta,\Phi)$, we split the sum into two terms:

$$N_{\mathrm{out}} = \sqrt{\frac{m}{2}}\,\frac{1}{G_f\,\Delta\tau}\sum_{E}\sum_{\Theta}\sum_{\Phi} C_{\mathrm{signal}}\,E^{-3/2}\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi + \sqrt{\frac{m}{2}}\,\frac{1}{G_f\,\Delta\tau}\sum_{E}\sum_{\Theta}\sum_{\Phi} C_{\mathrm{noise}}\,E^{-3/2}\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi. \tag{14}$$

The first term on the right-hand side is the actual plasma density ($N_{\mathrm{in}}$), while the second term is the contribution from noise. We can then express $N_{\mathrm{out}}/N_{\mathrm{in}}$ as:

$$\frac{N_{\mathrm{out}}}{N_{\mathrm{in}}} = 1 + \frac{1}{N_{\mathrm{in}}}\sqrt{\frac{m}{2}}\,\frac{1}{G_f\,\Delta\tau}\sum_{E}\sum_{\Theta}\sum_{\Phi} C_{\mathrm{noise}}\,E^{-3/2}\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi. \tag{15}$$

For the cases we examine here, $C_{\mathrm{noise}}$ is a function of a uniform $n_l$, as shown in Eq. (4). Then, for the same input plasma parameters, we expect a linear relationship between $N_{\mathrm{out}}/N_{\mathrm{in}}$ and $n_l$, on average. In Fig. 5, we show our model results for $N_{\mathrm{out}}/N_{\mathrm{in}}$, along with the analytical function in Eq. (15). The model results agree closely with the analytical expectation. Following similar algebra for the higher-order moments shows that the other plasma parameters do not depend linearly on the noise level, as shown in Fig. 4.

We find that the plasma temperature is significantly overestimated even for noise levels that are more than 1000 times smaller than the peak of the measured counts. This is reasonable, considering that the temperature moment is the mean kinetic energy of the particles in the plasma rest frame, and background noise in energy and angular pixels away from the VDF peak is misinterpreted as a measurement of particles with high kinetic energies, increasing the mean energy value.
This misestimation can be reduced by using beam-tracking methods (e.g., De Keyser et al. 2018), or any similar method that excludes pixels away from the VDF peak from the analysis. The successful use of these techniques in data analyses depends on the ability to select the optimal energy and angular range of the observations. Nevertheless, even a proper selection of the pixels to analyze does not solve the problem entirely, as the analyzed parts of the VDFs are still polluted by noise.
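As a numerical illustration of the linear relation between $N_{\mathrm{out}}/N_{\mathrm{in}}$ and $n_l$ discussed above, the following sketch evaluates the density moment directly from counts and compares the analytical noise term with a Monte Carlo average over noise-only samples. It assumes that the density moment written in terms of counts is $\sqrt{m/2}\,(G_f\Delta\tau)^{-1}\sum C\,E^{-3/2}\cos\Theta\,\Delta E\,\Delta\Theta\,\Delta\Phi$. As before, `G_F` and `DTAU` are hypothetical values that do not appear in the text, and pixel centres are assumed to be bin midpoints.

```python
import numpy as np

M_P = 1.67e-27          # proton mass (kg), as quoted in the text
EV = 1.602176634e-19    # 1 eV in joules
G_F = 2.0e-9            # ASSUMED geometric factor (m^2 sr eV/eV)
DTAU = 1.0e-3           # ASSUMED acquisition time per pixel (s)
N_IN = 35.0e6           # input density, 35 cm^-3 in m^-3

E = np.geomspace(200.0, 20.0e3, 96) * EV
TH = np.deg2rad(np.linspace(-20.0, 20.0, 9))
PH = np.deg2rad(np.linspace(-21.0, 39.0, 11))
Eg, Tg, _ = np.meshgrid(E, TH, PH, indexing="ij")

# Per-pixel weight of one count in the density moment
WEIGHT = (Eg**-1.5 * np.cos(Tg) * np.gradient(E)[:, None, None]
          * np.deg2rad(5.0) * np.deg2rad(6.0))


def density_from_counts(counts):
    """Zeroth moment written directly in terms of the counts."""
    return np.sqrt(M_P / 2.0) / (G_F * DTAU) * np.sum(counts * WEIGHT)


# Density added per unit noise level, relative to N_in (slope of the linear law)
slope = density_from_counts(np.ones_like(WEIGHT)) / N_IN


def expected_density_ratio(n_l):
    """Average N_out/N_in for a uniform Poisson background of level n_l."""
    return 1.0 + slope * n_l


# Monte Carlo check with noise-only samples: their mean density is the noise term
rng = np.random.default_rng(0)
n_l = 10.0
noise_density = np.mean([
    density_from_counts(rng.poisson(n_l, size=WEIGHT.shape)) for _ in range(200)])

print("analytical noise term / N_in :", slope * n_l)
print("Monte Carlo noise term / N_in:", noise_density / N_IN)
print("expected N_out/N_in          :", expected_density_ratio(n_l))
```

The slope depends on the instrument through $G_f\,\Delta\tau$ and the energy table, and on the plasma only through $N_{\mathrm{in}}$, which is why the density ratio is linear in $n_l$ while the higher-order moments are not.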
Our results indicate that the calculation of the statistical moments requires a proper treatment of the observations in order to minimize the effects of noise, even when the noise level seems insignificant. Although on-ground analyses can investigate the 3D data properly and come up with sophisticated noise removal tools, some missions require on-board moment calculations to enable high time resolution science within the available telemetry, or to provide early solar wind estimations for space weather forecasting.
One idea is to measure the noise level on-board, using a background anode (e.g., McComas et al. 2017), which operates regularly but has a blocked FOV. Additionally, we could perform on-board estimations similar to previous on-ground analyses, which have estimated background noise and ultra-violet contamination by calculating the median number of counts over the measurement samples (e.g., Nilsson et al. 2012; Rojas-Castillo et al. 2018). This robust approach is reasonable when the useful signal is not spread over the majority of the instrument's pixels. In cases where the signal is spread over a large portion of the instrument's energy and angular range, the median value will be affected by the signal and will be larger than the actual background counts.

Fig. 6 (Right) From top to bottom, $N_{\mathrm{out}}/N_{\mathrm{in}}$, $U_{\mathrm{out}}/U_{\mathrm{in}}$, and $T_{\mathrm{out}}/T_{\mathrm{in}}$ as functions of the input $n_l$, calculated after removing the estimated noise level from the observations
Here, we propose to estimate the noise level from the average number of counts obtained in an energy range in which we do not expect to measure any particles (e.g., the largest E step). Typically, the energy tables of plasma instruments include E steps beyond the expected energy range of the measured particles. If the noise is quasi-uniform over the entire energy and angular range of the instrument, the noise level estimated from the background counts in a limited energy range is representative of the noise level in every E, Θ, Φ pixel. We validate such a method by estimating the noise level as the average number of counts obtained at the highest energy sampled by our concept instrument (96th energy step, $E_{96} = 20$ keV). In this case,

$$n_{l,\mathrm{est}} = \frac{1}{N_\Theta N_\Phi}\sum_{\Theta}\sum_{\Phi} C(E_{96},\Theta,\Phi), \tag{16}$$

where $N_\Theta = 9$ and $N_\Phi = 11$ are the numbers of elevation and azimuth pixels. In the left panel of Fig. 6, we show the estimated noise level as a function of the input noise level for simulations considering $N_{\mathrm{in}} = 35$ cm$^{-3}$, $\vec{U}_{\mathrm{in}} = 440$ km s$^{-1}$ along Θ = 0° and Φ = 9°, and $T_{\mathrm{in}} = 20$ eV. Each data point is the mean value over 1000 samples with the same input $n_l$, while the error bar is the associated standard deviation. All the data points lie along the y = x line (red dashed), confirming that the calculated noise level is accurate. Therefore, we can construct a corrected VDF using Eq. (6), after we subtract the estimated noise level from the raw counts, such that

$$f_{\mathrm{out,corr}}(E,\Theta,\Phi) = \frac{m^{2}\left[C(E,\Theta,\Phi) - n_{l,\mathrm{est}}\right]}{2\, G_f\, E^{2}\, \Delta\tau}. \tag{17}$$

The moments are then calculated exactly as in Eqs. (7)-(12), using $f_{\mathrm{out,corr}}$. The right panels of Fig. 6 show the ratios $N_{\mathrm{out}}/N_{\mathrm{in}}$, $U_{\mathrm{out}}/U_{\mathrm{in}}$, and $T_{\mathrm{out}}/T_{\mathrm{in}}$ as functions of the input $n_l$, calculated after subtracting the estimated noise level from the measured (simulated) counts. The data points correspond to the mean values over 1000 samples, while the error bars are the associated standard deviations. With this analysis, the mean values of the estimated bulk parameters do not deviate significantly from the input, which is a great improvement compared to the analysis of the raw counts.
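The noise-removal procedure described above can be sketched as follows: the noise level is estimated as the mean number of counts over all angular pixels at the highest energy step, subtracted from every pixel, and the moments are recomputed. This is a minimal illustration under the same hypothetical assumptions as before (`G_F`, `DTAU`, and pixel centres at bin midpoints are not taken from the text). One further choice made here, and not stated in the text, is that negative values after subtraction are kept rather than clipped to zero, since clipping would reintroduce a positive bias. The sketch uses 200 samples instead of 1000 only to keep it quick.

```python
import numpy as np

M_P, EV = 1.67e-27, 1.602176634e-19   # proton mass (kg, from the text); 1 eV in J
G_F, DTAU = 2.0e-9, 1.0e-3            # ASSUMED geometric factor and acquisition time

E = np.geomspace(200.0, 20.0e3, 96) * EV
TH = np.deg2rad(np.linspace(-20.0, 20.0, 9))
PH = np.deg2rad(np.linspace(-21.0, 39.0, 11))
Eg, Tg, Pg = np.meshgrid(E, TH, PH, indexing="ij")
VOL = (np.sqrt(2.0 / M_P**3) * np.sqrt(Eg) * np.cos(Tg)
       * np.gradient(E)[:, None, None] * np.deg2rad(5.0) * np.deg2rad(6.0))
UVEC = np.sqrt(2.0 * Eg / M_P) * np.array(
    [np.cos(Tg) * np.cos(Pg), np.cos(Tg) * np.sin(Pg), np.sin(Tg)])

# Input plasma: 35 cm^-3, 20 eV, 1 keV bulk energy along Theta = 0, Phi = 9 deg
N_IN, KT_IN, E0_IN, PH0 = 35.0e6, 20.0 * EV, 1000.0 * EV, np.deg2rad(9.0)
U_IN = np.sqrt(2.0 * E0_IN / M_P)
cos_w = np.cos(Tg) * np.cos(Pg - PH0)
f_in = N_IN * (M_P / (2.0 * np.pi * KT_IN))**1.5 * np.exp(
    -(Eg + E0_IN - 2.0 * np.sqrt(Eg * E0_IN) * cos_w) / KT_IN)
C_EXP = 2.0 / M_P**2 * G_F * Eg**2 * f_in * DTAU


def moment_ratios(counts):
    """Return (N_out/N_in, U_out/U_in, T_out/T_in) for a set of counts."""
    f_out = M_P**2 * counts / (2.0 * G_F * Eg**2 * DTAU)
    n = np.sum(f_out * VOL)
    u = np.sum(f_out * VOL * UVEC, axis=(1, 2, 3)) / n
    w2 = np.sum((UVEC - u[:, None, None, None])**2, axis=0)
    kt = M_P / (3.0 * n) * np.sum(f_out * VOL * w2)
    return n / N_IN, np.linalg.norm(u) / U_IN, kt / KT_IN


def estimate_noise(counts):
    """Mean counts over all angular pixels of the highest energy step."""
    return counts[-1].mean()


rng = np.random.default_rng(42)
n_l = 10.0
est, raw, corr = [], [], []
for _ in range(200):
    counts = rng.poisson(C_EXP) + rng.poisson(n_l, size=C_EXP.shape)
    n_est = estimate_noise(counts)
    est.append(n_est)
    raw.append(moment_ratios(counts))
    corr.append(moment_ratios(counts - n_est))   # negatives kept on purpose

est, raw, corr = np.array(est), np.array(raw), np.array(corr)
print("estimated noise level:", est.mean(), "+/-", est.std())
print("raw ratios (N, U, T):      ", raw.mean(axis=0))
print("corrected ratios (N, U, T):", corr.mean(axis=0), "+/-", corr.std(axis=0))
```

The standard deviations of the corrected ratios give a quick estimate of the residual statistical uncertainty for a given noise level, under the assumed instrument parameters.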
However, there is a statistical uncertainty (error bars) that increases with $n_l$. Our methodology can estimate this uncertainty for any similar application in space and judge whether it threatens a specific mission concept and its goals. Such evaluations are crucial for the success of missions employing in-situ plasma observations. This study uses the concept of a uniform background noise. In practice, however, the background noise may depend on energy and/or particle direction. For instance, observations by the electron sensor of the Jovian Auroral Distributions Experiment (McComas et al. 2017) on board Juno reveal background noise at energies above ∼1 keV, caused by penetrating radiation and the detector's internal noise. The noise level peaks at 2 Hz. In another example, the Ion Mass Analyzer on board Mars Express (Barabash et al. 2006) shows two types of noise: one that is independent of energy and elevation direction, and one that is energy dependent. The Ion Composition Analyser (ICA, Nilsson et al. 2007) has similar noise profiles, and the noise background increases as the spacecraft gets closer to the Sun. In such cases, if the noise profile is not characterized, or cannot be characterized by laboratory calibrations, it may be possible to construct it using flight observations obtained over a range of energies, angles, and environmental conditions. Finally, it is important to note that noise does not affect only the calculation of statistical moments. Nicolaou et al. (2022) use simulations and laboratory experiments to demonstrate and quantify the effects of background noise on the accuracy of plasma parameters derived by fitting noisy observations with model distributions.

Summary and conclusions
We model observations of typical solar wind protons, obtained by a typical electrostatic analyzer. We consider different levels of background noise in the observations and we use a standard analysis method to construct the underlying proton velocity distribution functions. We then calculate the velocity moments of the constructed VDFs to determine the proton density, bulk speed, and temperature. Our analysis concludes that:
• The presence of noise in typical in-situ measurements of 3D velocity distribution functions of plasma particles leads to misestimations of the statistical moments. Even small noise levels (<1% of the signal peak) cause significant inaccuracies. Therefore, derivations of plasma moments must take extra care to reduce the effects of noise;
• Of the bulk parameters we examine here, plasma temperature is affected the most by noise, while bulk speed is affected the least;
• In the case of uniform background noise, we can estimate the noise level fairly accurately from the average number of counts obtained at an energy where no signal is expected. Then, the significant inaccuracies we demonstrate in this study can be reduced by subtracting the estimated noise level from the raw measurements.