Improving topside ionospheric empirical model using FORMOSAT-7/COSMIC-2 data

The precise description of the topside ionosphere using an ionospheric empirical model has always been a work in progress. The NeQuick topside model is greatly enhanced by adopting radio occultation data from the FORMOSAT-7/COSMIC-2 constellation. The topside scale height H formulation in the NeQuick model is simplified into a linear combination of an empirically deduced parameter H0 and a gradient parameter g. The two-dimensional grid maps for the H0 and g parameters are generated as a function of the foF2 and hmF2 parameters. Corrected H0 and g values can be interpolated easily from two grid maps, allowing a more accurate description of the topside ionosphere than the original NeQuick model. The improved NeQuick model (namely NeQuick_GRID model) is statistically validated by comparing it to Total Electron Content (TEC) integrated from COSMIC-2 electron density profiles and space-borne TEC derived from onboard Global Navigation Satellite System observations, respectively. The results show that the NeQuick_GRID model can reduce relative errors by 38% approximately when compared to the integrated TEC from COSMIC profiles and by 15% approximately when compared to the space-borne TEC. Furthermore, a long-term statistical analysis during years of both high and low solar activities reveals that grid maps of the scale factor H0 and the gradient parameter g have very similar features, allowing rapid and efficient acquisition of high-precision electron density during different solar activity.


Introduction
It is critical to describe current ionospheric conditions in solving radio communication, radar, and navigation challenges (Kotova et al. 2020). Empirical models of three-dimensional electron density distribution in the ionosphere have been developed for both global and regional applications (Bilitza et al. 2017;Chen et al. 2020;Hochegger et al. 2000;Jakowski and Hoque 2018;Leong et al. 2015). The NeQuick model is a global empirical model that enables quick estimation of three-dimensional electron density distribution as well as the Total Electron Content (TEC) measurements up to Global Navigation Satellite System (GNSS) satellite altitudes (Nava Ren et al. 2020b). GNSS signals generally experience an ionospheric range delay proportional to TEC value, degrading positioning (Angrisano et al. 2013). To mitigate such effects, a thorough understanding, and modeling of the electron density distribution in the ionosphere-plasmasphere system is required (Angrisano et al. 2013;Montenbruck and Rodríguez 2020). The topside ionosphere, extending from the F2-layer peak height (hmF2) to the plasmasphere, provides a large contribution to TEC values and may change in a completely different way compared to the bottom-side ionosphere (Cherniak and Zakharenkova 2016;Zhu et al. 2016). As a consequence, an accurate description of the topside ionosphere is of great importance for space weather applications.
The initial measurement database has a significant impact on the empirical models' performance (Cherniak and Zakharenkova 2016). The relative lack of experimental electron density data hampers the topside ionosphere modeling, leaving limitations to the part above the hmF2 height. Empirical models, like the NeQuick model, have been widely applied in depicting the topside electron density (Coïsson et al. 2006). Since the real physical state is not currently available for accurately describing the plasma scale height, the NeQuick topside model is represented by a semi-Epstein layer with a height-dependent thickness parameter H (Nava et al. 2008). Indeed, the effective scale height is an empirical parameter calculated by fitting measured electron density values with analytical functions. For the NeQuick model, the topside ionosphere is evaluated by space-borne data Cherniak and Zakharenkova 2016;Kashcheyev and Nava 2019), which helps to find possible ways of improvement. Over the years, major efforts have been made to improve the NeQuick topside model, taking advantage of the increasing amount of available data (Pezzopane and Pignalberi 2019;Pignalberi et al. 2018). Kotova et al. (2020) updated the NeQuick empirical ionosphere models using the slant total electron content observed by ground-based GNSS receivers. Recent research on the calibration of the topside NeQuick scale height parameters suggests that a more refined description of these three parameters is needed to accurately describe the topside ionosphere (Pignalberi et al. 2020b(Pignalberi et al. , 2021Themens et al. 2018aThemens et al. , 2018b. Brunini et al. (2011) ingested the ground-based and space-borne data into the NeQuick model by adapting the NmF2 and hmF2 parameters. The current research focuses on understanding the limitations and possibilities of the topside NeQuick model. It is shown that the semi-Epstein layer formulation of the NeQuick topside can be improved by modifying its shape parameter k (Coïsson et al. 2006), namely the scale factor H 0 . COSMIC-2, a six-satellite Taiwan-USA mission, was launched on June 25, 2019. The fully operational COSMIC-2 satellites can produce 5000 high-resolution vertical profiles of bending angles per day. COSMIC-2 satellites also provide data arcs of TEC and vertical profiles of electron density (Schreiner et al. 2020), which support space weather operations and research and can also be used as an increasing data source for updating the NeQuick model.
In this study, we proposed an improved method for updating the height-dependent thickness parameter H in the NeQuick topside formulation, which is simplified into a linear combination of the scale factor H 0 and the gradient parameter g. We further constructed the grid maps of the H 0 and g parameters, respectively, from which the corrected H 0 and g parameters can be interpolated. In this way, an improvement in the representation of the electron density on the topside ionosphere would be achievable.

Reformulation of the NeQuick topside model
The topside analytical formulation of the NeQuick model is represented by a semi-Epstein layer with a height-dependent thickness parameter H , and the topside section of the electron density profile is expressed as follows: with The NeQuick topside scale height H is a function of three empirical factors: H 0 , g, and r. H 0 denotes the scale height at the F2-layer peak height (hmF2); g = 0.125 is the gradient for the entire topside profile; and r = 100 controls the asymptotic behavior of H at infinity (Pignalberi et al. 2021).
In Eq.
(2), r and g parameters are empirically set to constant values, whereas H 0 is modeled as a function of bottom-side parameters, such as the bottom-side thickness parameter and solar activity index (Nava et al. 2008). However, the topside ionosphere exhibits different behaviors compared with the bottom-side part (Themens et al. 2018b).
Let h = h − hm F2, then Eqs. (1) and (2) can be rewritten as: By simplification (Pignalberi et al. 2020a), Eq. (4) can be written that: Equation (5) shows that the topside scale height exhibits a linear dependence on h, with H 0 representing the intercept and g representing the slope. In this way, the NeQuick topside model can be reformulated as follows:

Generation of grid maps for H 0 and g parameters
As described by Pignalberi et al. (2020a), the topside scale height H can be analytically inverted from Eq. (1): So the topside scale height H can be determined for each height h by using measured N e (h), NmF2, and hmF2 values, for example, from radio occultation data of COSMIC-2 satellites. After getting a series of H values for each profile, we can then determine the two essential parameters, H 0 and g, by fitting H parameter. An example of the fitting procedure is shown in Fig. 1. In Fig. 1, the blue points are scale height which is calculated from the COSMIC-2 RO ionospheric profile by applying Eq. (7), the red line is a liner fitting for the blue points, and the green line is the topside scale height calculated by the NeQuick model at the same time and position. It can be seen from Fig. 1 that the topside scale height H derived from the NeQuick model differs a lot from that derived from the COSMIC-2 profile, which means the original NeQuick model cannot accurately describe actual ionospheric states. By applying Eq. (5), the intercept and slope of the fitting line (red line) correspond to H 0 andg, respectively. For the sake of distinguishing, the H 0 and g parameters derived from linear fitting are H 0 _cal and g_cal, and the H 0 and g parameters from the original NeQuick model are H 0 _NeQuick andg_NeQuick, respectively.
To acquire the corrected values of H 0 and g for any given time and location, the NeQuick topside formula (8) is forced to combine the F2-layer peak point (hmF2, NmF2) and the tangent point (h, Ne) provided by COSMIC profiles. This operation is carried out for each pair of co-located and simultaneously measured tangent points, which constitutes the dataset in the next section. Thus, the H 0 _cal and g_cal values are calculated almost all over the world and can be used for generating the corresponding grid maps in this study.
Specifically,H 0 _cal and g_cal values are modeled as a function of foF2 and hmF2 by applying the two-dimensional binning approach proposed by Pignalberi et al. (2018). The foF2 values can be converted from NmF2 values: where foF2 is the ordinary cutoff frequency of the F2 layer in Hz and NmF2 is the peak electron density of the F2 layer in el/m 3 . The exact modeling procedures are as follows (Pezzopane and Pignalberi 2019): (1) For every valid COSMIC-2 RO ionospheric profile, H 0 _cal and g_cal can be calculated, foF2 and hmF2 also can be obtained by reading profile data. (2) Each H 0 _cal and g_cal values are plotted in a scatter map according to the (hmF2, foF2) coordinate. (3) Two-dimensional grids are divided with a bin width of 0.25 MHz and 5 km for foF2 and hmF2, respectively. foF2 varies from 0 to 12 MHz and hmF2 varies from 180 to 420 km. (4) Calculate the median of H 0 and g values falling inside each bin when the number of values is greater than or equal to 10. Otherwise the bin is considered statistically not significant.
After establishing the two grid maps of H 0 and g, one can firstly interpolate the H 0 and g parameters by the known NmF2 and hmF2 values, then calculate the scale height H according to Eq. (5), and finally update the NeQuick topside formula, which is referred to as NeQuick_GRID (H 0 , g) topside model. Finally, using Eq. (6) of the NeQuick_GRID (H 0 , g) topside model, one may calculate the topside ionospheric electron density for a particular time and location.

COSMIC-2/FORMOSAT-7 radio occultation data
As a COSMIC follow-up mission, COSMIC-2/FORMOSAT-7 was launched on June 25, 2019, into a circular orbit (with 24°of inclination) at around 720 km altitude, consisting of six satellites. COSMIC-2 occultations are mainly distributed from 45°N to 45°S (Schreiner et al. 2020;Yue et al. 2014). COSMIC-2 is equipped with advanced GNSS RO receivers and has provided at least 5000 ionospheric electron density profiles every day since its launch. We utilized the profiles above hmF2 for constructing the NeQuick_GRID model. The topside model performances depend significantly on the quality of the topside profiles, so it is vital to perform a quality control process on COSMIC-2 profiles before modeling (Pedatella et al. 2021). The main quality control procedures are as follows: (1) Remove all occultation data below 150 km and invalid electron density profiles. (2) Smoothen the "burr" phenomena of electron density profiles. The relative residuals of the electron density profiles before and after smoothing are calculated based on Eq. (9) to quantitatively evaluate the intensity of the "burr" phenomena.
where N is the number of tangent points of each profile, and N raw and N smooth are the electron density before and after smoothing, respectively. The profile will be deleted when the bias is more significant than 20%. (3) Remove the profiles with abnormal NmF2 or hmF2. The normal range of NmF2 and hmF2 should vary from 2 × 10 10 el/m 3 to 2 × 10 12 el/m 3 and from 180 to 450 km, respectively.
The datasets used in this study are the electron density profiles (ionPrf format) file from COSMIC-2 level 2 data (https://data.cosmic.ucar.edu/gnss-ro/cosmic2/ postProc/level2/). The period is from the year 2020 DOY 001 to the year 2021 DOY 365. The total number of COSMIC-2 profiles is 3595197, and the number of profiles after the data quality control procedure is 1942179.

Evaluation method and accuracy index
The relative error and root mean square error (RMS) are utilized as indicators to evaluate the improved NeQuick model and the original one with respect to TEC reference values: where TEC ref denotes the TEC value obtained by the COS-MIC occultation profile from the minimum altitude hm F2 (near 150 km) to the maximum altitude h max (near the orbital altitude of the COSMIC-2 satellite, i.e., near 550 km), the formulation is: where TEC ref is the topside ionospheric TEC, h max is the max height of a radio occultation electron density profile, Ne(h) is the electron density for specific height h, Ne(h) is derived from topside ionospheric model or occultation profiles. TEC model denotes the TEC value in the corresponding altitude range based on either the improved NeQuick model or the original one at the time and location of occultation events, respectively.

Grid maps of H 0 and g parameters established and validation
We aim to figure out how the H 0 and g parameters calculated by COSMIC-2 profiles relate to the hmF2 and foF2 parameters. Figure 2a, b shows the scatter plots of the H 0 and g on DOY 211 in 2021, respectively. Each point corresponds to an occultation event, whose values of hmF2 and foF2 are calculated and utilized as the horizontal and vertical coordinates, respectively. Figure 2 reveals that both the H 0 and g parameters strongly correlate with the hmF2 and foF2 parameters.
We employed the two-dimensional binning approach mentioned in Sect. 2.2 to further generate grid maps based on the above H 0 and g scatter plots. Figure 3 shows the corresponding two-dimensional binned grid maps for Fig. 2. The total number of grids in Fig. 3 is 44 × 48 = 2112, and the value of each grid is the median value of all the data in the grid. Grids with less than ten data points will be set to zero. After such eliminations, up to 2000 or more occultation profiles per day are still obtained owing to a large amount of COSMIC-2 data. As a result, the valid grids account for more than half of the full grids.
With the grid maps, the H 0 and g parameters can be easily interpolated at any time and place once the hmF2 and foF2 parameters are obtained in advance. The steps for users to use H 0 and g grid maps are as follows:  The corrected H 0 and g parameters are then used to update the H parameters by Eq. (5), which are then substituted into the NeQuick topside model formula to improve the topside ionospheric empirical model. We call this improved model the NeQuick_GRID model.
Using the corrected H 0 and g parameters from 2020 to 2021, which are interpolated from the two-dimensional grid maps, we refined the NeQuick topside model using Eq. (6). Figure 4 depicts the relative errors of the NeQuick and NeQuick_GRID models from 2020 to 2021. The referenced TEC values are integrated from COSMIC-2 profiles. Table 1 lists the relevant statistical information. In Fig. 4, it is evident that the NeQuick_GRID model has a smaller relative error than the NeQuick model in most cases, proving the performance improvements of the NeQuick_GRID model.
In 2021, the average relative error of the original NeQuick model was 15.84%, while for the NeQuick_GRID model it was 9.69%, with an accuracy improvement of 38.89%. In 2020, the original and improved NeQuick models' relative errors are 17.66% and 10.81%, respectively, with an accuracy improvement of 38.79%. The results suggest that  the method proposed in this paper outperforms the original NeQuick model significantly. It's worth noting that the performances of the NeQuick_GRID model in 2020 and 2021 differed from each other, which could be due to the fact that the COSMIC-2 satellites had unstable orbits and were still in the instrument commissioning stage in 2020. After 2021, all satellites were in their final orbits, and the observation data quality was better. Furthermore, because empirical models such as NeQuick and IRI have difficulty modeling the ionosphere under the very low solar activity conditions of the last two solar activity cycle minima, the different solar activity levels in 2020 and 2021 may be another important reason for the different NeQuick_GRID model performances (Bilitza and Xiong 2021).  Figure 5 shows the RMS values in 2020 and 2021 of the NeQuick and NeQuick_GRID models for the TEC references derived from the COSMIC-2 profiles. The statistical result is listed in Table 2. Figure 5 reveals that the RMS values of the NeQuick_GRID model are almost always smaller than those of the original NeQuick model. In addition, the RMS errors are reduced by 38.50% from 1.74 TECu to 1.07 TECu in 2021 and by 32.58% from 1.78 TECu to 1.20 TECu in 2020. Figure 6 shows the bias histograms of the two models for DOY 208, DOY 210, DOY 212, and DOY 214 in 2021. As shown in Fig. 6, the bias distributions for the NeQuick_GRID model exhibit similarity and are more concentrated and closer to zero. The original NeQuick model, on the other hand, is more likely to overestimate the electron density, with the majority of bias values greater than zero. The NeQuick_GRID model is more accurate than the original NeQuick in describing ionospheric electron density.
Besides utilizing COSMIC-2 profiles for validation, we further assessed the two models with TEC measurements derived from space-borne GNSS observations. It should be noted that the space-borne TEC measurements were obtained after deducting the receiver and satellite differential code biases, which are estimated in advance by establishing the global topside ionospheric model (GTIM). Our earlier work offered a detailed processing approach (Ren et al. 2020a). Figure 7 gives the relative errors of the NeQuick and NeQuick_GRID models with respect to the space-borne TEC values from January 2021 to June 2021. As shown in Fig. 7, in most cases, the relative errors of the NeQuick_GRID model tend to be smaller than those of the original NeQuick model. There were also some cases where the NeQuick_GRID model performed worse. The poorer performance was primarily attributed to the fact that the linear approximation for topside scale height used in this study cannot be assumed to be valid for the entire topside ionosphere because the linear approximation holds until approximately 800-1200 km of altitude, depending on the conditions (Prol et al. 2019(Prol et al. , 2022Pignalberi et al. 2020b). The NeQuick_GRID model is driven by the COSMIC-2 satellite profiles, which can only be used to update the topside NeQuick model from the ionospheric peak height to the maximum height of the profiles. However, the referenced space-borne TEC measurements ranged from the orbital altitudes of LEO satellites to the orbital altitudes of GNSS satellites. On the other hand, the referenced spaceborne TEC values are also likely to be influenced by the errors of the global topside ionospheric model. As a result, the improvements of the NeQuick_GRID model are likely

Analysis of grid maps in different months and solar activity
In this section, we focused on the variations of grid maps for both the H 0 and g parameters during the high and low solar activity in 2014 and 2021, respectively. The grid models of g and H 0 on (hmF2, foF2) for six months in 2021 are presented in Figs. 8 and 9, respectively. that the g parameter closely correlated with the hmF2 and foF2 parameters. On the contrary, the larger values of the H 0 parameter were mostly found in the top right corner, while the smaller ones were mostly found on the left side, as shown in Fig. 9. It can also be seen that the H 0 parameter behaved quite similarly throughout the year 2021. It indicated that the H 0 parameter is also closely related to the hmF2 and foF2 parameters.
The grid models of the H 0 and g parameters during the low solar activity in 2021 are investigated. As shown in Figs. 10 and 11, there are more non-value grids in the maps of the H 0 and g parameters in 2014 compared to those in 2021, which could be attributable to the limited availability of COSMIC-1 satellites in 2014. In addition, in the stronger solar activity year (2014), the values of H 0 and g are generally larger than those in the weaker solar activity year (2021), and since the simplified expression for H is: When z > 1, Ne top (z) is monotonically decreasing about z. That is, the electron density calculated by the NeQuick_GRID model will become larger in years with stronger solar activity. This phenomenon coincides with the phenomenon of increasing electron density due to enhanced solar activity. This difference will lead to the fact that the electron density obtained from NeQuick_GRID calculations will be larger in years with high solar activity, even if the input ionospheric parameters such as NmF2, hmF2, and foF2 are the same. However, since the grids of H 0 and g for different years are derived from the actual measured data, this overestimation should be closer to the actual situation of the measured data compared to the original NeQuick model (as shown in Fig. 6). In Fig. 10, the grid maps in 2014 vary differently, while the grid maps of the H 0 parameter vary similarly. As for Fig. 11, the smaller H 0 values were found on the top, while the larger ones are found mostly on the middle and right sides.
During years of both low and high solar activity, the H 0 and g parameters appear to be extremely related to the hmF2 and foF2 parameters. Furthermore, the general behavior of the H 0 and g grid maps displays certain different characteristics. As can be seen in Figs. 8, 10 and 11, the H 0 and g grid maps seem to have larger differences between years and smaller differences in different months in a single year, so the grid maps of the H 0 and g can be generated monthly for a NeQuick user. If there is enough data, the maps can also be built daily, which is a better choice. In the future, as the radio occultation data increases, these maps could be more accurate and finer and cover a wider range.

Conclusion
This research aims to improve the NeQuick topside model by reformulating the scale height H calculation, which is simplified into a linear function of the two key factors, namely the scale factor H 0 and the gradient parameter g. In addition, both the H 0 and g parameters are modeled as functions of the hmF2 and foF2 parameters. We also used COSMIC-2 data to build the H 0 and g grid maps. One can interpolate the corrected H 0 and the g parameters from the grid maps for updating the topside formulation when the hmF2 and NmF2 values are obtained in advance.
The performance of the improved NeQuick model is validated by the referenced TEC values that are derived from the COSMIC-2 profiles and those derived from the space-borne GNSS observations, respectively. The results show that the improved NeQuick formulation significantly improves the topside ionospheric representation. Compared with the referenced TEC values from the COSMIC-2 profiles, the RMS values of the NeQuick_GRID model were almost always smaller than those of the original NeQuick model. And the RMS values of the original NeQuick model can be reduced by 38.50% from 1.74 TECu to 1.07 TECu in 2021 and by 32.58% from 1.78 TECu to 1.20 TECu in 2020. Compared with the space-borne TEC values, the NeQuick_GRID model performed better than the NeQuick model in most cases. There were also some cases where the NeQuick_GRID model performed worse. The reason is that the linear approximation for the topside scale height used in this study cannot be assumed to be valid for the entire topside ionosphere. It indicated that the linear approximation might not be suitable for altitudes higher than 800-1200 km.
Finally, we investigated the variations of grid maps for the H 0 and g parameters during both the high and low years of solar activity in 2014 and 2021, respectively. In general, the grid model of the g parameter showed very similar variations in different months of the same year. The g parameter had a close correlation with the hmF2 and foF2 parameters. Moreover, the overall behaviors of H 0 and g grid maps reveal different characteristics during different activities. Acknowledgements This work was funded by the National Natural Science Foundation of China (NO.42174031, NO.41904026, NO.42230104) and supported by the Fundamental Research Funds for the Central Universities. The numerical calculations have been done on the supercomputing system in the Supercomputing Center of Wuhan University, Wuhan, China. The source code of the NeQuick model is available from https://t-ict4d.ictp.it/nequick2. Author contributions XR and XZ designed the research; XR, YL and DM performed the research and wrote the paper; XR, YL and DM analyzed the data; WZ and XZ also contributed to the data analysis; XZ gave helpful discussions on additional analyses and result interpretation.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/.