Multi-GNSS global ionosphere modeling enhanced by virtual observation stations based on IRI-2016 model

The inhomogeneous distribution of Global Navigation Satellite System (GNSS) stations results in inaccurate vertical total electron contents (VTECs) in global ionosphere maps (GIMs) over areas with large GNSS data gaps. Incorporating VTECs from the International Reference Ionosphere (IRI) model is usually adopted as one approach to mitigate the inaccurate VTECs. However, large and complicated spatiotemporal varying VTEC biases between GNSS and IRI suggest a robust strategy to optimally combine GNSS and IRI VTECs for operational high-precision modeling. Here, we thoroughly analyze the characteristics of VTEC biases between GNSS and IRI-2016 model in different latitudes from 2009 to 2019, and develop an improved functional and stochastic model. An automated assimilation strategy of GNSS and IRI-2016 VTECs is proposed for Shanghai Astronomical Observatory final GIM (SHAG) routine estimation, and the reliability of GIMs in areas with lack of stations is enhanced by attaching Virtual Observation Stations (VOSs) based on IRI-2016 model and VOS bias parameters. Experimental results show that the root-mean-square errors (RMSEs) of SHAG with respect to VTECs retrieved from four independent GNSS assessment stations are reduced by 21.65–53.06% in the large data gaps with the assistance of VOSs. Furthermore, we validated the long-term reliability of SHAG spanned one solar cycle (2009–2019) with International GNSS Service (IGS) final GIMs and satellite altimetry VTECs. Validation results suggest that SHAG is in good agreement with IGS final GIMs, and reliability of SHAG in large GNSS data gap areas is significantly improved by attaching VOSs and biases. This methodology also represents an efficient tool for automated global ionospheric modeling integrating multi-source data.


Introduction
The terrestrial ionosphere is a large collection of free electrons and ions distributed at an altitude of about 60-1500 km above the Earth's surface (Bilitza et al. 2011;Schaer 1999). The dispersive nature of the ionosphere causes the phase advance and the group delay of radio waves. Similarly, total electron content (TEC) or electron density along the propagation path of radio wave is able to be inverted by using Global Navigation Satellite System (GNSS) dual-frequency observations (Alizadeh et al. 2011;Lanyi and Roth 1988 Shandong Jianzhu University, Shandong, China using multi-system and multi-frequency observations provided by GNSS, high-precision global ionospheric models are established (Calabia and Jin 2020;Goss et al. 2019;Hernandez-Pajares et al. 2009;Li et al. 2020;Schaer 1999). In this way, GNSS has been one of the main technologies for modeling, monitoring and studying the ionosphere (Calabia et al. 2022;Feng et al. 2019;Mannucci et al. 1998).
Although the spherical harmonic expansion has been widely adopted in ionospheric modeling, it is not optimal for representing observations of heterogeneous density and quality (Schmidt et al. , 2006. Since the ground-based GNSS stations are inhomogeneously distributed around the world and there are large data gaps in areas such as parts of the ocean, desert and polar region, the reliability of GIMs estimated with GNSS observations is significantly reduced. There are oversized and negative vertical TECs (VTECs) estimated with function-based models and flat VTECs estimated with grid-based models Hernandez-Pajares et al. 2002;Zhang et al. 2013). Several solutions have been proposed to mitigate the inaccurate VTECs in GIMs, including the assimilation of climate models and empirical models such as the International Reference Ionosphere (IRI) model (Bilitza et al. 2017), the Bent model (Bent et al. 1975) and the NeQuick model (Hernandez-Pajares et al. 2002;Hochegger et al. 2000;Mannucci et al. 1998;Wang et al. 2016), attaching inequality constraint (Zhang et al. 2013) and integrating observations of the radio occultation, satellite altimetry and Doppler orbitography and radiopositioning integrated by satellite (DORIS) (Alizadeh et al. 2011;An et al. 2019;Li et al. 2014). However, the computing burdens and the quantity and timeliness of the measured observations restrict the effectiveness of the above methods.
The IRI model is a joint project sponsored by the Committee on Space Research (COSPAR) and the International Union of Radio Science (URSI) with the goal of developing and improving international standard for parameters in the Earth's ionosphere (Bilitza et al. 2011;Rawer et al. 1978). The IRI is the official standard for the ionosphere recognized by International Standardization Organization, URSI, COSPAR and the European Cooperation for Space Standardization, and has been one of the most effective ionosphere models and widely recognized and accepted (Bilitza 2018). Furthermore, the accuracy of the IRI-2016 model has improved by integrating reliable observations from a variety of technologies such as ionosonde, incoherent scatter radar, rocket, topside sounder, in situ satellite, GNSS and GNSS radio occultation. The IRI model is capable of providing reliable ionospheric parameters from the lower boundary to a user-specified upper boundary on a global scale. Therefore, the big gaps of global ground-based GNSS stations can be well filled with IRI VTECs, and the reliability of GIM has been significantly improved (Hernandez-Pajares et al. 2002;Wang et al. 2016). However, the calibration methods are still subject to in-depth study, while GNSS is only used for validation (Bilitza and Reinisch 2021). Wang et al. (2016) pointed out that there is no very good agreement between the IRI VTECs and the observed values, especially during geomagnetic storms. The VTEC biases between the IRI-2016 and IGS final GIM (IGSG) mainly range from -10 to 10 TEC unit (TECu), and the maximum is larger than 30 TECu over the equatorial ionization anomaly regions (Chen et al. 2020). Therefore, it is not appropriate to directly integrate the VTECs from IRI and GNSS in ionospheric modeling. Since IRI is an empirical model, the method of artificially attaching prior noise to IRI VTEC is commonly adopted for integrating the VTECs of IRI and GNSS Wang et al. 2016). However, since the biases between the IRI and GNSS are not very clear at present, and the ionosphere varies greatly in different times and spaces, it is difficult to accurately express the priori noises with a simple functional model.
In this study, in order to mitigate the inaccurate VTECs in GIMs caused by the inhomogeneous distribution of global GNSS stations, a method of automated assimilation of multi-GNSS and Virtual Observation Stations (VOSs) observations assisted by spatiotemporal varying parameters and priori constraints in global ionospheric modeling is proposed. This study consists of five parts. Section 2 introduces the algorithms of deriving and smoothing slant TECs (STECs) from GNSS observations, analyzes the functional and stochastic model of the VTEC biases between GNSS and IRI-2016, elaborates the strategies of establishing VOSs with the help of IRI-2016 model and generates biases for each VOS, and provides theories of spherical harmonic expansion and parameter estimation. Section 3 presents the effectiveness of attaching VOSs and biases by comparing the estimated Shanghai Astronomical Observatory final GIM (SHAG) with assessment stations, IGSG and the actual calculated VTEC biases between IGSG and IRI-2016. Section 4 validates the long-term reliability of SHAG spanned about one solar cycle with IGS final GIMs and of Jason-2/-3 VTECs. Section 5 summaries the study and draws conclusions.

Strategies of global ionosphere modeling
This section introduces the strategy of retrieving and smoothing TECs from GNSS observations, analyzes the functional and stochastic model of the GNSS and IRI-2016 VTEC biases, proposes the method of establishing VOS and introduces the spherical harmonic expansion and the least squares method used in SHAG estimation.

Deriving and smoothing STECs from GNSS observations
The pseudorange and carrier phase observations of GNSS on a specific frequency can be expressed as follows: where P s r and L s r are pseudorange and carrier phase observations between satellite s and receiver r ; ρ s r is the geometry distance; c is the speed of light; t s and t r are clock errors of satellite and receiver, respectively; I s r is the ionospheric delay of pseudorange and advance of carrier phase along line of sight (LOS); T s r is the tropospheric delay along LOS; B s and B r are pseudorange delays of satellite and receiver, respectively; b s and b r are carrier phase delays of satellite and receiver, respectively; N and λ denote the ambiguity and the wavelength of carrier phase, respectively; and ε P and ε L are other errors of pseudorange and carrier phase, such as observation noise and multi-path error.
Ignoring ionospheric higher-order terms and only preserving the first-order term, I s r can be expressed as follows (Bassiri and Hajj 1993): where e and m are the electron charge and mass; ε 0 is the permittivity of the free space; STEC s r is the TECs along LOS; and f is the frequency of carrier.
Neglecting relatively small errors, such as the multi-path error, observation noise and carrier phase delay of satellite and receiver, the dual-frequency geometry-free linear combinations of pseudorange and carrier phase are expressed as follows (Lanyi and Roth 1988): where P4 s r , i and L4 s r , i are geometry-free linear combined values of pseudorange and carrier phase at epoch of i; L A and L B correspond to two frequency points of GNSS; and DCB s and DCB r are the differential code biases (DCBs) of satellites and receivers.
Because P4 s r contains relatively large observation noise and multi-path error, the carrier phase to code leveling (CCL) method (Mannucci et al. 1998(Mannucci et al. , 1993 is used to reduce the effects of noise and multi-path errors: where L4 s r is the average of a continuous arc which contains the inter-frequency satellite-plus-receiver bias (SPR) and ambiguity information; n represents the epochs of a continuous arc; SPR s r is the sum of satellite DCB and receiver DCB; and L4 s r , i is the smoothed geometry-free linear combined value at epoch i. Note that the effect of smoothing largely depends on the length of a continuous arc (Wang et al. 2015). Hence, only the GNSS satellites arcs with more than 30 min of continuous observation time are employed in ionospheric modeling.
IGS provides global multi-system and multi-frequency GNSS observations from over 500 ground-based GNSS stations (Montenbruck et al. 2014). Compared with global ionospheric modeling with GPS or GPS/GLONASS observations adopted by most of IGS Ionosphere Associate Analysis Centers, better coverage of ionospheric pierce points (IPPs) is achieved by integrating observations of multi-GNSS such as GPS, GLONASS, Galileo, BEIDOU and QZSS, which means higher reliability of ionospheric model, especially for high-spatiotemporal-resolution ionospheric modeling (Ren et al. 2016). Therefore, all IGS stations and GPS, GLONASS, Galileo and BEIDOU observations are employed in this study. The station distribution of IGS network and the corresponding tracking GNSS systems (GPS, GLONASS, Galileo and BEIDOU) on April 10, 2021, are shown in Fig. 1. We see that there are large gaps in parts of the ocean, desert and polar region.

Establishing VOSs and attaching bias parameters
Ground-based GNSS stations are inhomogeneously distributed across the Earth's surface and there are large data gaps, which result in a significant reliability reduction of GIMs. We establish fourteen VOSs with the IRI-2016 model to cover the large data gaps (corresponding to VOS-01-VOS-14 in Fig. 2). The location and serial number of VOSs and the IPP distribution of GPS, GLONASS, Galileo, BEIDOU and VOS observations on April 10, 2021, are shown in Fig. 2. Note that VOSs are only set up in the regions with large data gaps, and it is not necessary to set VOSs in regions with relatively small gaps. By entering the parameters such as the latitude, longitude, time and upper height (20,000 km is adopted in this study) of VOS's IPP (shown as Table 1) into IRI-2016 model, the virtual observations of VOS are obtained.
In order to assimilate the GNSS and VOS observations in ionospheric modeling, the functional and stochastic model of the deviation between GNSS and IRI-2016 should be studied in advance. We selected seven small regions in South America where GNSS stations are densely distributed as the test regions, and we calculated the VTEC biases between GNSS and IRI-2016 for each test region. Seven test regions are located in longitudes of 60-70°W and latitudes of 70°S-0°w ith the intervals of 10°, corresponding to R01-R07 in Fig. 2. Since the IGSG is generated by ranking, weighting and combining the GIMs provided by CODE, ESA/ESOC, JPL and UPC, and provides high accuracy and reliability (Hernandez-Pajares et al. 2009, the GNSS VTECs in the seven regions were obtained from IGSG. The IRI-2016 predicted VTECs in the corresponding IGSG grid points were calculated by IRI-2016 model. The VTEC biases of seven test regions between GNSS and IRI VTECs from January 1, 2009, to January 1, 2020, are shown in Fig. 3. The VTEC biases of R-01 (in high latitudes) and R-07 (in low latitudes) during solar comparatively active periods (from January 1, 2014, to April 1, 2014) and quiet periods (from January 1, 2019, to April 1, 2019) are shown in Fig. 4. Figure 3 shows that the difference between GNSS and IRI-2016 VTECs during solar active periods is generally greater than that during solar quiet periods, and the difference in low  latitudes is greater than that in high latitudes. During solar quiet periods, the VTEC biases vary from − 20 to 20 TECu, and mainly range from − 5 to 5 TECu for regions located in high latitudes and from − 10 to 10 TECu for regions located in low latitudes. During solar active periods, the VTEC biases vary from − 30 to 60 TECu, and mainly range from − 10 to 20 TECu for regions located in high latitudes and from − 20 to 30 TECu for regions located in low latitudes. Considering that the reliability of IRI-2016 VTECs is comparatively lower than that of GNSS VTECs and the spatiotemporal distribution of these two kinds of observations have intersection, to mitigate the influence of IRI-2016 VTECs on GNSS VTECs when integrating these two kinds of observations in ionospheric modeling, the priori noises of IRI-2016 VTECs are set to different values according to the geographic latitudes and solar activity levels. The solar radio flux at 10.7 cm index (F10.7) is adopted to characterize solar activity levels. Referring to Fig. 3 and the F10.7 index (ftp://ftp.seismo.nrcan.gc.  Table 2. The priori noise of GNSS pseudorange observation is set to 0.3 m, its weight is calculated by the elevation of satellite, and then, the priori noise of CCL values are determined according to the variance-covariance propagation law (Wang et al. 2016).
As the biases between GNSS and IRI-2016 VTECs are relatively large and present significant diurnal variation under different circumstances (shown as Fig. 4), it is inappropriate to integrate GNSS and IRI-2016 VTECs in ionospheric modeling only with prior noises, and it is difficult to describe the biases by a simple functional model. Because the ionospheric activity is generally assumed to be static over a short time span in ionospheric modeling, a piecewise constant is introduced to VOSs to further assume the VTEC bias between VOSs and GNSS as a constant in a small region and over a short period. Therefore, on the basis of adding priori noise for GNSS and VOS observations, the spatiotemporal varying biases are attached to each VOS with a 1-h interval to automatically absorb the VTEC biases between VOSs and GNSS: where i is the serial number of VOS, t denotes the time, VTEC VOS represents the VTEC of VOS, VTEC GNSS is the VTEC derived from GNSS and Bias VOS is the VOS bias parameter.

Methods of global ionosphere modeling and parameter estimation
The spherical harmonic expansion is well suited to express the spatiotemporal distribution and variation of the ionosphere on a global scale (Li et al. 2014;Schaer 1999;Zhang and Zhao 2019). In this study, we employ the single-layer model (SLM) to parameterize the VTECs of the ionosphere (Mannucci et al. 1998), which assumes all free electrons are in a shell of infinitesimal thickness at a fixed altitude. The modified single-layer model (MSLM) mapping function is adopted to convert the STEC derived from GNSS shown in Eq. (4) to VTEC (Schaer 1999). The two-dimensional spherical harmonic expansion, the mapping function and the VTEC of VOS can be expressed as follows: where β and γ are the geomagnetic latitude and sun-fixed longitude of IPP, respectively; n max is the maximum of degree and is set to 15;P nm is the normalized associated Legendre function of degree n and order m;C nm andS nm are the ionospheric model coefficients to be estimated; MF is the mapping function; z and z are the corrected and uncorrected satellite zenith at a GNSS station, respectively; α is the correction factor set to 0.9782; R is the mean radius of the Earth set to 6378 km; H is the height of MSLM set to 450 km; s denotes the serial number of VOS; t represents the time; and Bias VOS are estimated at 1-h interval to absorb the VTEC biases between GNSS and IRI-2016.
Since the ionospheric information derived from GNSS contains the satellite-plus-receiver biases, the biases are simultaneously estimated with the ionospheric model coefficients. To separate the satellite-plus-receiver biases into the DCBs of satellites and receivers, the zero-mean of satellite DCBs constraint condition (Schaer 1999;Wang et al. 2015) is imposed to each GNSS constellation: where the N denotes the number of satellites for the specific GNSS constellation and S stands for the GNSS constellation, such as GPS, GLONASS, Galileo and BEIDOU.
The least squares method is adopted to solve the parameters to be estimated. The solutions of the parameter vector X and the covariance matrix DXX are as follows: T is the parameter vector to be estimated, among which the I O N consists of the ionospheric model coefficientsC nm andS nm , and it is estimated with a 1-h interval; S P R and V B are the parameter vectors of SPRs and VOS biases, and these are estimated at 1-day and 1-h intervals, respectively; B denotes the design matrix, the design matrix of GNSS station is obtained from Eqs. (4) to Eq. (6), and the design matrix of VOS is obtained from Eq. (6); P is the weight matrix of observations; L denotes the vector of actual observations; B T P B is the normal equation matrix and B T P L is the right hand side of the normal equation system; the B T P B and B T P L are established for each GNSS and VOS station; V represents the vector of the observation corrections;σ 0 is the standard deviation of the unit weight; n is the number of observations; t is the number of parameters to be estimated; DXX is the covariance matrix ofX . After imposing the zero-mean of satellite DCBs constraint condition to SPRs, the DCBs of each satellite and receiver are obtained. The piecewise linear (PWL) is introduced to interpolate consecutive sets of I O N (Schaer 1999). The GIM in the IONosphere map EXchange format (IONEX) is generated when the VTECs and the corresponding RMSEs on specific grid points are calculated with the spatiotemporal resolution of 1-h × 2.5°× 5.0°in time, latitude and longitude, respectively ).

Validation of VOSs and estimated biases
In this section, the effectiveness of attaching VOS in ionospheric modeling are validated with the VTECs of independent GNSS stations and IGSG. The estimated VOS biases are compared with the calculated biases between IGSG and IRI-2016 to verify the validity of adding VOS bias parameters.

Validation of VOSs with assessment stations
In order to validate the effectiveness of VOS and the performance of different intervals of VOS biases during different solar activity periods, two kinds of strategies are adopted for estimating the SHAG on the solar active day (April 10, 2014) and the solar quiet day (April 10, 2017). One was to only use ground-based GNSS observations (without VOS), the other was to attach VOSs and biases in the regions of large data gaps. The bias intervals of VOS are set to 1, 2, 4, 6, 8, 12 and 24 h.
The VTECs retrieved from independent assessment GNSS stations are compared with the SHAG estimated according to the above strategies to verify how much improvement VOSs will bring to the large gaps between the IGS stations, find the optimal interval of bias and validate whether VOSs have negative effects on the regions where the GNSS stations are homogeneously distributed. The distribution of modeling and assessment stations on April 10, 2017, are shown in Fig. 5. The modeling stations were selected from IGS network, and the assessment stations were obtained from University Navstar Consortium (UNAVCO) and Système d'observation du Niveau des aux Littorales (SONEL) networks. Note that ISPA (IGS station) and ILSG were unavailable since January 10, 2019, and December 12, 2017, respectively, and there were no other stations on that regions. Hence, ISPA and ILSG were used as assessment stations to evaluate the performance of VOSs in low-latitude areas. The STECs retrieved from assessment stations can be obtained from Eq. (4) and is expressed as follows: The DCBs of the satellites and assessment stations were estimated together with the modeling stations and different bias intervals (including without VOS) to reduce the influence of DCB on the derived STEC. The STECs of assessment stations with different bias intervals are extracted from Eq. (9) with the corresponding DCBs. The VTECs of SHAG corresponding to the extracted STECs were obtained by interpolating VTEC maps ). After converting the extracted STECs to VTECs with the mapping function in Eq. (6), the VTECs of assessment stations were used to evaluate the effectiveness of attaching VOSs. The bias, the mean value, the STD and RMSE of VTEC in the study are calculated as follows: where the VTEC bias , VTEC MEAN , VTEC STD and VTEC RMS represent the bias, the mean value, the standard deviation (STD) and the RMSE of VTEC, respectively, VTEC is the modeled value, VTEC ref is the reference value of VTEC and n denotes the number of VTEC. Four independent assessment stations [corresponding to Assessment Station (a) in Fig. 5] located in the sparse areas of GNSS stations were selected as references to validate the VOSs and biases, among which the ILSG and ISPA are located at low latitudes, the KEPA is located at middle latitudes and the THUR is located at high latitudes. The VTEC RMSEs of SHAG without VOSs and with VOSs and different bias intervals during solar active and quiet periods with respect to assessment stations are listed in Table 3. We can see that the VTEC RMSEs of assessment stations with VOSs and biases are all significantly better than those without VOS, especially for those station located in low latitudes (ILSG and ISPA) and during the solar active periods. Compared with the SHAG without VOS, the RMSEs of SHAG with VOSs and biases (1-h interval) are reduced by 2.28-4.68 TECu and 0.93-1.83 TECu, corresponding to 21.65-51.61% and 27.74-53.06% during the solar active and quiet periods, respectively.
For SAHG estimated with different bias intervals, the VTEC RMSEs generally decrease with the shortening of bias intervals, and the improvement in reliability of SHAG is particularly noticeable at low latitudes and during solar active periods, with a maximum reduction of 0.95 TECu for SHAG estimated with 1-h interval relative to 24-h interval. However, in middle and high latitudes and during solar quiet periods, the discrepancy in SHAG estimated with different bias intervals is not significant. The RMSEs of SHAG estimated with 1-h interval are occasionally higher than those estimated with higher intervals, but the maximum difference is only 0.21 TECu at THUR during solar active periods. Based on the above analysis, we can see that attaching VOSs and biases in the large gaps noticeably enhance the reliability of SHAG in the large data gap areas, and it is optimal to set bias interval to 1 h.
Four validation stations [corresponding to Assessment Station (b) in Fig. 5] were selected to verify whether VOSs have negative effects on the areas where GNSS stations are homogeneously distributed, among which the APNT and TBTG are located at low latitudes and close to VOSs, the BELE is located at low latitudes and far away from VOSs, and the OSLS is located at middle and high latitudes and far away from VOSs. The VTEC RMSEs of four assessment stations with respect to SHAG with VOSs and without VOSs during solar active and quiet periods are listed in Table 3. We can see that the assessment stations close to VOSs (APNT and TBTG) are slightly affected by VOSs, with the maximum RMSE discrepancy of 0.19 TECu, while the assessment stations far away from VOSs (BELE and OSLS) are not affected by VOSs.
By validating SHAG with independent assessment stations, it can be seen that attaching VOSs are capable of enhancing the reliability of SHAG in the large GNSS data gaps, and attaching bias with 1-h interval to each VOS evidently improves the consistency of SHAG with respect to the retrieved STECs of assessment stations. Moreover, attaching VOSs and biases on the basis of GNSS has no effect on modeling in the regions where the GNSS stations are homogeneously distributed.

Validation of VOSs and biases with IGSG
In order to verify whether attaching VOSs and biases improves the reliability of SHAG, the SHAG without VOS and with VOSs and biases (1-h interval) on a solar relatively active periods (April 10, 2014) and on a solar quiet periods (April 10, 2021) at 22:00 UTC was estimated and compared with IGSG. Figure 6 shows the VTEC maps of SHAG at 22:00 UTC without VOS and with VOSs during different periods of solar activity, and the corresponding VTEC bias maps with respect to IGSG are also depicted. It can be seen that due to the lack of observations, the SHAG without VOS has comparatively large biases with respect to IGSG in the large data gaps, with the maximum biases of 191.20 in the solar active periods and 54.30 TECu in the solar quiet periods, and a wide range of inaccurate VTECs appear in the large data gaps. After attaching VOSs and VOS biases, the inaccurate VTECs in the large data gaps are effectively mitigated. The VTEC morphology of the ionosphere in the equatorial ionization anomaly regions with VOSs and biases (see Fig. 6c, g) is more explicit than that without VOS (see Fig. 6a

Comparison of bias values and the calculated VTEC biases
In order to validate whether attaching VOS biases are capable of absorbing the VTEC biases, the VTEC biases between IGSG and IRI-2016 were calculated and compared to the bias values estimated with 1-h interval. Since the VTEC map interval of IGSG is 2 h, the bias values were collected at the same interval. Figure 7 shows the bias values in the large GNSS data gap region of VOS-03 in 2014 (solar active year) 2017 (solar quiet year) and the calculated VTEC biases between IGSG and IRI-2016 in the corresponding region. It can be seen that the trend of bias values and calculated VTEC biases are quite consistent during both solar active and quiet periods, the magnitudes of them both mainly range from − 10 to 20 TECu in 2014 and from − 5 to 5 TECu in 2017. Moreover, from the enlarged portions of Fig. 7 we can see that bias values show obvious diurnal variation, which is consistent with the calculated VTEC biases. We compared the VTEC values of estimated SHAG (with VOSs and biases), IGSG and IRI-2016 in the large GNSS data gap region corresponding to VOS-05 on a solar relatively active day (April 10, 2014) and a quiet day (April 10, 2017). Note that a GNSS station (ISPA) is located in this region, which is used in IGSG estimation and not used in SHAG estimation, and thus, IGSG is used as a reference.  Table 4. We can see that with the help of attaching biases, SHAG is more consistent with IGSG than IRI-2016. The mean VTEC deviations of SHAG with respect to IGSG are 3.72 and 1.14 TECu in the solar active and quiet days, while those of IRI-2016 with respect to IGSG are 5.57 and 3.21 TECu, respectively. From the above analysis, it can be concluded that the VOS biases are capable of absorbing the VTEC biases between GNSS and IRI-2016.

Validation of SHAG
This section validates the long-term reliability of SHAG with IGS final GIMs and Jason-2/-3 VTECs, and compares the performance of SHAG without VOS and with VOS.

IGS final GIM validation
We ESA/ESOC (ESAG), NRCan (EMRG), CAS (CASG) and WHU (WHUG). Figure 9 shows the diurnal mean VTEC series of all GIMs and diurnal mean VTEC biases and VTEC RMSEs of GIMs with respect to IGSG from 2009 to 2019. It can be seen that although different model and strategies were adopted to generate GIMs, the diurnal mean VTECs of GIMs are consistent and are able to respond to different solar activities. The diurnal mean VTEC biases and RMSEs of SHAG with respect to IGSG are about -2.39-0.78 TECu and 1.09-4.99 TECu, respectively, which are comparable to other GIMs. As the ionospheric activities vary significantly at different latitudes, in order to further verify the reliability of SHAG at different latitudinal bands, we divided the globe from 90°N to 90°S into 6 latitudinal bands spaced 30°apart and calculated the VTEC biases and RMSEs of GIMs (excluding EMRG due to incomplete data) with respect to IGSG at 6 latitudinal bands and global scales from 2009 to 2019, which are summarized in Tables 5 and 6, respectively. Table 5 shows that there is almost no VTEC bias between UPCG and IGSG on the global scale, but the biases vary greatly at different latitudinal bands, with a maximum difference of 2.53 TECu. The VTEC biases of SHAG with respect to IGSG are generally lower than other GIMs, especially in latitudinal bands of 30°N-0°, 0°-30°S and 30°S-60°S, which correspond to the main distribution regions of VOSs, indicating that VOS improves the consistency of SHAG and IGSG to some extent. As illustrated in Table 6, the VTEC RMSEs of GIMs fluctuates greatly at different latitudinal bands, with the maximum variation ranging from 1.38 to 3.48 TECu (ESAG), and the VTEC RMSE is usually better in the northern hemisphere than that in the southern hemisphere and is generally better in high latitudes than that in low latitudes. Compared with other GIMs, CODG has the best performance in VTEC RMSEs at both at different latitudinal bands and global scale. The VTEC RMSE of SHAG ranges from 1.57 to 2.90 TECu, which is in good agreement with the other GIMs.

Jason-2/-3 VTEC validation
The dual-frequency (Ku-/C-band) observations of satellite altimetry such as TOPEX, Jason series can be used to estimate the VTEC from the Earth's surface (mainly over the surface of ocean and ice) to the orbit altitude of the satellite (Imel 1994). Comparing to TEC from GNSS, satellite altimetry TEC provides an independent reference in the ocean, especially in large GNSS data gaps, and has been widely adopted to verify the reliability of GIM (Hernández-Pajares et al. 1999;Hernandez-Pajares et al. 2017;Mannucci et al. 1998;Roma-Dollase et al. 2017). The Level-2 high-precision geophysical data records (GDRs) of Jason-2 and Jason-3 are adopted to extract VTECs and validate SHAG. To obtain high-precision VTECs from Jason-2 and Jason-3 observations, the observations reflections on the ice, isolated and jump points are rejected, the observations are smoothed with a sliding window of n = 16 consecutive samples, and then, we convert the vertical phase ionospheric delay in Ku band (I ku , for f ku = 13.575 GHz) to the VTEC of satellite altimetry (Roma-Dollase et al. 2017). Jason-2 and Jason-3 have the ground track with a 254 pass and 10-day cycle, and the distribution of IPPs and VTEC values of Jason-3 in period 143 (from December 26, 2019, to January 5, 2020) obtained by this strategy are shown in Fig. 10.
After obtaining the VTEC from satellite altimetry, the VTEC from GIM is calculated by interpolating TEC maps . The VTEC biases, STDs, and RMSEs of GIMs (including the SHAG without VOS) with respect to Jason-2 (from 2009 to 2019) and Jason-3 (from 2016 to 2019) were calculated using Eq. (10). The biases of GIMs with respect to Jason-2 and Jason-3 VTECs are shown in Fig. 11, and the STDs and RMSEs are summarized in Table  7.
Due to the different orbit altitudes of GNSS satellites (about 20,000 km) and Jason satellites (about 1336 km), there is a bias of a few TECu between VTECs inverted by GNSS and Jason satellites (Jee et al. 2010) and the biases are relatively large during solar active periods and small during quiet periods, which can been clearly seen in Fig. 11. Therefore, the STD, which reflects the stability of bias, is adopted to validate the reliability of GIMs. We can see that the STDs of IGS GIMs with respect to Jason-2 and Jason-3 VTECs range from 3.70 to 4.68 TECu and from 2.56 to 3.19 TECu, respectively, among which IGSG and UPCG perform better, and those of SHAG are 4.15 and 2.82 TECu, which are comparable to IGS GIMs. However, the biases of SHAG without the assistance of VOS are relatively less stable, especially during solar active periods, with the STDs of 4.66 and 3.03 TECu with respect to VTECs of Jason-2 and Jason-3, respectively, and the RMSE of SHAG with respect to Jason-2 VTEC is 4.68 TECu, which is also larger than that of SHAG with VOSs and most of IGS GIMs such as the IGSG, UPCG, WHUG, CASG and CODG.
In order to validate the reliability of SHAG in the large GNSS data gaps, we compared the GIMs in specific regions corresponding to the VOS-01, VOS-02 and VOS-03 with respect to Jason VTECs, and the biases of GIMs are shown in Fig. 12 and the STDs and RMSEs are summarized in Table 8. It can be seen that the bias of SHAG without VOS in the large GNSS data gaps is evidently larger than that of other GIMs, and the biases exceed to 20 TECu during solar active periods. The STDs of SHAG without VOS with respect to Jason-2 and Jason-3 VTECs are 6.84 and 3.77 TECu, respectively, which are significantly worse than those of SHAG with VOSs and IGS GIMs. By attaching VOSs and VOS biases in SHAG estimation, the STDs of SHAG with respect to Jason-2 and Jason-3 are reduced to 3.27 and 2.30 TECu, respectively, and the RMSEs of SHAG are also dramatically reduced, indicating that VOSs enhance the reliability of SHAG to the same level as that of IGS GIMs.

Conclusions
For the purpose of mitigating the inaccurate VTECs in GIMs caused by the inhomogeneous distribution of GNSS stations, we proposed a method of attaching VOSs based on IRI-2016 model in global ionospheric estimation using GPS, GLONASS, Galileo and BIEDOU observations. Since there are relatively large and complicated spatiotemporal varying VTEC biases between GNSS and IRI VTECs, which makes it inappropriate to directly assimilate these two kinds of VTECs, we thoroughly analyze the characteristics of VTEC biases between GNSS and IRI-2016 in different latitudes from 2009 to 2019, and develop an improved functional and stochastic model of VTEC bias. The priori noises of IRI-2016 VTECs are set from 5 to 30 TECu depending on the geographic latitude of IPP and solar flux. The spatiotemporal varying biases, which are estimated as piecewise constants with an interval of 1 h, are attached to VOSs to mitigate the VTEC discrepancy between GNSS and IRI-2016. The independent assessment GNSS stations, IGSG and the calculated biases between IGSG and IRI-2016 were adopted    The IGS final GIMs and satellite altimetry VTECs of approximately one solar cycle (2009-2019) were used as the references to verify the long-term reliability of SHAG. Validation results indicate that: 1. The diurnal mean VTEC biases and RMSEs of SHAG with respect to IGSG are about − 2.39 to 0.78 TECu and 1.09-4.99 TECu, respectively, which are comparable to those of IGS GIMs on global scales. Moreover, the VTEC biases of SHAG with respect to IGSG are generally lower than that of other IGS GIMs in latitudinal bands of 30°N-0°, 0°-30°S and 30°S-60°S, which correspond to the main distribution regions of VOSs.
2. The STDs of IGS GIMs with respect to Jason-2 VTECs from 2009 to 2019 and Jason-3 VTECs from 2016 to 2019 range from 3.70 to 4.68 TECu and from 2.56 to 3.19 TECu, respectively, and those of SHAG are 4.15 and 2.82 TECu, respectively, which implies that the reliability of SHAG is comparable to that of IGS final GIMs. In particular, the STDs of SHAG without VOS with respect to Jason-2 and Jason-3 VTECs in the large data gaps are 6.84 and 3.77 TECu, respectively, which are evidently worse than those of IGS GIMs, and the biases exceed to 20 TECu during solar active periods. With the help of attaching VOSs and biases, the STDs of SHAG with respect to Jason-2 and Jason-3 are reduced to 3.27 and 2.30 TECu, respectively, which means that the reliability of SHAG is enhanced to the same level as that of IGS GIMs.
From the above analysis, we conclude that VOSs are capable of compensating for the inhomogeneous distribution of ground-based GNSS stations in global ionospheric modeling, and VOS biases are capable of absorbing the VTEC biases between GNSS and IRI-2016. The inaccurate VTECs in SHAG are significantly mitigated, and SHAG is in good agreement with IGS final GIMs. Furthermore, the methodology proposed in this study represents an efficient tool for automated global ionospheric modeling integrating multisource data and is also applicable to solve the problem of abnormal results in routine estimation of ionospheric products when partial measured observations cannot be obtained on time. In the next step, we plan to integrate multi-source ionospheric observation data from the satellite altimetry, GNSS radio occultation and low earth orbit satellite into VOS to further improve the reliability of GIM in the large GNSS data gaps.