Statistical models to provide meaningful information to GNSS users in the presence of ionospheric scintillation

Ionospheric scintillation is one of the most challenging problems in Global Navigation Satellite Systems (GNSS) positioning and navigation. Scintillation occurrence can not only lead to an increase in the probability of losing the GNSS signal lock but also reduce the precision of the pseudorange and carrier phase measurements, thus leading to positioning accuracy degradation. Statistical models developed to estimate the probability of loss of lock and Geometric Dilution of Precision normalized 3D positioning errors as a function of scintillation levels are presented. The models were developed following the statistical approach of nonlinear regression on data recorded by Ionospheric Scintillation Monitoring Receivers operational at high and low latitudes. The validation of the probability of loss of lock models indicated average correlation coefficient values above 0.7 and average Root Mean Squared Error (RMSE) values below 0.35. The validation of the positioning error models indicated average RMSE values below 10 cm. The good performance of the developed models indicates that these can provide GNSS users with information on the satellite loss of lock probability and the error in the 3D position under scintillation.


Introduction
The earth's ionosphere is the single largest contributor to the Global Navigation Satellite System (GNSS) positioning error budget. In particular, a phenomenon known as Ionospheric scintillation, characterized by fluctuations in the GNSS signal amplitude and phase, may seriously disrupt satellite tracking and degrade system accuracy, reliability and integrity (Kintner et al. 2001). The day-to-day variability in scintillation occurrence and its dependence on local time, season, latitude, longitude, solar and geomagnetic activity are well known and discussed in Basu et al. (2002) and Kintner et al. (2007). The two global regions where scintillation occurs predominantly are the equatorial to low latitude regions, extending from about 20°N to 20°S geomagnetic latitudes, and the high latitude regions, extending from 65° to 90° geomagnetic latitudes. However, in these two regions, the processes governing the generation of scintillation causing plasma density irregularities are quite different, thereby leading to significant differences in the observed characteristics of the scintillation effects (Jiao and Morton 2015).
Ionospheric scintillation can significantly impact the GNSS receiver signal tracking performance. Rapid amplitude fluctuations associated with scintillation can cause the signal-to-noise ratio to drop below the receiver threshold, and when this depth of fading exceeds the fade margin of the receiver, signal loss and cycle slips occur (Akala et al. 2012). A measure of the amplitude scintillation is provided by the S4 index, defined as the standard deviation of the received signal power normalized to the average signal power. The rapid phase fluctuations associated with scintillation may cause the frequency Doppler shift in the signal carrier to exceed the receiver's Phase Locked Loop (PLL) bandwidth, and loss of phase lock may be observed (Humphreys et al. 2005). Phase scintillation is characterized by the SigmaPhi (σ φ ) index, defined as the standard deviation of the detrended carrier phase computed over intervals of 1, 3, 10, 30 and 60 s, based on 50 Hz measurements. Scintillation can increase the tracking jitter variance measured at the output of the PLL and the correlation between scintillation levels and phase tracking jitter at low and high latitudes can be represented by a quadratic fit as shown, respectively, in Sreeja et al. (2012) and Aquino and Sreeja (2013). Statistical models developed to estimate the standard deviation of the receiver PLL tracking jitter as a function of scintillation levels are presented in Vadakke Veettil et al. (2018a). Skone et al. (2001) reported that during strong scintillation the availability of carrier phase observations is limited due to loss of signal lock, with significant impact on positioning applications. Using data from equatorial and auroral regions, Doherty et al. (2003) demonstrated that scintillation can have an adverse effect on receiver performance, resulting in increased cycle slips and losses of lock. Additionally, even if receivers are able to maintain lock during moderate to strong scintillation conditions, the errors due to scintillation propagate to the GNSS measurements, leading to a degradation in positioning accuracy (Pi et al. 2017). Statistical models to estimate the 3D positioning errors as a function of scintillation levels are presented in Veettil et al. (2018b), which demonstrated that if nowcasted scintillation information is available, then the developed statistical models can be used to estimate the nowcasted 3D positioning errors on a global scale.
With the increasing reliance on GNSS for several modern life applications such as connected & autonomous vehicles, infrastructure monitoring, offshore operations, precision agriculture and mapping, precise ionospheric information can contribute to the understanding of the impact of significant ionospheric related events on our technology based society. In this context, the "Ionosphere Prediction Service" (IPS), a project funded by the European Commission (EC) under Horizon 2020, in the frame of the Galileo program, aimed to translate the information on the state of the ionosphere into GNSS user-devoted products through the design and development of an ionospheric prediction service prototype. Further details about the IPS project are available in Vadakke Veettil et al. (2019) and from the project web portal (https ://ips.gsc-europ a.eu).
Statistical models have been developed in the context of the IPS project to estimate the probability of loss of lock and Geometric Dilution of Precision (GDOP) normalized 3D positioning errors at high and low latitudes as a function of the scintillation levels. The development of these models for the first time based on long-term data representing scintillation levels actually observed in the different global regions is presented. The considerable size of the data set and the scintillation levels that it covers provide a large and representative enough sample to ensure the statistical robustness of the models, which are validated in the research. The next section describes the data and methodology adopted in the development of the models, followed by a section presenting the results and discussions. Finally, the conclusions are presented.

Data and methodology
The data collected by Ionospheric Scintillation Monitoring Receivers (ISMRs) in operation at the high and low latitudes during the years of 2012-2015 have been exploited to develop the statistical models to estimate the probability of loss of lock on the Global Positioning System (GPS) L1 frequency and the GDOP normalized 3D positioning errors. Following a statistical approach of nonlinear regression (Seber and Wild 2003), models have been developed incorporating the dependence on scintillation levels, as represented by the scintillation indices and the root mean square of the Rate of change of slant Total Electron Content (STEC) index (ROT rms ). The ROT rms is a commonly used measure of ionospheric activity characterizing rapid variations of TEC and is strongly related to scintillation (Pi et al. 1997;Basu et al. 1999). Two types of ISMRs, namely the Novatel GSV4004B and the Septentrio PolaRxS, and two types of geodetic receivers, namely Topcon NET-G3A and Trimble NETR9 have been used in this study. Table 1 shows the geographic latitude and longitude of the stations along with the geomagnetic latitude and the GNSS receiver type. The number of days during the years of 2012-2015 used in the development of the probability of loss of lock and positioning error models at high and low latitudes is shown in Table 2. The geographic location of the stations is shown in Fig. 1. The two types of ISMR taking part in this study use similar algorithms to provide the scintillation indices, namely S4 and σ φ (Sreeja et al. 2011). The one-minute S4 and σ φ indices recorded on the GPS L1 frequency per satellite-receiver link are used in this study. Both types of ISMR also record the uncalibrated STEC and the differential TEC (dTEC) values at every 15 s per satellite-receiver link (Van Dierendonck et al. 1993; Septentrio PolaRxS application manual 2010). The uncalibrated STEC is referred to as the STEC value estimated without taking into account receiver and satellite differential code biases. The dTEC values refer to the change in TEC over the four 15 s intervals during the last minute estimated using the phase observations of the GPS legacy signals L1C/A and L2P. The ROT rms per satellite-receiver link used in this study is computed by calculating the root mean square (rms) of the four dTEC values provided by the ISMR over each minute so that it matches the time resolution of the scintillation indices. It is to be noted that inconsistencies exist when ROT rms is estimated from multiple GNSS receiver types or different signal combinations of a single receiver (Yang and Liu 2017). The development of the models based on ROT rms is intended to support the estimation of the probability of loss of lock and positioning errors over regions where the scintillation indices or the ISMR data are unavailable. This is because the ROT rms can be computed from uncalibrated STEC values that can be estimated using data from any dual-frequency GNSS receiver. The differencing of the STEC values over time to estimate the ROT will remove any biases in the STEC measurements. The scintillation levels, as characterized by the scintillation indices and ROT rms , have been estimated only for satellites with elevation angle greater than 20° in order to remove the contribution from non-scintillation related errors such as multipath. The ranges of scintillation indices and ROT rms values used in the model development are 0-1 and 0-5, respectively. Figures 2 and 3 show the temporal variations in the ROT rms (top), σ φ (middle) and S4 (bottom) recorded by the 6 ISMRs used in this study on strong scintillation days, respectively, at the high and low latitudes. A good correlation between the scintillation indices and ROT rms can be observed from Figs. 2 and 3. During the scintillation periods, the ROT rms increases to values greater than 1 TECU/ min. The enhancements in ROT rms closely follow those of the scintillation indices, confirming that ROT rms could be used to represent the presence of ionospheric irregularities leading to the occurrence of scintillation. It can be observed from Figs. 2 and 3 that the ROT rms is a good indicator of phase scintillation at high latitudes, and of both amplitude and phase scintillation at low latitudes, a feature in agreement with that presented in Pi et al. (2013). These observed differences in correlation can be partially attributed to the physical processes causing plasma density irregularities leading to the generation of scintillation in the different latitude regions (Jiao and Morton 2015). The loss of lock events defined in this study was identified for each satellite-receiver link from the L1 lock time output by the ISMRs at every one-minute, indicating for how long the receiver has been successfully tracking the carrier phase on the L1 signal. A loss of lock was considered detected when the lock time was less than 60 s for visible satellites with elevation angle greater than 20°. In the development of the probability of loss of lock models, the entire dataset at each high and low latitude station was classified considering two subsets: (1) samples for which a loss of lock occurred within the 1 min window and (2) samples which are more than 4 min away from the next loss of lock event. The second subset is defined to account for the 4 min required by the receiver to settle after a loss of lock event (Van Dierendonck et al. 1993). These subsets were binned according to the scintillation levels, with a step size of 0.05 and 0.25 for the scintillation indices and ROT rms , respectively. The probability of loss of lock in each bin was computed as: where N lol is the number of samples in subset 1, and N is the total number of samples considering both subsets 1 and 2. In order to remove the contribution of bins with poor statistics, the probability for a bin was not calculated if the total number of samples was fewer than 8 in that bin. The distribution of the scintillation level values covered by the four year long data (2012)(2013)(2014)(2015) used in the model development is shown in Table 2 in Vadakke Veettil et al. (2018a). It is clear that the number of samples decreases with the increase in the scintillation levels and is 5 when ROT rms is in the range of 4.5-5. This is the reason for applying a threshold of 8 samples to calculate the probability.
The GDOP normalized 3D positioning error models have been developed based on Precise Point Positioning (PPP) solutions using GPS data provided by the 'GPSPACE' software available from NRCan (National Resources Canada) as an online tool (Donahue et al. 2015). PPP is a GNSS carrier phase based high accuracy positioning technique utilizing undifferenced dual-frequency code and phase observations (Malys and Jensen 1990;Zumberge et al. 1997;Kouba and Héroux 2001). The GPSPACE applies the ionospheric free linear combination from dual-frequency data, thus eliminating the first order ionospheric error. The GPS data were processed in kinematic mode considering a satellite elevation mask of 10°, a priori standard deviations of 2 m and 0.015 m, respectively, for pseudorange and carrier phase observations, final precise orbits and clocks from the International GNSS Service (IGS) and the tropospheric zenith delay estimated as a random walk process (5 mm/h). The 3D positioning error at every minute after PPP convergence was computed as the difference of the estimated coordinates from the known station coordinates used as ground truth. The accuracy of a PPP solution is also influenced by the satellite-receiver geometry measured by the DOP parameter. The DOP at each epoch is related to the observational geometric configuration between the satellites and the receiver, such that the positioning accuracy may be significantly degraded when a small number of satellites is available. To remove the effects of the satellite-receiver geometry on the positioning solution, the calculated 3D positioning errors were normalized by the GDOP to develop the statistical models. The scintillation levels as represented by the scintillation indices and ROT rms were averaged over each one-minute interval to match the time resolution of the GDOP normalized 3D positioning errors.
(1) Pr ob lol = N lol (> 8)∕N The user at a particular location and time needs to multiply the model output by the instantaneous GDOP at that location and time to calculate the absolute 3D positioning error.
The validation of the developed statistical models was carried out by adopting a data sacrifice strategy whereby selected data covering scintillation and non-scintillation days during different seasons has been left out during the development phase of the models. The criteria for categorizing days as scintillation or non-scintillation for the high and low latitudes was based on the temporal variations of σ φ and S4, respectively. The following criteria was applied to categorize the day as scintillation: (1) as this study focuses on moderate to strong levels of scintillation, the threshold for S4 and σ φ was set as 0.3; (2) the satellite elevation cut off was set as 20°; (3) the S4 or σ φ values remain above the threshold for more than 15 min; (4) for equatorial and low latitude stations, this 15-min interval refers to the post sunset period and for high latitudes, this 15-min interval refers to any time during the day. Days on which the S4 or σ φ values were less than 0.3 for the whole day were categorized as non-scintillation. In addition, to confirm the selection of scintillation and non-scintillation days manual visual inspection of the figures illustrating the temporal profiles of the S4 and σ φ was performed. The list of days used for the validation of the developed models at stations IQA, BRO, NYA, PRU, PAL and POA are shown in Table 3. It is to be noted that the days listed for IQA and PRU were chosen to validate the models at stations IQL and UFP equipped with Topcon NET-G3A and Trimble NETR9 receivers, respectively.
The goodness of fit of the probability of loss of lock models was evaluated by using the correlation coefficient, R and Root Mean Squared Error (RMSE) of the residuals. R is a measure of how well the model estimates the variability in the observations and a higher value for R indicates that the model represents the trend of the observations well. The RMSE is a measure of how accurately the model estimates the observations themselves and a low value of the RMSE indicates the closeness between the observations and their model estimations. The goodness of fit of the GDOP normalized 3D positioning error models was evaluated by using the RMSE of the residuals.

Results and discussions
The development and validation of the statistical models following the approach of nonlinear regression to estimate the probability of loss of lock at high and low latitudes are described in the following sub-section. A subsequent subsection describes the development and validation of the GDOP normalized 3D positioning error models.

Probability of loss of lock
The statistical models for high latitude were developed using the data recorded at the IQA station in the Canadian High Arctic Ionospheric Network (Jayachandran et al. 2009), incorporating the dependence on the scintillation levels. This station was chosen because it lies at the boundary of the auroral oval, where the occurrence of high latitude scintillation is frequent (Spogli et al. 2009). Using the statistical approach of nonlinear regression, the models developed for the high latitude have the following forms: where σ φ is the phase scintillation index, ROT rms is the rate of change of TEC index and erf is the standard error function. The top and bottom panels of Figure 4 show respectively the probability of loss of lock modeled as a function of σ φ and ROT rms . In this figure, the symbols indicate the observed values of probability, whereas the curve indicates the model estimated values. It can be observed from Fig. 4 that, as expected, when scintillation levels are low, very few loss of lock events are observed, and as the scintillation level increases, most samples are associated with a loss of lock. The values of σ φ and ROT rms corresponding to 50% probability of losing lock are 0.86 radians and 3.9 TECU/min, respectively. The results from the days listed in Table 3 for the validation of the models represented in (2) and (3) are summarized in Table 4. The table shows the average correlation coefficient R and RMSE along with the number of days used for validation.
To further assess the suitability of the models represented in (2) and (3), these models were also tested at the European high latitude stations of BRO and NYA, which did not contribute any data to the model development. The results from the validation days listed in Table 3 are shown in Table 5.
The data recorded at station PRU in Brazil was used to develop the statistical models for the low latitudes. This station was chosen because it lies close to the crest of the Equatorial Ionization Anomaly (EIA), where the occurrence of low latitude amplitude scintillation is strong and frequent (Basu et al. 2002). The models developed for the low latitudes have the following forms: where S4 is the amplitude scintillation index, ROT rms is the rate of change of TEC index, and erf is the standard error function. The top and bottom panels of Fig. 5 show, respectively, the probability of loss of lock modeled as a function of S4 and ROT rms . In this figure, the symbols and curve indicate, respectively, the observed and model estimated values of probability. It can be noted from Fig. 5 that the probability of losing lock is 23% for S4 and ROT rms values of 1 and 4, respectively. The results presented in Carrano and Groves (2010) showed that at the low latitude station of Ascension Island, an S4 value of 0.97 corresponds to 90% probability of losing lock for an Ashtech Z-XII receiver. This discrepancy in the S4 values may be attributed to the differences in tracking parameters and characteristics of the specific receivers considered in the two studies. The validation results of the models represented in (4) and (5) for the station PRU on the days listed in Table 3 are summarized in Table 6.
Similarly, to the high latitude models, the suitability of the models represented in (4) and (5) was also tested at the Brazilian low latitude stations POA and PAL, which did not contribute any data to the model development. Table 7 summarizes the validation results for these two stations for the days listed in Table 3.
The validation results presented in Tables 4, 5, 6 and 7 indicate that both the developed high and low latitude models perform well under scintillation and non-scintillation conditions. The average values of the correlation coefficient, R are mostly above 0.7, indicating a good correlation between the observed trend and the model estimated trend. The average RMSE values are below 0.35, indicating closeness between the observed and the model estimated values. These results indicate a good performance of the developed models.

(5)
Prob lol lowlat ROT (%) = 50. 1 + erf (ROT rms − 6.658 ∕4.869 ] The following sub-section describes the development and validation of the 3D positioning error models at high and low latitudes as a function of the scintillation levels.

GDOP normalized 3D positioning error
Similarly to the probability of loss of lock models, the statistical models to estimate the 3D positioning errors at high and low latitudes were developed following the approach of nonlinear regression using data recorded at the IQA and PRU stations, respectively. The models developed for the high and low latitudes have the forms represented, respectively, in equations (6), (7) and (8) where the 3Dposerr norm is the GDOP normalized 3D positioning errors, and the scintillation levels are represented by σ φ , S4 and ROT rms . It is clear from the above equations that the 3D positioning errors at high and low latitudes increase exponentially with increasing scintillation levels. This is in agreement with the results presented in Jacobsen and Dähnn (2014), where they have shown that for high latitude receivers, located above 64°N, there is a strong positive correlation between 3D position errors and Rate of TEC index (ROTI), with the positioning errors increasing exponentially with increasing ROTI.
Sample results showing the validation of the models represented in (6) and (8)   defined as the difference between the observed and the model estimated positioning errors, along with the mean and the standard deviation. It is worth noticing that the distribution of the residuals is well centered close to zero, with a mean = 0.0084 m in Fig. 6 and −0.0367 m in Fig. 7, i.e., the model estimation is accurate and that STD in Figs. 6 and 7, respectively, is 0.0285 m and 0.0364 m, providing the precision of the model estimations. The RMSE of the residuals in Figs. 6 and 7 is 0.0297 m and 0.0517 m, respectively. The low values for these statistical metrics indicate that the models were able to accurately estimate the values of the actual observed errors. The variations of the σ φ and S4 scintillation indices are shown by blue lines in Figs. 6 and 7, respectively. More importantly, it is possible to see, especially from Fig. 7 that the PPP solution has a convergence time of around 45 min, longer than expected, which is due to the effect of strong scintillation. It is to be noted that in the absence of scintillation, an accuracy at the level of a few cms is expected in the estimated 3D position components after the initial convergence period of about 20 min. Nevertheless, it can also be observed that during that period the model retrieves reasonably well the actual errors, confirming the ability of the model to estimate the effects of scintillation on the positioning solution properly. The results from the days selected using the data sacrifice strategy for the validation of the high and low latitude models represented in (6), (7), (8) and (9) are summarized in Tables 8 and 9, respectively. It can be observed from Tables 8 and 9 that the average RMSE values are all below 10 cm, which is within the expected positioning accuracy from an epoch-wise PPP solution (Bisnath and Gao 2009), thus suggesting a good performance of the developed models. Additionally, from Tables 8 and 9 it can be observed that the 3D positioning error models developed for the high and low latitudes as a function of ROT rms i.e., (7) and (9), were validated using data recorded at stations IQL and UFP, respectively. It was not possible to validate the performance of the developed models based on the scintillation indices, i.e., (6) and (8), as the stations IQL and UFP are, respectively, equipped with a Topcon NET-G3A and Trimble NETR9 receiver, which do not output the scintillation indices. On comparing the model performance during the scintillation and non-scintillation days at high and low latitudes, it can be observed from Tables 8 and 9 that although the average RMSE values at IQL and UFP stations are smaller than 10 cm, these are higher as compared to the other stations located in the same latitudinal sector. The average RMSE value during the scintillation days for the two high latitude stations IQL and IQA, located very close to each other, is 0.0854 m and 0.0332 m, respectively. Similarly, for the two closely located low latitude stations UFP and PRU the average RMSE values for the scintillation days are 0.0757 m and 0.0586 m, respectively. This suggests that the developed models presented here can provide a lower bound for positioning errors experienced by conventional receivers such as Trimble and Topcon.

Conclusion
The development and validation of statistical models to estimate the probability of loss of lock and GDOP normalized 3D positioning error developed under the EC funded IPS project have been presented. The statistical technique of nonlinear regression was applied to develop the models as a function of the scintillation levels represented by the phase and amplitude scintillation indices and the ROT rms . Long-term data collected by ISMRs in operation at the high and low latitudes during the years of 2012-2015 was exploited to develop these models. The validation of the developed models was carried out at different stations covering a range of latitudes and equipped mostly with ISMRs, which yielded results that indicate their suitability to estimate the probability of satellite loss of lock and the 3D positioning errors over that range of latitudes. The validation results for the probability of loss of lock models at the high and low latitude stations indicated average R values above 0.7 and average RMSE values below 0.35, thus suggesting that the developed models perform well under scintillation and non-scintillation conditions. In the case of the 3D positioning error models, the validation results at the high and low latitudes indicated average RMSE values below 10 cm, suggesting a good performance of the models. Although the statistical model development and validation has been carried out using data from a limited number of ISMRs and conventional receivers, the encouraging results obtained indicate that these statistical models can be used to provide meaningful information to GNSS users under scintillation conditions. RCH/15/8381]. This work was supported by the Engineering and Physical Sciences Research Council [Grant Number EP/H003479/1], data from BRO station are from a network of GNSS receivers established through the EPSRC research grant. Data from the PRU, POA and PAL stations in Brazil are part of the CIGALA/CALIBRA network. Monitoring stations from this network were deployed in the context of the Projects CIGALA and CALIBRA, both funded by the EC in the framework of the FP7-GALILEO-2009-GSA and FP7-GALILEO-2011-GSA-1a, respectively, and FAPESP Project Number 06/04008-2. Data from NYA station are from the network maintained by INGV. Authors thank Dr. Vincenzo Romano, Dr. Giorgiana De Franceschi, Dr. Lucilla Alfonsi and Dr. Ingrid Hunstad for the valuable support in maintaining the infrastructure and ensuring the quality of the NYA data.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.