Efficiency of artificial neural networks in map of total electron content over Iran

Maps of the total electron content (TEC) of the ionosphere can be reconstructed using data extracted from global positioning system (GPS) signals. For historic and other sparse data sets, the reconstruction of TEC images is often performed using multivariate interpolation techniques. In this paper, a quantitative comparison of the ability of artificial neural networks (ANN), polynomial fitting and kriging interpolation was carried out in order to model the spatial variations of TEC using GPS data over Iran. These methods are suitable for handling multi-scale phenomena and unevenly distributed data. The observations collected at 25 GPS stations from Iranian permanent GPS network (uniformly spread all over Iran with sampling rate of 30-seconds). Dual frequency carrier phase and code GPS observations were used. A smoothed TEC approach was used for absolute TEC recovery. Evaluation of the methods has been applied with single GPS station in Tehran equipped with ionosonde instrument. The minimum relative error for ANN, polynomial and kriging are 4.37, 6.35, 9.13 % and the maximum relative error are 8.61, 29.06, and 20.14 % respectively. Also root mean square error (RMSE) of 3.7 TECU is computed for ANN method which is less than RMSE of other mentioned methods. The results show that ANN method has higher accuracy and compiles speed than kriging and polynomial. As well as, it is found that polynomial and kriging methods required many computational points in adjustment step.


Introduction
When the molecules and atoms of the atmosphere receive enough external energy, one or more electrons are dissociated from the molecules or atoms. This process is called ionization. The solar ultraviolet (EUV) radiation and particle precipitation are the two primary energy sources in the ionization (Schunk and Nagy 2000). Also cosmic radiation contributes to this ionization. This layer of atmosphere is called ionosphere. The ionosphere is that part of the atmosphere in which the number of free electrons is so high that, it significantly affects the propagation of radio waves. Ionospheric refraction is one of the main error sources on GPS signals. This effect is proportional to the total electron content (TEC). TEC is a projection of electron density along signal path extending from the satellite to the receiver on the ground. The unit of TEC is TECU and 1 TECU equals 10 16 electrons/m 2 (Seeber 2003). Production of free electrons in the ionosphere depends on many factors, such as solar, geomagnetic, gravitational and seismic activity period.
There are many methods to obtain electron density or TEC profiles and predictions. In early time, direct measurements such as ionosonde was a kind of effective instrument to achieve this purpose (Kelley 1989). Later, some empirical and mathematical models were developed. For example, international reference ionosphere (IRI) model, the parameterized ionospheric model (PIM) are empirical models (Klobuchar 1975;Schaer 1999). Mathematical models divided to two categories: single-layer (2-D) and multi-layer (3-D and 4-D).
The existing 2-D methods of modeling the electron density can be classified to non-grid based and grid based techniques (El-Arini et al. 1995). The former modeling techniques are based on the least squares estimation of a functional model for certain types of observables derived from the GPS carrier phase and code measurements. Polynomials and spherical harmonics are some of the base functions that are commonly in use (Komjathy and Langley 1996;Schaer 1999). In grid based modeling, the spherical shell of free electrons is developed into a grid of rectangular elements. Special reconstruction algorithms are then used for estimating the electron density within the every element of the shell (El-Arini et al. 1995;Gao et al. 2002;Skone 1998;Liu and Gao 2003). Neglecting the vertical gradient of the electron density is the main deficiency of the two dimensional modeling techniques.
In the past decades, three and four dimensional models have been developed such as: 3-D tomographic models (Howe et al. 1998;Hansen et al. 1997;Hernandez-Pajares et al. 1999;Colombo et al. 1999;Liu 2004) and 4-D models (Zeilhofer et al. 2009;Nohutcu et al. 2010;Amerian et al. 2010). These methods have two main disadvantages: first, the observation used to reconstruct electron density is usually limited in time and space, so electron density cannot be obtained in any time and space. Second, due to the sparse distributions of GPS stations and lack of observations, ionosphere tomography is an inverse ill-posed problem (Yao et al. 2014). These limitations led to use interpolation and extrapolation techniques to estimate the ionosphere electron density. Once the TEC is known, it is possible to determine the ionospheric delay on the GPS signals. As the ionosphere is dispersive; the delay is function of signal frequency. Using dual frequency GPS receivers, electron content of the ionosphere is known. With the help of two frequencies, it is possible to compute the total electron content of the ionosphere in an arbitrary station.
One of the traditional methods in single-frequency GPS receivers to eliminate the ionospheric delay is using conventional models (Coster et al. 2003;EL-Arini et al. 1993, 1994, 1995Skone 1998;Komjathy and Langley 1996;Liu and Gao 2003). With the creation of local and regional networks, it is possible to acquire TEC in regular ionospheric grids. Using the regular ionospheric grids, the prediction of TEC in other parts of network is possible. Once the TEC is predicted, it is possible to correct ionospheric refraction in single frequency GPS receiver. So far, several different interpolation methods are used to predict TEC values. Spherical harmonics, Spline interpolation, Gaussian processes are some of the examples used methods to predict TEC values for the locations where physical data are not exist (Moon 2004;Schaer 1999;Sayin et al. 2008;Hirooka et al. 2011).
Recently it has become clear that the techniques derived from artificial intelligence research and modern computer science provide a number of system aids to analyze and predict the behavior of complex solar-terrestrial dynamic systems (Cander 1998). Methods of artificial intelligence have provided tools which potentially make the task of ionospheric modeling possible. ANN provides an inexplicit non-linear model to learn relations between inputs and outputs using training data (Cander 1998).
Due to the nonlinearity of ionosphere physical properties, in this paper, we use multilayer perceptron artificial neural networks (MLP-ANN) to model and predict the spatial variations of TEC over Iran. The used model is able to estimate and predict the TEC within and also near the network. Also for evaluating and comparing of ANN ability in TEC modeling and predicting, two other methods are implemented. Kriging interpolation method and polynomial fitting with 10 coefficients are used for TEC prediction. In both cases, least squares method is used for coefficients estimation.
This paper includes the following sections: in Sect. 2 computation of the TEC from dual frequency GPS receivers is explained. In Sect. 3, a brief explanation of artificial neural networks is provided. The back-propagation algorithm and its training are described in more details. Section 4, explains kriging method and its formulation. In Sect. 5, polynomial fitting and coefficients estimation is explained. The study data and obtained results with their corresponding analysis are presented in Sect. 6. Finally in Sect. 7, advantages and disadvantages of this type of modeling are discussed.

Observations
Dual frequency GPS receivers provide carrier phase U i (i = 1, 2) and code P i (i = 1, 2) observations on L-band (L 1 , L 2 ) frequencies (Seeber 2003): In which: where q is the geometric distance between receiver and a satellite (m), c is the speed of light (m/s), dt is the receiver clock error with respect to GPS time (s), dT is the satellite clock error with respect to GPS time (s), d orb is the satellite orbit error (m), k i is the wavelength of the GPS signal on L i frequency, N i is the carrier phase integer ambiguity (cycle), d trop is the troposphere delay (m), I is the ionospheric delay (m), d mult is the multipath effect (m), b p1 and b p2 are the satellite hardware delay (m) on code pseudorange measurements, b U1 and b U2 are the satellite hardware delay (m) on phase measurements, B p1 and B p2 is the receiver hardware delay (m) on code pseudorange measurements, B U1 and B U2 is the receiver hardware delay (m) on phase measurements, e is the measurement noise (m) and f is signal frequency. If no cycle slip occurs during the measurements in N successive epochs, the recursive equation to calculate the mean DTEC N at epoch N is given as below (Skone 1998): In Eq. (6) DTEC n represent the difference of code TEC (TEC R,n ) and phase TEC (TEC U,n ) at an arbitrary epoch n. The smoothed absolute TEC SM at epoch N is expressed as TEC SM,N and it can be calculated by (Liu 2004): TEC in zenith (VTEC) can be calculated as follows: In which: In Eq. (9), 'E' is satellite elevation angle. The VTEC value obtained from Eq. (8) can be used as output training data of ANN, kriging and polynomial.

Artificial neural network (ANN)
Neural network is an information processing system which is formed by a large number of simple processing elements, known as artificial nerves (Haykin 1994). It is formed by a number of nodes and weights connecting the nodes (Stanley 1990). The input data are multiplied by the corresponding weight and the summation are entered into neurons. Each neuron has an activation function. Inputs pass to the activation function and determine the output of neurons. The number of neurons and layers could be obtained through trial and error according to a specific problem (Simpson 1990). Using training data, the designed ANN can be adjusted in an iterative procedure to determine optimal parameters of ANN.

Multi-layer perceptron neural network (MLP-NN)
One of the simplest and effective methods to use in modeling of real neurons is multi-layer perceptron neural network. This model has been established of one input layer, one or more hidden layers and one output layer. In this structure, all the neurons in one layer are connected to all neurons of the next layer. This arrangement is commonly called a network with full connectivity (Mars et al. 1996). Neuron numbers in each layer is determined independently. Figure 1 shows the scheme of a three-layer perceptron network.
Processing on the input parameters of the neural network can be done using the following function (Norgaard 1997): where y k is the neuron output, f is the activation function, m is the number of input parameters, x i is the i-th input parameter, w ki is the i-th synaptic weight and b k is the bias. Sigmoid activation function can be defined as follows (Haykin 1999): An important issue in multi-layer artificial neural networks is the number of neurons. The neurons of input and output layers are determined according to the number of input and output parameters. The number of neurons in the hidden layer can be determined by trial and error through minimizing total error of the ANN. For this minimization, each ANN parameter's share in the total error should be computed which can be achieved by a back-propagating algorithm (Mars et al. 1996).

Back-propagation algorithm (BPA)
In the general case, learning of ANNs falls in two categories: fixed-weight and variable weights (learning network). Learning networks are divided into supervised and unsupervised (Rumelhart et al. 1986). In supervised networks, training step is done using specified data that their values are pre-determined (Mars et al. 1996). Two ways is feasible for feeding input parameters to the neural network: batch mode and pattern mode (Mars et al. 1996). Usually in a multi-layer perceptron, pattern mode is used. For training of the network and modifications of the weights, there are so many ways. One of the most famous and simplest methods is back-propagation algorithm which trains network in two stages: feed-forward and feed-backward (Mars et al. 1996). In feed-forward process, input parameters move to output layer. In this stage, output parameters are compared with known parameters and the errors is identified. The next stage is done feed-backward. In this stage, the errors move from output layer to input layer. Again, the input weights are calculated. These two stages are repeated until the errors reaches a threshold expected for output parameters. In this paper back-propagation algorithm is used for training and calculating VTEC. The input of neural network is latitude and longitude of the GPS stations and output parameter is VTEC. Usually input data is separated into three groups: training, testing and validation data. The training set is used for instruction and determines the weight of neurons. In this research results were analyzed by assuming both absolute and relative errors. The absolute errors can be computed according to: where VTE k is the computed value of VTEC, in TECU, and VTEC e is the estimated value of VTEC, in TECU. The relative errors can be computed according to: The less the absolute and relative errors are [as given by Eqs. (12) and (13)], the closer are the predicted VTEC e (given by our neural network model) and the computed VTEC (determined from dual frequency receivers) used as reference. Due to the direct relationship between TEC and ionospheric delay, we can correct the ionospheric delay with a similar accuracy of the estimated TEC. The results of VTEC estimations can be regarded as an estimated accuracy for correcting the ionospheric delay to single frequency receivers.

Kriging interpolation
Kriging is probably the most widely used technique in geostatistics to interpolate data. It was formalized in the sixties by a French engineer George Matheron (1963) after the empirical work of Danie G. Krige (1951). Kriging interpolation is a two-step process: first a regression function f(x) is constructed based on the data and a gaussian process Z is constructed through the residuals (Coukuyt et al. 2013): where f(x) is a regression function and Z is a gaussian process with mean 0, variance r 2 and a correlation matrix w. Depending on the form of the regression function, kriging has been prefixed with different names. Simple kriging assumes the regression function to be a known constant, f(x) = 0. A more popular version is ordinary kriging, which assumes a constant but unknown regression function f(x) = a 0 . In universal kriging, more complex trend functions such as linear or quadratic polynomials are used (Coukuyt et al. 2013): where b i (x) are i = 1 … p basis functions and a = (a 1 ,…,a p ) denotes the coefficients. The idea is that the regression function captures the largest variance in the data and that the gaussian process interpolates the residuals. In fact, the regression function f(x) is actually the mean of the broader gaussian process Y. However, selecting the correct regression function is a difficult problem, hence; the regression function is often chosen to be constant. Consider a set of n samples, X = (x 1 ,…,x n ) in d dimensions and associated function values; Y = (y 1 ,…,y n ) essentially, the regression part is encoded in the n 9 p model matrix F: While the stochastic process is mostly defined by the n 9 n correlation matrix w: In which w(x n ,x n ) is the correlation function. w(x n ,x n ) is parameterized by a set of hyper parameters h which are identified by maximum likelihood estimation (MLE). Subsequently, the prediction mean and prediction variance of kriging are derived respectively, as: where y is a p 9 1 vector denoting the coefficients of the regression function, determined by generalized least squares (GLS), and r(x) = (w(x,x 1 ),…, w(x,x n )) is an 1 9 n vector of correlation between the point x and the sample X.

Polynomial fitting
As a classical approximation model, the 3D polynomial fitting technique is used to generate TEC as a function of coordinates. In this research, to assess the accuracy of ANN and kriging results, the electron content of the ionosphere is modeled with 3-order polynomial with 10 coefficients. In the general case, this polynomial is: where a ij shows polynomial coefficients and n is order of polynomial. Also (x,y) are considered coordinates of points used to determine the coefficients of the polynomial. We assume that function F(h, k) represents the ideal ionospheric model and is the non-linear function of variables latitude h and longitude k (Ouyang 2004): where F(h, k) is the VTEC with regard to latitude h and longitude k. Given F(h, k) = VTEC(h, k) can be defined as follows: Rewriting Eq. (22) as: In this study, the same training data are used for training of ANN, kriging and polynomial. Polynomials with different degrees and terms have been evaluated. Accuracy of modeling is defined by differences between true values and estimated values from polynomial. Polynomials from order 2 to 4 were tested; each fitted using a least squares technique with 21 input points. Due to limited input dataset, higher order polynomials are not tested. Even in the availability of such data, normal equations matrices would be illcondition in case of high order polynomials. The 3th order polynomial was therefore the best fitting model. The necessary number of points to calculate a two variables polynomial is: Fig. 2 Spatial distribution of the GPS stations used in this study (triangles indicate stations used in training, the circles indicate the stations used in assessment of neural network and a square indicate a station used in testing) where r is the number of order of the polynomial and p is the number of the necessary points. Therefore for this problem, at least 12 points are needed to solve the polynomial. By using 21 points we can solve the problems with some degree of freedom.

Analysis of results
Iran Geodynamic studies started since 1998 to monitor the variations in the earth's crust and tectonic movements. Permanent GPS network was designed and implemented gradually in 2004 to investigate the mechanisms of active faults in Iran. This network currently has 120 permanent GPS stations in the initial phase. Average distance between dense parts is about 25-30 km. From these 120 stations, 25 stations are selected for modeling ionospheric electron content over Iran since January 3, 2007. Figure 2 shows the spatial distribution of these stations. In this figure, black triangles indicate stations used in training, the green circles indicate the stations used in assessment of neural network and a blue square indicates a station used in testing. GPS stations used in this study are divided into three groups: 21 stations are used for training, 3 stations are used for validation and one station is used for testing the results. Only the training set participates in the learning, the validation set is used to avoid overfitting and the testing set is used to compute prediction error, which approximates the generalization error. After training process, the models were used to estimate the VTEC value for the test station. This value is then compared with the known VTEC value obtained with the techniques explained in Sect. 2. The difference between them shows the prediction error of the models. Using these techniques, we could analyze the performance of the models for predictions inside and at the edges of the area covered by the network. Table 1 shows the testing results for three interpolation methods in Tehran station. Two different times in the day and another couple of times at night are selected for analyzing the ionospheric behavior.
According to the results in Table 1, the minimum relative error obtained by ANN is 4.37 %, by kriging is 9.13 %, and by polynomial is 6.35 % and maximum relative error is 8.61, 20.14 and 29.06 % respectively. These results indicate the high potential of multilayer ANNs in TEC prediction than kriging and polynomial model in accuracy and computational time. Processing in this research illustrate that the kriging and polynomial model require numerous computational points in the adjustment stage. This is the limitation of these methods in TEC interpolations. Figure 3 shows the observed and predicted TEC in test station for the mentioned selected times.
To analyze the accuracy of the mentioned methods in TEC prediction, all cases compare with GPS TEC. Figures 4, 5 and 6 shows the scatter plot for GPS TEC with corresponding TEC predictions from the ANN, kriging and polynomial models over test station in January 3, 2007, with lines of best fit inserted for all cases as well as, prediction bond in 95 % confidence interval. Correlation coefficients give reliability levels of the ANN, kriging and polynomial models to predict GPS TEC. Also in all these figures the residuals of predicted TEC is shown.
From the considered times, GPS TEC is highly correlated to ANN TEC at 08:00 UT with a correlation coefficient (R) of 0.987 and lowest correlated at 02:00 UT with a correlation coefficient (R) of 0.873. Figure 5 shows the scatter plot for GPS TEC with corresponding TEC predictions from the kriging method also prediction bond in 95 % confidence interval and residuals of predicted TEC. Using Fig. 5 and from the considered times, GPS TEC is highly correlated to kriging TEC at 14:00 UT with a correlation coefficient (R) of 0.954 and lowest correlated at 22:00 UT with a correlation coefficient (R) of 0.868. Figures 6 shows the scatter plot for GPS TEC with corresponding TEC predictions from the polynomial method also prediction bond in 95 % confidence interval and residuals of predicted TEC.
In Fig. 6 and from the considered times, GPS TEC is highly correlated to polynomial TEC at 08:00 UT with a correlation coefficient (R) of 0.959 and lowest correlated at 22:00 UT with a correlation coefficient (R) of 0.788. These results showed that the ANN model predicts TEC more accurately than the kriging and polynomial models.
After evaluating the accuracy of the results obtained from trained ANN, kriging and polynomial, VTEC values would be predicted at different locations and times. Figure 7 shows the horizontal variations of the VTEC for selected times over Iran by developed methods (in 10 16 ele/m -2 ). All results are compared with international GNSS service (IGS) VTEC. These comparisons indicate that the ANN method has higher correspondence with IGS product for VTEC. According to Fig. 7, it can be easily deduced that there are temporal and spatial variations in the electron content of the ionosphere. It is seen that the VTEC reaches its maximum value at 08:00 UT. The characteristics which are the constituents of the ionosphere morphology are also reported elsewhere (Liu and Gao 2003;Yizengaw et al. 2007) and confirmed by the analysis of the direct measurement techniques.

Conclusion and future research
In this study, the artificial neural network multilayer perceptron (ANN-MLP), kriging interpolation and polynomial fitting were used to model and predict the ionosphere electron content. The average relative error of test station obtained from ANN-MLP, kriging, and polynomial are 6.135, 13.835, and 15.045 % respectively. In other words, ANN-MLP with accuracy of about *93 % seems to be the most efficient algorithm among others to model and predict the electron content of the ionosphere inside and at the edges of the area covered by the network. This value can be accurately modeled VTEC for positioning with single frequency GPS receivers. Also the scatter plot for GPS TEC with corresponding TEC predictions from three methods computed, as well as prediction bond in 95 % confidence interval and residuals of predicted TEC. In this case, GPS TEC is highly correlated to ANN TEC with a correlation coefficient (R) of 0.987 and lowest correlated with a correlation coefficient of 0.788 in polynomial TEC. Each of these methods has advantages and disadvantages. One of the benefits of TEC modeling with ANNs can be its simplicity and computational speed. It should be noted that in case of insufficient training data, it would lead to unreliable results. The great advantage of polynomial method is the possibility of driving new features which is due to its analytical form. Disadvantage of this model is overfitting problem that can occur in the case of higher order polynomials. Even though the spacing of tracking stations of the network used in this research is sparse, the model produced good predictions. By increasing number of stations, more accurate results are expected. As future works, more testing stations and other dataset during geomagnetic and solar activity can used and analyzed. Also developing of the proposed method for global TEC problem would be another interesting subject.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.