Empirical Data Assimilation for Merging Total Electron Content Data with Empirical and Physical Models

Surveys in Geophysics

Abstract

Accurate estimation of ionospheric variables such as the total electron content (TEC) is important for many space weather, communication, and satellite geodetic applications. Empirical and physics-based models are often used to determine TEC in these applications. However, these models are known not to reproduce all ionospheric variability, for reasons such as their simplified model structure, the coarse sampling of their inputs, and their dependence on the calibration period. Bayesian data assimilation (DA) techniques are often used to improve the performance of these models, but their computational cost is considerable. In this study, we first review the available DA techniques for upper atmosphere data assimilation. We then present an empirical decomposition-based data assimilation (DDA) approach, based on principal component analysis and the ensemble Kalman filter. DDA considerably reduces the computational complexity of previous DA implementations. Its performance is demonstrated by updating the empirical orthogonal functions of the empirical NeQuick and the physics-based TIEGCM models, using the rapid global ionosphere map (GIM) TEC products as observations. The new models, called ‘DDA-NeQuick’ and ‘DDA-TIEGCM,’ respectively, are then used to predict TEC values for the next day. Comparisons of the TEC forecasts with the final GIM TEC products (which become available after 11 days) show average root mean squared error (RMSE) reductions of \(42.46\%\) and \(31.89\%\) during our test period, September 2017.

Acknowledgements

The authors acknowledge the TEC estimates from the IGS products (https://cddis.nasa.gov/), which were freely available to us. The source codes for the simulation models used in this study, NeQuick and TIEGCM, are freely available at https://t-ict4d.ictp.it/nequick2 and https://www.hao.ucar.edu/modelling/tgcm/, respectively.

Funding

E. Forootan acknowledges financial support from the Danmarks Frie Forskningsfond [10.46540/2035-00247B].

Author information

Corresponding author

Correspondence to Saeed Farzaneh.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest with respect to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Evaluation measures

To numerically evaluate the performance of the original and DDA models against the observations, the following statistical measures are applied:

  • ‘Bias’ is defined as:

    $$\begin{aligned} \text {Bias}=\frac{1}{n}\sum _{i=1}^{n}(\text {Obs}_{i}-\text {Model}_{i}), \end{aligned}$$
    (27)

    where \(\text {Obs}\) and \(\text {Model}\) denote the observations and model estimates, respectively, and n is the number of observations. Positive (negative) values of the bias indicate that the model underestimates (overestimates) the observations.

  • The expression of bias in percentage is computed based on the ‘relative error (RE)’ as:

    $$\begin{aligned} \text {RE}=100 \times \sum _{i=1}^{n}\left( \frac{|\text {Obs}_{i}-\text {Model}_{i}|}{\text {Obs}_{i}}\right) , \end{aligned}$$
    (28)

    where \(|\cdot |\) denotes the absolute value.

  • Standard deviation (STD) determines the dispersion of a dataset relative to its mean and is calculated as:

    $$\begin{aligned} \text {STD}=\sqrt{\frac{\sum _{i=1}^{n}(\text {Obs}_{i}-\bar{\text {Obs}})^{2}}{n}} \end{aligned}$$
    (29)

  • ‘Root mean squared error (RMSE)’ measures how closely the model estimates match the observations:

    $$\begin{aligned} \text {RMSE}=\sqrt{\frac{\sum _{i=1}^{n}(\text {Obs}_{i}-\text {Model}_{i})^{2}}{n}} \end{aligned}$$
    (30)

    Because the differences are squared, positive and negative deviations both contribute to the RMSE, and large deviations are weighted more heavily than small ones.

  • ‘Improvement’ is defined as the percentage reduction in RMSE after implementing DDA:

    $$\begin{aligned} \text {Improvement}=100\times \frac{\text {RMSE}_1-\text {RMSE}_2}{\text {RMSE}_1}, \end{aligned}$$
    (31)

    where \(\text {RMSE}_1\) is computed between the original model (NeQuick or TIEGCM) and the GIM-VTECs, and \(\text {RMSE}_2\) is computed between the corresponding DDA model and the GIM-VTECs.

  • ‘Average of absolute percentage deviation (AAPD)’ is expressed as the percentage of absolute difference between observation and model as:

    $$\begin{aligned} \text {AAPD}=100\times \frac{\sum _{i=1}^{n}\left( |\frac{\text {Obs}_{i}-\text {Model}_{i}}{\text {Obs}_{i}}|\right) }{n}. \end{aligned}$$
    (32)

    Lower (higher) values of AAPD correspond to a better (worse) average performance of a model in estimating VTECs.

  • ‘Fit’ is determined as the fraction of data variance that is predicted by the model as:

    $$\begin{aligned} \text {Fit}=1-\frac{\sqrt{\sum _{i=1}^{n}(\text {Obs}_{i}-\text {Model}_{i})^{2}}}{\sqrt{\sum _{i=1}^{n}(\text {Obs}_{i}-\bar{\text {Obs}})^{2}}}, \end{aligned}$$
    (33)

    where \(\bar{\text {Obs}}\) is the mean of the observations. In contrast to AAPD, lower (higher) values of Fit correspond to a worse (better) average performance of a model in simulating VTECs.

  • ‘Correlation coefficients (CCs)’ are used as a unit-less measure to represent the overall agreement between model estimations and observations:

    $$\begin{aligned} \text {CC}= \frac{\sum _{i=1}^{n}{(\text {Model}_{i}-\bar{\text {Model}})(\text {Obs}_{i}-\bar{\text {Obs}})}}{\sqrt{\sum _{i=1}^{n}{(\text {Model}_{i}-\bar{\text {Model}})^{2}}\sum _{i=1}^{n}{(\text {Obs}_{i}-\bar{\text {Obs}})^{2}}}}. \end{aligned}$$
    (34)

    CCs range from \(-1\) to \(+1\), where \(-1\) indicates a perfect negative correlation, \(+1\) indicates a perfect positive correlation, and zero indicates no correlation.
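
The measures above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the inputs are assumed to be equal-length sequences of VTEC values with non-zero observations, and only the standard library is used. It covers Eqs. (27) and (29) to (34).

```python
import math
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def bias(obs: Sequence[float], model: Sequence[float]) -> float:
    """Eq. (27): mean of Obs - Model; positive means the model underestimates."""
    return _mean([o - m for o, m in zip(obs, model)])


def std(obs: Sequence[float]) -> float:
    """Eq. (29): standard deviation of the observations (divides by n)."""
    mean_obs = _mean(obs)
    return math.sqrt(_mean([(o - mean_obs) ** 2 for o in obs]))


def rmse(obs: Sequence[float], model: Sequence[float]) -> float:
    """Eq. (30): root mean squared error between observations and model."""
    return math.sqrt(_mean([(o - m) ** 2 for o, m in zip(obs, model)]))


def improvement(rmse_1: float, rmse_2: float) -> float:
    """Eq. (31): percentage RMSE reduction from rmse_1 (original) to rmse_2 (DDA)."""
    return 100.0 * (rmse_1 - rmse_2) / rmse_1


def aapd(obs: Sequence[float], model: Sequence[float]) -> float:
    """Eq. (32): average absolute percentage deviation; assumes no zero observations."""
    return 100.0 * _mean([abs((o - m) / o) for o, m in zip(obs, model)])


def fit(obs: Sequence[float], model: Sequence[float]) -> float:
    """Eq. (33): one minus the ratio of residual norm to centered observation norm."""
    mean_obs = _mean(obs)
    residual = math.sqrt(sum((o - m) ** 2 for o, m in zip(obs, model)))
    spread = math.sqrt(sum((o - mean_obs) ** 2 for o in obs))
    return 1.0 - residual / spread


def cc(obs: Sequence[float], model: Sequence[float]) -> float:
    """Eq. (34): Pearson correlation coefficient between model and observations."""
    mean_obs, mean_model = _mean(obs), _mean(model)
    cov = sum((m - mean_model) * (o - mean_obs) for o, m in zip(obs, model))
    var_model = sum((m - mean_model) ** 2 for m in model)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    return cov / math.sqrt(var_model * var_obs)


if __name__ == "__main__":
    # Made-up VTEC values (TECU) for illustration only; not data from the study.
    gim = [10.0, 20.0, 30.0, 40.0]
    original = [8.0, 17.0, 26.0, 35.0]
    dda = [9.0, 19.0, 29.0, 39.0]
    print("Bias (original):", bias(gim, original))
    print("Improvement (%):", improvement(rmse(gim, original), rmse(gim, dda)))
    print("AAPD (DDA, %):", aapd(gim, dda))
    print("Fit (DDA):", fit(gim, dda))
    print("CC (DDA):", cc(gim, dda))
```

In this sketch, `improvement` takes the two RMSE values rather than the raw series, mirroring Eq. (31), so the same function can be reused for either NeQuick or TIEGCM once both RMSEs against the GIM-VTECs have been computed.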

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Forootan, E., Kosary, M., Farzaneh, S. et al. Empirical Data Assimilation for Merging Total Electron Content Data with Empirical and Physical Models. Surv Geophys 44, 2011–2041 (2023). https://doi.org/10.1007/s10712-023-09788-7

  • DOI: https://doi.org/10.1007/s10712-023-09788-7
