
An imputation method for the climatic data with strong seasonality and spatial correlation

  • Original Paper
  • Theoretical and Applied Climatology

Abstract

Missing data are frequently found in instrumental climatic records and hinder statistical analyses of climate change. A novel imputation method, called Imputation Based on Decomposition of Time Series (IBDTS), was developed in this article for climatic data with strong seasonality and spatial correlation. The method first decomposes each time series into three components and then predicts the missing values in each component: the trend component by regression analysis, the seasonal component by spectral analysis, and the remainder component by spatial interpolation. The IBDTS method showed relatively small imputation errors and preserved the real attributes of the climatic series, including the linear trend and the amplitude and phase of the 12-month cycle. Its sensitivity to station distance was relatively small. In addition, the IBDTS method can handle data sets with no or only a few complete series, and it may be applied not only in climatology but also in other fields, as long as the data have strong seasonality and spatial correlation.
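The three-component idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' IBDTS implementation, and the abstract does not specify the exact estimators. As stand-ins, the sketch assumes a joint least-squares fit of a straight line plus a single 12-month harmonic (in place of separate regression and spectral analysis steps) and inverse distance weighting for the spatial interpolation of the remainder. The function names `fit_trend_and_cycle` and `impute_sketch`, the `power` parameter, and the planar station coordinates are all hypothetical.

```python
import numpy as np


def fit_trend_and_cycle(y, period=12):
    """Fit intercept + linear trend + one harmonic to the observed values.

    Assumption: a joint least-squares fit stands in for the separate
    regression (trend) and spectral (seasonal) steps named in the abstract.
    """
    t = np.arange(len(y), dtype=float)
    w = 2.0 * np.pi / period
    # Design matrix: constant, linear trend, cosine and sine of the cycle.
    X = np.column_stack([np.ones_like(t), t, np.cos(w * t), np.sin(w * t)])
    observed = ~np.isnan(y)
    beta, *_ = np.linalg.lstsq(X[observed], y[observed], rcond=None)
    # Fitted trend + seasonal value at every time step, gaps included.
    return X @ beta


def impute_sketch(data, coords, period=12, power=2.0):
    """Fill NaN gaps in `data` (n_stations x n_times).

    `coords` is (n_stations x 2). Assumption: stations have distinct
    coordinates, so no pairwise distance between stations is zero.
    """
    data = np.asarray(data, dtype=float)
    coords = np.asarray(coords, dtype=float)

    fitted = np.vstack([fit_trend_and_cycle(row, period) for row in data])
    remainder = data - fitted  # stays NaN wherever the observation is missing

    # Pairwise Euclidean distances between stations.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

    filled = data.copy()
    for s, t in zip(*np.where(np.isnan(data))):
        # Stations with an observed remainder at time t (never station s itself).
        donors = ~np.isnan(remainder[:, t])
        if donors.any():
            weights = 1.0 / dist[s, donors] ** power
            r = np.sum(weights * remainder[donors, t]) / np.sum(weights)
        else:
            r = 0.0  # no neighbour reported: fall back to trend + cycle only
        filled[s, t] = fitted[s, t] + r
    return filled
```

Because the trend and cycle are estimated per station from that station's own observed values, a sketch of this kind does not need any complete series, which mirrors the property claimed in the abstract. The spatial step only borrows the remainder from neighbouring stations, so differences in station mean and seasonal amplitude are not carried over.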



Acknowledgements

This work was supported by the Chinese Ministry of Science and Technology (MOST) National Key R&D Program (No. 2018YFA0605603) and the Science Foundation Program of Guangxi University of Science and Technology (No. 1711311). The authors thank Yunxin Huang, Tianlin Zhai, and Huqiang Qin for their kind assistance during manuscript writing, and the reviewers for providing constructive comments, which greatly improved this manuscript.

Corresponding author

Correspondence to Guoyu Ren.

About this article

Cite this article

Qin, Y., Ren, G., Zhang, P. et al. An imputation method for the climatic data with strong seasonality and spatial correlation. Theor Appl Climatol 144, 203–213 (2021). https://doi.org/10.1007/s00704-021-03537-9

  • DOI: https://doi.org/10.1007/s00704-021-03537-9
