A Study on Bayesian Principal Component Analysis for Addressing Missing Rainfall Data

Water Resources Management

Abstract

This paper proposes applying the Bayesian Principal Component Analysis (BPCA) algorithm to address missing rainfall data in Kuching City. The experiment used six combinations of rainfall data from neighbouring rainfall stations at seven levels of missing data entries (1%, 5%, 10%, 15%, 20%, 25% and 30%). The performance of the BPCA model in reconstructing the missing data was evaluated using Bias (Bs), Efficiency (E) and Root Mean Square Error (RMSE). The reliability and robustness of BPCA were confirmed by comparing its performance with that of the K-Nearest Neighbour (KNN) imputation model. The results indicate that adding data from neighbouring rainfall stations improves imputation accuracy.
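The evaluation protocol summarised in the abstract (hide a fraction of entries, impute them, score the result) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the formulas, so Bs is assumed here to be the mean error and E the Nash-Sutcliffe efficiency; the rainfall values are synthetic rather than Kuching data; BPCA itself is not implemented, and the KNN baseline is a simplified version that averages the target station's values on the k days whose neighbouring-station readings are closest. All function names are hypothetical.

```python
import math
import random

def bias(observed, estimated):
    """Assumed definition of Bs: mean of (estimated - observed)."""
    return sum(e - o for o, e in zip(observed, estimated)) / len(observed)

def efficiency(observed, estimated):
    """Assumed definition of E: Nash-Sutcliffe efficiency (1 is a perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return float("nan") if sst == 0 else 1.0 - sse / sst

def rmse(observed, estimated):
    """Root mean square error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)

def mask_entries(n, fraction, seed=0):
    """Choose round(fraction * n) indices (at least one) to treat as missing."""
    rng = random.Random(seed)
    count = max(1, round(fraction * n))
    return sorted(rng.sample(range(n), count))

def knn_impute(target, neighbours, missing, k=3):
    """Simplified KNN baseline: for each missing day, average the target
    values of the k known days with the closest neighbouring-station readings."""
    missing_set = set(missing)
    known = [i for i in range(len(target)) if i not in missing_set]
    filled = {}
    for i in missing:
        nearest = sorted(
            known, key=lambda j: math.dist(neighbours[i], neighbours[j])
        )[:k]
        filled[i] = sum(target[j] for j in nearest) / len(nearest)
    return filled

# Synthetic daily rainfall (mm): two neighbouring stations and one target
# station that is loosely correlated with them. Purely illustrative data.
rng = random.Random(42)
neighbours = [[rng.gammavariate(0.8, 10.0), rng.gammavariate(0.8, 10.0)]
              for _ in range(200)]
target = [max(0.0, 0.6 * a + 0.4 * b + rng.gauss(0.0, 1.0))
          for a, b in neighbours]

# Missing-data levels taken from the abstract.
for fraction in (0.01, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30):
    missing = mask_entries(len(target), fraction, seed=1)
    filled = knn_impute(target, neighbours, missing, k=3)
    observed = [target[i] for i in missing]
    estimated = [filled[i] for i in missing]
    print(f"{fraction:.0%}: Bs={bias(observed, estimated):.3f} "
          f"E={efficiency(observed, estimated):.3f} "
          f"RMSE={rmse(observed, estimated):.3f}")
```

Scoring only the deliberately hidden entries keeps the comparison fair, because the true values are known for exactly those entries. Any imputation model, including a BPCA implementation, could be substituted for `knn_impute` as long as it returns estimates for the same indices.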

Acknowledgements

The authors wish to thank the reviewers, whose feedback contributed ideas and insights that improved this paper. This research did not receive any specific grant or funding from agencies in the public, commercial or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Wai Yan Lai.

Ethics declarations

Conflict of Interest

No potential conflict of interest was reported by the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lai, W.Y., Kuok, K.K. A Study on Bayesian Principal Component Analysis for Addressing Missing Rainfall Data. Water Resour Manage 33, 2615–2628 (2019). https://doi.org/10.1007/s11269-019-02209-8

  • DOI: https://doi.org/10.1007/s11269-019-02209-8
