Abstract
This paper proposes applying the Bayesian Principal Component Analysis (BPCA) algorithm to the problem of missing rainfall data in Kuching City. The experiment used six combinations of rainfall data from neighbouring rainfall stations, tested at seven proportions of missing entries (1%, 5%, 10%, 15%, 20%, 25% and 30%). The performance of the BPCA model in reconstructing the missing data was assessed using Bias (Bs), Efficiency (E) and Root Mean Square Error (RMSE). The reliability and robustness of BPCA were confirmed by comparing its performance with that of a K-Nearest Neighbour (KNN) imputation model. The results indicate that adding data from neighbouring rainfall stations improves imputation accuracy.
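The evaluation procedure described in the abstract (hide a known fraction of entries, impute them, then score the estimates) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details taken from the paper: Bias is taken as the mean error, Efficiency as the Nash-Sutcliffe coefficient, the rainfall values are made up, and a simple mean fill stands in for the imputer (it is neither BPCA nor KNN). The function names are hypothetical.

```python
import math
import random


def mask_entries(values, fraction, seed=0):
    """Hide a fraction of entries; return (masked copy, hidden indices)."""
    rng = random.Random(seed)
    n_missing = round(fraction * len(values))
    hidden = sorted(rng.sample(range(len(values)), n_missing))
    masked = list(values)
    for i in hidden:
        masked[i] = None  # None marks a missing entry
    return masked, hidden


def bias(observed, estimated):
    """Mean error (assumed definition of Bs): estimated minus observed."""
    return sum(e - o for o, e in zip(observed, estimated)) / len(observed)


def rmse(observed, estimated):
    """Root mean square error between observed and estimated values."""
    squared = sum((e - o) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(squared / len(observed))


def efficiency(observed, estimated):
    """Nash-Sutcliffe efficiency (assumed definition of E); 1.0 is perfect."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        return float("nan")  # undefined when the observations do not vary
    return 1.0 - sse / sst


if __name__ == "__main__":
    # Hypothetical daily rainfall values (mm), not data from the paper.
    rainfall = [0.0, 2.5, 10.0, 0.0, 35.5, 4.0, 0.0, 12.5, 1.0, 20.0]
    masked, hidden = mask_entries(rainfall, fraction=0.30)

    # Placeholder imputer: fill each gap with the mean of the known values.
    known = [v for v in masked if v is not None]
    fill = sum(known) / len(known)

    truth = [rainfall[i] for i in hidden]
    guess = [fill for _ in hidden]
    print("Bs  :", bias(truth, guess))
    print("E   :", efficiency(truth, guess))
    print("RMSE:", rmse(truth, guess))
```

Scoring only the hidden positions keeps the comparison fair, since the known entries would otherwise inflate every metric. Swapping the mean fill for another imputer leaves the scoring code unchanged.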
Acknowledgements
The authors wish to thank the reviewers, whose feedback contributed ideas and insights that improved this paper. This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.
Ethics declarations
Conflict of Interest
No potential conflict of interest was reported by the authors.
About this article
Cite this article
Lai, W.Y., Kuok, K.K. A Study on Bayesian Principal Component Analysis for Addressing Missing Rainfall Data. Water Resour Manage 33, 2615–2628 (2019). https://doi.org/10.1007/s11269-019-02209-8
DOI: https://doi.org/10.1007/s11269-019-02209-8