Abstract
Missing data estimation (MDE) for time series is a crucial issue in many applications based on the Internet of Things (e.g., sparse mobile crowdsensing). Although many approaches have been proposed to address this issue, they either give insufficient consideration to data correlations or are computationally expensive. Motivated by this, an echo state network (ESN) with bidirectional-feedback connections is first proposed to encode temporal, cross-domain, and lagging correlations into the high-dimensional state of the reservoir. Meanwhile, the output weight (the only unknown parameter of the network) can be trained quickly by reservoir computing, which reduces computational overhead. On this basis, an improved version with multiple reservoirs is designed to further integrate data correlations from information flows in different directions. Finally, a general ESN-based process for MDE is developed to ease adoption in applications. Experimental results at various missing rates under different missing mechanisms show that the proposed models achieve higher estimation accuracy than existing methods.
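For orientation, the fast training step mentioned in the abstract can be written out for a standard textbook echo state network. This is only a sketch of the generic formulation, not the bidirectional-feedback model proposed in the article. The symbols u(t), x(t), y(t), W_in, W, W_out, and the ridge parameter \lambda are assumed notation introduced here for illustration.

```latex
% Generic ESN (assumed notation; not the article's bidirectional-feedback variant)
% Reservoir state update with fixed, randomly initialized W_in and W:
\mathbf{x}(t) = \tanh\!\big(\mathbf{W}_{\mathrm{in}}\,\mathbf{u}(t) + \mathbf{W}\,\mathbf{x}(t-1)\big)

% Linear readout, the only trained part of the network:
\hat{\mathbf{y}}(t) = \mathbf{W}_{\mathrm{out}}\,\mathbf{x}(t)

% Closed-form ridge-regression solution, with the states and targets
% collected column-wise as X = [x(1), ..., x(T)] and Y = [y(1), ..., y(T)]:
\mathbf{W}_{\mathrm{out}} = \mathbf{Y}\mathbf{X}^{\top}\big(\mathbf{X}\mathbf{X}^{\top} + \lambda\mathbf{I}\big)^{-1}
```

Because W_in and W stay fixed, fitting the network in this generic setting reduces to a single regularized least-squares solve, which is the usual reason reservoir computing is considered cheap to train.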
Acknowledgements
This work is supported by the National Natural Science Foundation of China under grant No. 61772136 and by the Fujian Engineering Research Center of Big Data Analysis and Processing.
Funding
This work was supported by the National Key Research and Development Program of China.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Huang, F., Zheng, W., Guo, W. et al. Estimating missing data for sparsely sensed time series with exogenous variables using bidirectional-feedback echo state networks. CCF Trans. Pervasive Comp. Interact. 5, 45–63 (2023). https://doi.org/10.1007/s42486-022-00112-7
DOI: https://doi.org/10.1007/s42486-022-00112-7