Abstract
Missing values in the data streams produced by wireless sensor networks (WSNs) are a prevalent issue. Common causes include network loss, sensor maintenance, and sensor failure. Recovering missing values from WSN time series is difficult, and although many recovery methods have been proposed, limitations remain. This work discusses the use of deep learning algorithms for time series prediction on WSN data generated by real-time sensor devices. It examines the deep learning techniques available for time series forecasting, analyzes the results obtained with each, and identifies the best hybrid combination, which contains a bidirectional LSTM layer for deeper examination of patterns in the data. The main motivation is to improve the working environment of WSN fields, automate various maintenance processes, provide an effective method of real-time data imputation in WSNs when data are scarce, and develop a method for effective forecasting. Another approach examined combines a CNN layer to find positional patterns in the data, an attention mechanism to focus on the relevant part of the sequence, an LSTM layer that makes it an encoder–decoder model, and a final dense layer that produces the output in the desired shape. The model is trained and tested on the Beijing Air Pollution PM2.5 dataset. To overcome the limited availability of data, the VLSW algorithm is used to generate a large sample of training data from the limited datasets available.
After studying the various models and their results under the hyperparameters tried, it is concluded that the encoder–decoder model with the LSTM attention mechanism, combined with VLSW, works best for imputing long runs of missing time series data. The other models studied and their results are also summarized. For practical, real-time data imputation, this work concludes that the CNN model with online training works best, as it takes less time and fewer resources to train. The results obtained are better than those of the existing solutions. The SSIM model with VLSW was 31% more efficient than other methods, while the CNN without VLSW was 76% more efficient than other methods.
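The abstract does not define VLSW. The sketch below assumes it refers to a variable-length sliding window, as in the SSIM work of Zhang et al. (2019), and illustrates how such a scheme could multiply training samples from a short, complete series. The function name `variable_length_windows`, its parameters, and the (left context, target, right context) sample layout are illustrative assumptions, not the chapter's implementation.

```python
from typing import List, Sequence, Tuple

# (left context, hidden target, right context) -- an assumed sample layout
Sample = Tuple[List[float], List[float], List[float]]


def variable_length_windows(
    series: Sequence[float],
    context_len: int,
    min_gap: int,
    max_gap: int,
    stride: int = 1,
) -> List[Sample]:
    """Slide windows with several gap lengths over a complete series.

    Each sample hides `gap` consecutive values (the imputation target)
    and keeps `context_len` observed values on each side as model input.
    """
    if context_len < 1 or min_gap < 1 or max_gap < min_gap or stride < 1:
        raise ValueError("invalid window parameters")

    samples: List[Sample] = []
    for gap in range(min_gap, max_gap + 1):
        window = 2 * context_len + gap
        # No windows are produced if the series is shorter than the window.
        for start in range(0, len(series) - window + 1, stride):
            gap_start = start + context_len
            gap_end = gap_start + gap
            left = list(series[start:gap_start])
            target = list(series[gap_start:gap_end])
            right = list(series[gap_end:start + window])
            samples.append((left, target, right))
    return samples


# Ten hypothetical readings yield fifteen samples for gap lengths 1 to 3.
readings = [float(v) for v in range(10)]
samples = variable_length_windows(readings, context_len=2, min_gap=1, max_gap=3)
```

Varying the gap length is what lets a single short series supply many distinct training pairs. A real pipeline would also have to skip windows that overlap genuinely missing values.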
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rani, S., Solanki, A. (2021). Data Imputation in Wireless Sensor Network Using Deep Learning Techniques. In: Khanna, A., Gupta, D., Pólkowski, Z., Bhattacharyya, S., Castillo, O. (eds) Data Analytics and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 54. Springer, Singapore. https://doi.org/10.1007/978-981-15-8335-3_44
DOI: https://doi.org/10.1007/978-981-15-8335-3_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8334-6
Online ISBN: 978-981-15-8335-3
eBook Packages: Intelligent Technologies and Robotics (R0)