Data Imputation in Wireless Sensor Network Using Deep Learning Techniques

Rani, Shweta; Solanki, Arun

doi:10.1007/978-981-15-8335-3_44

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 54)

Abstract

Missing values in the data streams produced by wireless sensor networks (WSNs) are a prevalent problem. Common causes include network loss, sensor maintenance, and sensor failure. Recovering missing values from WSN time series is difficult, and although many recovery methods have been proposed, limitations remain. This work discusses the use of deep learning algorithms for time series prediction on WSN data generated by real-time sensor devices. It examines the deep learning techniques available for time series forecasting, analyzes the results obtained with each, and identifies the best hybrid combination, which contains a bidirectional LSTM layer for a deeper examination of the patterns in the data. The main motivation is to improve the working environment of WSN fields, automate various maintenance processes, provide an effective method of data imputation for WSNs in real-time environments where data are scarce, and develop a method for effective forecasting. Another approach examined combines a CNN layer to find positional patterns in the data, an attention mechanism to focus on the relevant part of the sequence, an LSTM layer that makes the model an encoder–decoder, and a final dense layer that produces output in the desired shape. The model is trained and tested on the Beijing Air Pollution PM2.5 dataset. To overcome the limited availability of data, the VLSW algorithm is used to generate a large sample of training data from the limited datasets available.
After analyzing the various models and their results across the hyperparameters tried, it is concluded that the LSTM encoder–decoder model with an attention mechanism, together with VLSW, works best for imputing long gaps in time series data. The other models studied and their results are also summarized. For practical, real-time data imputation, this work concludes that the CNN model with online training works best, as it takes less time and fewer resources to train. The results obtained are better than those of existing solutions. The SSIM model with VLSW performed 31% more efficiently than other methods, while the CNN without VLSW performed 76% more efficiently than other methods.
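
The abstract does not describe how VLSW builds its training set, or even expand the acronym. The sketch below is therefore only an illustration, assuming VLSW refers to a variable-length sliding window: the window slides over the series as usual, but the input length is also varied, so a short series yields many more (input, target) pairs than a single fixed window would. The function name `vlsw_samples` and its parameters `min_len`, `max_len`, `target_len`, and `step` are hypothetical and do not come from the chapter.

```python
import numpy as np


def vlsw_samples(series, min_len, max_len, target_len, step=1):
    """Build (input, target) pairs with a variable-length sliding window.

    Hypothetical helper, not the authors' implementation. For every input
    length from min_len to max_len, a window slides over the series; the
    first part of the window is the model input and the next target_len
    points are the values to be predicted (imputed).
    """
    series = np.asarray(series, dtype=float)
    samples = []
    for input_len in range(min_len, max_len + 1):
        window = input_len + target_len
        for start in range(0, len(series) - window + 1, step):
            chunk = series[start:start + window]
            # Skip windows that already contain missing readings, so the
            # model is only trained on fully observed segments.
            if np.isnan(chunk).any():
                continue
            samples.append((chunk[:input_len], chunk[input_len:]))
    return samples


# Toy series standing in for one sensor channel (assumed data).
toy = np.arange(10.0)

# A single fixed input length versus a range of input lengths.
fixed = vlsw_samples(toy, min_len=4, max_len=4, target_len=2)
variable = vlsw_samples(toy, min_len=3, max_len=5, target_len=2)
print(len(fixed), len(variable))
```

On this ten-point toy series the fixed window produces 5 samples and the variable-length version produces 15. Because the resulting inputs have different lengths, a sequence model would typically need padding or per-length batching, a detail the abstract does not cover.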

Author information

Authors and Affiliations

Gautam Buddha University, Greater Noida, India
Shweta Rani & Arun Solanki

Corresponding author

Correspondence to Arun Solanki.

Editor information

Editors and Affiliations

Maharaja Agrasen Institute of Technology, New Delhi, India
Ashish Khanna
Maharaja Agrasen Institute of Technology, New Delhi, India
Deepak Gupta
Jan Wyzykowski University, Polkowice, Poland
Zdzisław Pólkowski
CHRIST (Deemed to be University), Bengaluru, India
Siddhartha Bhattacharyya
Tijuana Institute of Technology, Tijuana, Mexico
Oscar Castillo

About this paper

Cite this paper

Rani, S., Solanki, A. (2021). Data Imputation in Wireless Sensor Network Using Deep Learning Techniques. In: Khanna, A., Gupta, D., Pólkowski, Z., Bhattacharyya, S., Castillo, O. (eds) Data Analytics and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 54. Springer, Singapore. https://doi.org/10.1007/978-981-15-8335-3_44

DOI: https://doi.org/10.1007/978-981-15-8335-3_44
Published: 05 January 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8334-6
Online ISBN: 978-981-15-8335-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
