
Parallel spatio-temporal attention-based TCN for multivariate time series prediction

  • S.I.: Deep Social Computing
Neural Computing and Applications

Abstract

As industrial systems grow more complex and monitoring sensors for everything from surveillance to our health become more ubiquitous, multivariate time series prediction is taking an increasingly important place in the smooth running of our society. Recurrent neural networks (RNNs) with attention mechanisms, which help to extend the prediction window, are the current state of the art for this task. However, we argue that their vanishing gradients, short memories, and serial architecture make RNNs fundamentally unsuited to long-horizon forecasting with complex data. Temporal convolutional networks (TCNs) do not suffer from gradient problems, and they support parallel computation, making them a more appropriate choice. Additionally, they have longer memories than RNNs, albeit with some instability and efficiency problems. Hence, we propose a framework, called PSTA-TCN, that combines a parallel spatio-temporal attention mechanism, which extracts dynamic internal correlations, with stacked TCN backbones that extract features from different window sizes. The framework makes full use of parallel computation to dramatically reduce training times while substantially increasing accuracy, with stable prediction windows up to 13 times longer than the status quo.
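
The abstract describes two ingredients: stacked TCN backbones built from dilated causal convolutions, and a spatio-temporal attention mechanism whose spatial and temporal branches are computed in parallel. The following is a minimal PyTorch sketch of those two ideas, not the authors' PSTA-TCN implementation; the module names, layer sizes, and the way the attention weights are combined are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConvBlock(nn.Module):
    """One TCN-style residual block: left-padded (causal) dilated Conv1d + ReLU + skip."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.left_pad = (kernel_size - 1) * dilation   # pad only the past, never the future
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                              # x: (batch, channels, time)
        out = self.conv(F.pad(x, (self.left_pad, 0)))  # causal padding keeps output length equal to input
        return F.relu(out) + x                         # residual connection stabilises deep stacks

class ParallelSTAttention(nn.Module):
    """Spatial (across series) and temporal (across time steps) attention scored in parallel."""
    def __init__(self, n_series, n_steps):
        super().__init__()
        self.spatial_score = nn.Linear(n_steps, 1)     # one weight per input series
        self.temporal_score = nn.Linear(n_series, 1)   # one weight per time step

    def forward(self, x):                              # x: (batch, n_series, n_steps)
        s = torch.softmax(self.spatial_score(x), dim=1)                   # (batch, n_series, 1)
        t = torch.softmax(self.temporal_score(x.transpose(1, 2)), dim=1)  # (batch, n_steps, 1)
        return x * s * t.transpose(1, 2)               # reweight both axes; shape is unchanged

# Toy usage (hypothetical sizes): 8 sensor series, 64 past steps, 4 stacked blocks
# with exponentially growing dilation so the receptive field covers long histories.
x = torch.randn(32, 8, 64)
x = ParallelSTAttention(n_series=8, n_steps=64)(x)
backbone = nn.Sequential(*[CausalConvBlock(channels=8, dilation=2 ** i) for i in range(4)])
features = backbone(x)                                 # (32, 8, 64)

In the full framework the attention output would feed stacked TCN backbones trained end to end; the sketch only illustrates why the spatial and temporal branches, unlike the stages of a serial RNN, can be evaluated simultaneously.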

Acknowledgements

This work was supported by a grant from the National Natural Science Foundation of China (No. U1609211) and by the National Key Research and Development Project (2019YFB1705100). The corresponding author is Baiping Chen.

Author information

Corresponding author

Correspondence to Baiping Chen.

Ethics declarations

Conflict of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Fan, J., Zhang, K., Huang, Y. et al. Parallel spatio-temporal attention-based TCN for multivariate time series prediction. Neural Comput & Applic 35, 13109–13118 (2023). https://doi.org/10.1007/s00521-021-05958-z
