TCLN: A Transformer-based Conv-LSTM network for multivariate time series forecasting

Applied Intelligence

Abstract

Multivariate time series forecasting (MTSF) is important in many areas, such as industrial forecasting and traffic flow forecasting. Traditional forecasting models focus on the temporal features of the variables and extract the spatial and spatiotemporal features between variables only shallowly. To address these issues, this paper proposes a novel model based on the Transformer, the convolutional neural network (CNN), and the long short-term memory (LSTM) network. The model first extracts spatial feature vectors with the proposed Multi-kernel CNN. It then extracts temporal information with an Encoder layer consisting of a Transformer encoder layer and an LSTM network, which can also capture potential spatiotemporal correlations. Multiple Encoder layers are stacked to extract more feature information. Finally, the output is decoded by a Decoder layer composed of the ReLU activation function and a Linear layer. To further improve the model’s robustness, an autoregressive model is also integrated. In evaluation on four datasets, the proposed model achieves significant performance improvements over current benchmark methods for MTSF tasks. Further experiments demonstrate that the model can be used for long-horizon forecasting and achieves satisfactory results on the yield forecasting of test items (our private dataset, TIOB).
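As a rough illustration of two ideas named in the abstract, the sketch below applies several convolution kernels of different widths to the variables at each time step and adds a linear autoregressive term to a nonlinear output. It is only a sketch under stated assumptions: the abstract does not specify the kernel sizes, the axis the convolution runs along, or how the autoregressive output is merged, so the function names, the kernel values, and the additive combination are all hypothetical. The Transformer encoder, LSTM, and Decoder stages are omitted.

```python
from typing import List, Sequence

Window = List[List[float]]  # shape: [time steps][variables]


def conv_across_variables(row: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Valid 1-D cross-correlation of one time step's variables with one kernel."""
    k = len(kernel)
    return [
        sum(row[i + j] * kernel[j] for j in range(k))
        for i in range(len(row) - k + 1)
    ]


def multi_kernel_features(window: Window, kernels: Sequence[Sequence[float]]) -> Window:
    """Apply every kernel to every time step and concatenate the results.

    Assumption: kernels of several widths mix information from variable
    neighbourhoods of different sizes; the paper's exact layout may differ.
    """
    return [
        [value for kernel in kernels for value in conv_across_variables(row, kernel)]
        for row in window
    ]


def autoregressive_forecast(
    window: Window, weights: Sequence[float], bias: float = 0.0
) -> List[float]:
    """Linear AR term: weighted sum of the last len(weights) steps, per variable."""
    lags = window[-len(weights):]
    n_vars = len(window[0])
    return [
        bias + sum(w * step[v] for w, step in zip(weights, lags))
        for v in range(n_vars)
    ]


def combine(nonlinear: Sequence[float], linear: Sequence[float]) -> List[float]:
    """Assumed merge rule: nonlinear (neural) output plus linear AR output."""
    return [a + b for a, b in zip(nonlinear, linear)]


# Toy input: 3 time steps, 3 variables (illustrative values only).
window = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
kernels = [[1.0], [0.5, 0.5], [1.0, 0.0, -1.0]]  # widths 1, 2 and 3

features = multi_kernel_features(window, kernels)
# features[0] == [1.0, 2.0, 3.0, 1.5, 2.5, -2.0]

ar_part = autoregressive_forecast(window, weights=[0.5, 0.5])
# ar_part == [2.5, 3.5, 4.5]

# A stand-in for the neural network's output, which this sketch does not compute.
forecast = combine([0.0, 0.0, 0.0], ar_part)
```

In a trained model the kernel values and autoregressive weights would be learned rather than fixed; the toy numbers here only make the data flow easy to follow.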

Data Availability

The NASDAQ100 Stock dataset and the SML2010 dataset can be obtained from [12]. The Solar Energy dataset can be obtained from [24]. The TIOB dataset will not be shared because it involves the privacy of the factory.

References

  1. Prakhar K, Sountharrajan S, Suganya E, Karthiga M, Kumar S (2022) Effective stock price prediction using time series forecasting. In: 6th International Conference on Trends in Electronics and Informatics (ICOEI), pp 1636–1640

  2. Venkatachalam K, Trojovský P, Pamucar D, Bacanin N, Simic V (2023) DWFH: An improved data-driven deep weather forecasting hybrid model using transductive long short term memory (T-LSTM). Expert Syst Appl 213 (Part), 119270. https://doi.org/10.1016/j.eswa.2022.119270

  3. Guo S, Lin Y, Feng N, Song C, Wan H (2019) Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In: The thirty-third AAAI conference on artificial intelligence, pp 922–929. https://doi.org/10.1609/aaai.v33i01.3301922

  4. Gao H, Su H, Cai Y, Wu R, Hao Z, Xu Y, Wu W, Wang J, Li Z, Kan Z (2021) Trajectory prediction of cyclist based on dynamic bayesian network and long short-term memory model at unsignalized intersections. Sci China Inf Sci 64(7):172207. https://doi.org/10.1007/s11432-020-3071-8

  5. Shi H, Zhu J, Kuang M, Yuan X (2021) Cooperative prediction guidance law in target-attacker-defender scenario. Sci China Inf Sci 64(4):149201. https://doi.org/10.1007/s11432-018-9806-7

  6. Gefang D, Koop G, Poon A (2023) Forecasting using variational Bayesian inference in large vector autoregressions with hierarchical shrinkage. Int J Forecast 39(1):346–363

  7. Zhang B, Chan JCC, Cross JL (2020) Stochastic volatility models with ARMA innovations: An application to G7 inflation forecasts. Int J Forecast 36(4):1318–1328

  8. Khajavi H, Rastgoo A (2023) Improving the prediction of heating energy consumed at residential buildings using a combination of support vector regression and meta-heuristic algorithms. Energy 272:127069. https://doi.org/10.1016/j.energy.2023.127069

  9. Swathi T, Kasiviswanath N, Rao AA (2022) An optimal deep learning-based LSTM for stock price prediction using twitter sentiment analysis. Appl Intell 52(12):13675–13688. https://doi.org/10.1007/s10489-022-03175-2

  10. Bengio Y, Simard PY, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166. https://doi.org/10.1109/72.279181

  11. Xiao Y, Yin H, Zhang Y, Qi H, Zhang Y, Liu Z (2021) A dual-stage attention-based Conv-LSTM network for spatio-temporal correlation and multivariate time series prediction. Int J Intell Syst 36(5):2036–2057. https://doi.org/10.1002/int.22370

  12. Qin Y, Song D, Chen H, Cheng W, Jiang G, Cottrell GW (2017) A dual-stage attention-based recurrent neural network for time series prediction. In: IJCAI, pp 2627–2633. https://doi.org/10.24963/ijcai.2017/366

  13. Fu E, Zhang Y, Yang F, Wang S (2022) Temporal self-attention-based Conv-LSTM network for multivariate time series prediction. Neurocomput 501:162–173. https://doi.org/10.1016/j.neucom.2022.06.014

  14. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Proc Syst 30

  15. Nascimento EGS, de Melo TAC, Moreira DM (2023) A transformer-based deep neural network with wavelet transform for forecasting wind speed and wind energy. Energy 278:127678. https://doi.org/10.1016/j.energy.2023.127678

  16. Zerveas G, Jayaraman S, Patel D, Bhamidipaty A, Eickhoff C (2021) A transformer-based framework for multivariate time series representation learning. In: KDD ’21: The 27th ACM SIGKDD conference on knowledge discovery and data mining, pp 2114–2124. https://doi.org/10.1145/3447548.3467401

  17. Fu X, Guo Q, Sun H (2020) Statistical machine learning model for stochastic optimal planning of distribution networks considering a dynamic correlation and dimension reduction. IEEE Trans Smart Grid 11(4):2904–2917. https://doi.org/10.1109/TSG.2020.2974021

  18. Fu X (2022) Statistical machine learning model for capacitor planning considering uncertainties in photovoltaic power. Protect Contr Mod Power Syst 7(1):5. https://doi.org/10.1186/s41601-022-00228-z

  19. Pan S, Long S, Wang Y, Xie Y (2023) Nonlinear asset pricing in Chinese stock market: A deep learning approach. Int Rev Fin Anal 87:102627. https://doi.org/10.1016/j.irfa.2023.102627

  20. Mohimont L, Chemchem A, Alin F, Krajecki M, Steffenel LA (2021) Convolutional neural networks and temporal CNNs for COVID-19 forecasting in France. Appl Intell 51(12):8784–8809. https://doi.org/10.1007/s10489-021-02359-6

  21. Banerjee T, Sinha S, Choudhury P (2022) Long term and short term forecasting of horticultural produce based on the LSTM network model. Appl Intell 52(8):9117–9147. https://doi.org/10.1007/s10489-021-02845-x

  22. Li G, Zhong X (2023) Parking demand forecasting based on improved complete ensemble empirical mode decomposition and GRU model. Eng Appl Artif Intell 119:105717. https://doi.org/10.1016/j.engappai.2022.105717

  23. Xu W, Peng H, Zeng X, Zhou F, Tian X, Peng X (2019) A hybrid modelling method for time series forecasting based on a linear regression model and deep learning. Appl Intell 49(8):3002–3015. https://doi.org/10.1007/s10489-019-01426-3

  24. Lai G, Chang W, Yang Y, Liu H (2018) Modeling long- and short-term temporal patterns with deep neural networks. In: The 41st international ACM SIGIR conference on research & development in information retrieval, pp 95–104. https://doi.org/10.1145/3209978.3210006

  25. Yang Y, Lu J (2022) Foreformer: an enhanced transformer-based framework for multivariate time series forecasting. Appl Intell 1–20

  26. Chen Z, Chen D, Zhang X, Yuan Z, Cheng X (2022) Learning graph structures with transformer for multivariate time-series anomaly detection in IoT. IEEE Internet Things J 9(12):9179–9189. https://doi.org/10.1109/JIOT.2021.3100509

  27. Cao D, Wang Y, Duan J, Zhang C, Zhu X, Huang C, Tong Y, Xu B, Bai J, Tong J et al (2020) Spectral temporal graph neural network for multivariate time-series forecasting. Adv Neural Inf Proc Syst 33:17766–17778

  28. Shang C, Chen J, Bi J (2021) Discrete graph structure learning for forecasting multiple time series. In: 9th international conference on learning representations. https://openreview.net/forum?id=WEHSlH5mOk

  29. Wu Z, Pan S, Long G, Jiang J, Chang X, Zhang C (2020) Connecting the dots: multivariate time series forecasting with graph neural networks. In: KDD ’20: The 26th ACM SIGKDD conference on knowledge discovery and data mining, pp 753–763. https://doi.org/10.1145/3394486.3403118

  30. Fu X, Zhou Y (2023) Collaborative optimization of PV greenhouses and clean energy systems in rural areas. IEEE Trans Sustain Energy 14(1):642–656. https://doi.org/10.1109/TSTE.2022.3223684

  31. Huang X, Tang J, Yang X, Xiong L (2022) A time-dependent attention convolutional LSTM method for traffic flow prediction. Appl Intell 52(15):17371–17386. https://doi.org/10.1007/s10489-022-03324-7

  32. Ren Q, Li Y, Liu Y (2023) Transformer-enhanced periodic temporal convolution network for long short-term traffic flow forecasting. Expert Syst Appl 227:120203. https://doi.org/10.1016/j.eswa.2023.120203

  33. Szegedy C, Liu W, Jia Y, Sermanet P, Reed SE, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: IEEE conference on computer vision and pattern recognition, pp 1–9. https://doi.org/10.1109/CVPR.2015.7298594

  34. Shih S-Y, Sun F-K, Lee H-y (2019) Temporal pattern attention for multivariate time series forecasting. Mach Learn 108:1421–1441

  35. Cheng Q, Chen Y, Xiao Y, Yin H, Liu W (2022) A dual-stage attention-based Bi-LSTM network for multivariate time series prediction. J Supercomput 78(14):16214–16235. https://doi.org/10.1007/s11227-022-04506-3

  36. Wang Q, Chen L, Zhao J, Wang W (2020) A deep granular network with adaptive unequal-length granulation strategy for long-term time series forecasting and its industrial applications. Artif Intell Rev 53(7):5353–5381. https://doi.org/10.1007/s10462-020-09822-9

Funding

This work was supported by the National Natural Science Foundation of China (No. 62173317) and the Key Research and Development Program of Anhui (No. 202104a05020064).

Author information

Contributions

Shusen Ma established the model, conducted most of the experiments, and wrote the paper. Tianhao Zhang implemented the initial model and conducted some supplementary experiments. Yun-Bo Zhao supervised the work. Yu Kang and Peng Bai participated in the discussion of the work.

Corresponding author

Correspondence to Yun-Bo Zhao.

Ethics declarations

Ethics approval

Not applicable.

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, S., Zhang, T., Zhao, YB. et al. TCLN: A Transformer-based Conv-LSTM network for multivariate time series forecasting. Appl Intell 53, 28401–28417 (2023). https://doi.org/10.1007/s10489-023-04980-z

  • DOI: https://doi.org/10.1007/s10489-023-04980-z
