Spatio-temporal neural networks for space-time data modeling and relation discovery


We introduce a dynamical spatio-temporal model, formalized as a recurrent neural network, for modeling time series of spatial processes, i.e., series of observations that share temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of this model, corresponding to different prior hypotheses about the spatial relations between the series. The model is used for the tasks of forecasting and data imputation. It is evaluated and compared to state-of-the-art baselines on a variety of forecasting and imputation problems representative of different application areas: epidemiology, geo-spatial statistics, and car traffic prediction. The experiments also show that the approach is able to learn relevant spatial relations without prior information.
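The model family described above can be sketched roughly as follows. This is a minimal illustration and not the authors' exact formulation: the relation matrix `W`, the transition and decoder weights, the `tanh` nonlinearity, and all shapes are assumptions made for the example.

```python
import numpy as np

# Minimal sketch (not the paper's exact equations): each of n series keeps
# a latent state z_i; a shared transition mixes the states of spatially
# related series through a relation matrix W, and a decoder maps latent
# states back to observations. All names and dimensions are illustrative.

rng = np.random.default_rng(0)
n, dz, dx = 4, 3, 2                  # number of series, latent dim, obs dim

Z = rng.normal(size=(n, dz))         # latent states, one row per series
W = 0.1 * rng.normal(size=(n, n))    # spatial relation weights (learnable)
A = 0.1 * rng.normal(size=(dz, dz))  # shared intra-series transition
D = rng.normal(size=(dz, dx))        # linear decoder weights

def step(Z):
    """One latent transition: own dynamics plus spatially aggregated states."""
    return np.tanh(Z @ A + W @ Z)

def decode(Z):
    """Predict observations from the latent states."""
    return Z @ D

# Forecasting: roll the latent dynamics forward, decoding at each step.
preds = []
for _ in range(3):
    Z = step(Z)
    preds.append(decode(Z))
preds = np.stack(preds)              # shape (3, n, dx)
```

In the abstract's terms, different assumptions about `W` (fixed from prior spatial knowledge, or learned jointly with the dynamics) would correspond to the different model variants, and a learned `W` is what allows spatial relations to be discovered without prior information.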




  Notes

  1.

    We assume that all the series have the same dimensionality and length. This is often the case for spatio-temporal problems; otherwise, this restriction can easily be removed.

  2.

    In the experiments, we used Nesterov’s Accelerated Gradient (NAG) method [36].

  3.

    Code available at

  4.

    We also performed tests with LSTM and obtained results similar to those with GRU.
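Footnote 2 above refers to Nesterov’s Accelerated Gradient [36]. For reference, here is a minimal sketch of the NAG update in the momentum form of Sutskever et al., applied to a toy quadratic; the objective, step size, and momentum values are illustrative assumptions, not the paper's settings.

```python
# Illustrative NAG step: the gradient is evaluated at the look-ahead point
# x + mu * v rather than at x, which is what distinguishes NAG from
# classical momentum. Toy objective: f(x) = x^2, gradient 2x.
def nag(x0, lr=0.1, mu=0.9, steps=100):
    x, v = x0, 0.0
    for _ in range(steps):
        g = 2.0 * (x + mu * v)   # gradient at the look-ahead point
        v = mu * v - lr * g      # velocity update with momentum mu
        x = x + v                # parameter update
    return x

x_min = nag(5.0)                 # converges toward the minimum at 0
```

Evaluating the gradient at the look-ahead point typically damps the oscillations of plain momentum, which is why NAG is a common choice for training recurrent models.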


  References

  1.

    Bahadori MT, Yu QR, Liu Y (2014) Fast multivariate spatio-temporal analysis via low rank tensor learning. In: Advances in neural information processing systems, pp 3491–3499

  2.

    Bańbura M, Modugno M (2014) Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data. J Appl Econom 29(1):133–160

  3.

    Bayer J, Osendorfer C (2014) Learning stochastic recurrent networks. arXiv preprint arXiv:1411.7610

  4.

    Ben Taieb S, Hyndman R (2014) Boosting multi-step autoregressive forecasts. In: Proceedings of the 31st international conference on machine learning, pp 109–117

  5.

    Bengio Y (2008) Neural net language models. Scholarpedia 3(1):3881

  6.

    Ceci M, Corizzo R, Fumarola F, Malerba D, Rashkovska A (2017) Predictive modeling of PV energy production: how to set up the learning task for a better prediction? IEEE Trans Ind Inf 13(3):956–966

  7.

    Che Z, Purushotham S, Cho K, Sontag D, Liu Y (2016) Recurrent neural networks for multivariate time series with missing values. arXiv preprint arXiv:1606.01865

  8.

    Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv preprint arXiv:1406.1078

  9.

    Chung J, Kastner K, Dinh L, Goel K, Courville AC, Bengio Y (2015a) A recurrent latent variable model for sequential data. In: Advances in neural information processing systems, pp 2962–2970

  10.

    Chung J, Kastner K, Dinh L, Goel K, Courville AC, Bengio Y (2015b) A recurrent latent variable model for sequential data. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in neural information processing systems, vol 28. Curran Associates, Inc., Red Hook, pp 2980–2988

  11.

    Connor JT, Martin RD, Atlas LE (1994) Recurrent neural networks and robust time series prediction. IEEE Trans Neural Netw 5(2):240–254

  12.

    Cressie NAC, Wikle CK (2011) Statistics for spatio-temporal data, Wiley series in probability and statistics. Wiley, Hoboken

  13.

    de Bezenac E, Pajot A, Gallinari P (2017) Deep learning for physical processes: incorporating prior scientific knowledge. In: Proceedings of the international conference on learning representations (ICLR)

  14.

    De Gooijer JG, Hyndman RJ (2006) 25 years of time series forecasting. Int J Forecast 22(3):443–473

  15.

    Denton EL, Birodkar V (2017) Unsupervised learning of disentangled representations from video. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30. Curran Associates, Inc., Red Hook, pp 4417–4426

  16.

    Dornhege G, Blankertz B, Krauledat M, Losch F, Curio G, Müller K-R (2005) Optimizing spatio-temporal filters for improving brain–computer interfacing. In: Advances in neural information processing systems (NIPS)

  17.

    Ganeshapillai G, Guttag J, Lo A (2013) Learning connections in financial time series. In: Proceedings of the 30th international conference on machine learning (ICML-13), pp 109–117

  18.

    Graves A, Mohamed A-R, Hinton G (2013) Speech recognition with deep recurrent neural networks. In: IEEE ICASSP

  19.

    Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  20.

    Honaker J, King G, Blackwell M (2011) Amelia II: a program for missing data. J Stat Softw 45(7):1–47

  21.

    Kalchbrenner N, Oord Avd, Simonyan K, Danihelka I, Vinyals O, Graves A, Kavukcuoglu K (2017) Video pixel networks. In: Proceedings of the 34th ICML-17

  22.

    Kingma DP, Welling M (2013) Auto-encoding variational bayes. In: Proceedings of the 2nd international conference on learning representations (ICLR)

  23.

    Koppula H, Saxena A (2013) Learning spatio-temporal structure from RGB-d videos for human activity detection and anticipation. In: Proceedings of ICML

  24.

    Krishnan RG, Shalit U, Sontag D (2015) Deep Kalman filters. arXiv preprint arXiv:1511.05121

  25.

    Li Y, Tarlow D, Brockschmidt M, Zemel R (2015) Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493

  26.

    Mirowski P, LeCun Y (2009) Dynamic factor graphs for time series modeling. In: Buntine W, Grobelnik M, Mladenić D, Shawe-Taylor J (eds) Machine learning and knowledge discovery in databases. Springer, Berlin, Heidelberg, pp 128–143

  27.

    Muller KR, Smola AJ, Ratsch G, Scholkopf B, Kohlmorgen J, Vapnik V (1999) Using support vector machines for time series prediction. Advances in kernel methods—support vector learning

  28.

    Oord AV, Kalchbrenner N, Kavukcuoglu K (2016) Pixel recurrent neural networks. In: Balcan MF, Weinberger KQ (eds) Proceedings of the 33rd international conference on machine learning, vol 48 of Proceedings of machine learning research. PMLR, pp 1747–1756

  29.

    Ren Y, Wu Y (2014) Convolutional deep belief networks for feature extraction of EEG signal. In: 2014 international joint conference on neural networks (IJCNN). IEEE, pp 2850–2853

  30.

    Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. CRC Press, Boca Raton

  31.

    Shang J, Zheng Y, Tong W, Chang E, Yu Y (2014) Inferring gas consumption and pollution emission of vehicles throughout a city. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1027–1036

  32.

    Shi W, Zhu Y, Philip SY, Huang T, Wang C, Mao Y, Chen Y (2016) Temporal dynamic matrix factorization for missing data prediction in large scale coevolving time series. IEEE Access 4:6719–6732

  33.

    Shi X, Chen Z, Wang H, Yeung D-Y, Wong W-k, Woo W-c (2015) Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in neural information processing systems, vol 28. Curran Associates, Inc., Red Hook, pp 802–810

  34.

    Song Y, Liu M, Tang S, Mao X (2012) Time series matrix factorization prediction of internet traffic matrices. In: 2012 IEEE 37th conference on local computer networks (LCN). IEEE, pp 284–287

  35.

    Srivastava N, Mansimov E, Salakhutdinov R (2015) Unsupervised learning of video representations using LSTMs. In: Blei D, Bach F (eds) Proceedings of the 32nd ICML-15

  36.

    Sutskever I, Martens J, Dahl G, Hinton G (2013) On the importance of initialization and momentum in deep learning. In: Proceedings of the 30th ICML

  37.

    Sutskever I, Martens J, Hinton GE (2011) Generating text with recurrent neural networks. In: Proceedings of the 28th international conference on machine learning, ICML 2011

  38.

    Wikle CK (2015) Modern perspectives on statistics for spatio-temporal data. Wiley Interdiscip Rev Comput Stat 7(1):86–98

  39.

    Wikle CK, Hooten MB (2010) A general science-based framework for dynamical spatio-temporal models. Test 19(3):417–451

  40.

    Yuan J, Zheng Y, Xie X, Sun G (2011) Driving with knowledge from the physical world. In: Proceedings of the 17th ACM SIGKDD. ACM, pp 316–324

  41.

    Yuan J, Zheng Y, Zhang C, Xie W, Xie X, Sun G, Huang Y (2010) T-drive: driving directions based on taxi trajectories. In: Proceedings of the 18th SIGSPATIAL. ACM, pp 99–108



Locust Project ANR-15-CE23-0027-01, funded by Agence Nationale de la Recherche.

Author information



Corresponding author

Correspondence to Edouard Delasalles.



About this article


Cite this article

Delasalles, E., Ziat, A., Denoyer, L. et al. Spatio-temporal neural networks for space-time data modeling and relation discovery. Knowl Inf Syst 61, 1241–1267 (2019).



Keywords

  • Time series
  • Spatio-temporal
  • Forecasting
  • Data imputation
  • Deep learning
  • Neural networks