Neural Additive Vector Autoregression Models for Causal Discovery in Time Series

  • Conference paper
Discovery Science (DS 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12986)


Causal structure discovery in complex dynamical systems is an important challenge for many scientific domains. Although data from (interventional) experiments is usually limited, large amounts of observational time series data are often available. Current methods that learn causal structure from time series often assume linear relationships. Hence, they may fail in realistic settings that contain nonlinear relations between the variables. We propose Neural Additive Vector Autoregression (NAVAR) models, a neural approach to causal structure learning that can discover nonlinear relationships. We train deep neural networks that extract the (additive) Granger causal influences from the time evolution in multivariate time series. The method achieves state-of-the-art results on various benchmark data sets for causal discovery, while providing clear interpretations of the mapped causal relations.
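
The additive idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it replaces the deep neural networks with a cubic polynomial basis fitted by least squares, uses a single lag, and scores each candidate link by the standard deviation of its fitted contribution. The toy system, the function names (`simulate`, `basis`, `additive_influences`) and the scoring rule are all assumptions made for illustration.

```python
import numpy as np


def simulate(T=2000, seed=0):
    """Toy system (an assumption): variable 0 drives variable 1 nonlinearly."""
    rng = np.random.default_rng(seed)
    x = np.zeros((T, 3))
    for t in range(1, T):
        x[t, 0] = 0.5 * x[t - 1, 0] + rng.normal(scale=0.5)
        x[t, 1] = np.tanh(2.0 * x[t - 1, 0]) + rng.normal(scale=0.5)
        x[t, 2] = 0.5 * x[t - 1, 2] + rng.normal(scale=0.5)
    return x


def basis(v):
    """Cubic polynomial features of one lagged variable.

    Stand-in for the per-variable neural network of an additive model.
    """
    return np.stack([v, v**2, v**3], axis=1)


def additive_influences(x):
    """Fit x[t, j] ~ bias_j + sum_i f_ij(x[t-1, i]) and score each link.

    Returns an (N, N) array where entry [i, j] is the standard deviation
    of the fitted contribution of variable i to variable j (hypothetical
    scoring rule chosen for this sketch).
    """
    T, N = x.shape
    past, target = x[:-1], x[1:]
    feats = [basis(past[:, i]) for i in range(N)]
    design = np.hstack(feats + [np.ones((T - 1, 1))])
    # One least-squares fit per target variable, solved jointly.
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    scores = np.zeros((N, N))
    for i in range(N):
        # Contribution of variable i to every target j, shape (T-1, N).
        contrib = feats[i] @ coef[3 * i:3 * i + 3, :]
        scores[i] = contrib.std(axis=0)
    return scores


if __name__ == "__main__":
    np.set_printoptions(precision=2, suppress=True)
    print(additive_influences(simulate()))
```

Because each source variable enters the prediction through its own separate function, its contribution can be read off in isolation. In this toy setting the entry for the link from variable 0 to variable 1 should stand out against the entries for links that do not exist in the simulated system.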

  1. Appendices and code can be found at


This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 813114.

Corresponding author

Correspondence to Bart Bussmann.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Bussmann, B., Nys, J., Latré, S. (2021). Neural Additive Vector Autoregression Models for Causal Discovery in Time Series. In: Soares, C., Torgo, L. (eds) Discovery Science. DS 2021. Lecture Notes in Computer Science, vol 12986. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88941-8

  • Online ISBN: 978-3-030-88942-5

  • eBook Packages: Computer Science, Computer Science (R0)