
SPADE4: Sparsity and Delay Embedding Based Forecasting of Epidemics

Bulletin of Mathematical Biology



Predicting the evolution of diseases is challenging, especially when data are scarce and incomplete. The most popular tools for modelling and predicting infectious disease epidemics are compartmental models. They stratify the population into compartments according to health status and model the dynamics of these compartments using dynamical systems. However, these predefined systems may not capture the true dynamics of the epidemic due to the complexity of disease transmission and human interactions. To overcome this drawback, we propose Sparsity and Delay Embedding based Forecasting (SPADE4) for predicting epidemics. SPADE4 predicts the future trajectory of an observable variable without knowledge of the other variables or the underlying system. We use a random features model with sparse regression to handle the data scarcity issue and employ Takens' delay embedding theorem to capture the nature of the underlying system from the observed variable. We show that our approach outperforms compartmental models when applied to both simulated and real data.
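As a rough illustration of the ingredients named in the abstract, the sketch below combines a Takens-style delay embedding of a single observed variable with a random features model and a hard-thresholding sparse regression on a toy logistic epidemic wave. The dimensions, thresholds, and the specific thresholding scheme are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def delay_embed(y, dim, tau=1):
    # Takens-style delay vectors: row t is [y_t, y_{t-tau}, ..., y_{t-(dim-1)*tau}]
    n = len(y) - (dim - 1) * tau
    cols = [y[i : i + n] for i in range(0, dim * tau, tau)]
    return np.stack(cols[::-1], axis=1)

# Toy observable: a single logistic epidemic wave (rescaled cumulative cases)
t = np.linspace(0.0, 10.0, 200)
y = 1.0 / (1.0 + np.exp(-(t - 5.0)))
dy = np.gradient(y, t)          # target: rate of change of the observable

dim = 5
X = delay_embed(y, dim)         # (196, 5) delay vectors
target = dy[dim - 1:]           # align targets with the embedded rows

# Random Fourier-type features of the delay vectors
N = 300
W = rng.normal(size=(dim, N))
b = rng.uniform(0.0, 2.0 * np.pi, size=N)
Phi = np.sin(X @ W + b)

# Sparse regression: ridge solve, hard-threshold small coefficients, refit
lam = 1e-6
c = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ target)
keep = np.abs(c) > 1e-3 * np.abs(c).max()
c_sparse = np.zeros(N)
c_sparse[keep] = np.linalg.lstsq(Phi[:, keep], target, rcond=None)[0]

rmse = float(np.sqrt(np.mean((Phi @ c_sparse - target) ** 2)))
print(f"kept {keep.sum()} of {N} features, train RMSE = {rmse:.2e}")
```

In SPADE4 the fitted rate of change of the observable is then used to forecast the next few days of the trajectory; the sketch above only reports the in-sample fit of that rate.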



Data Availability

The code is available at and data sources are:

  • COVID-19 in Canada:
  • Ebola in Guinea:
  • Zika virus in Girardot: Rojas et al. (2016)
  • Influenza A/H7N9 in China:

References
  • Althaus CL (2014) Estimating the reproduction number of Ebola virus (EBOV) during the 2014 outbreak in West Africa. PLoS Curr 6

  • Ayed I, de Bézenac E, Pajot A, Brajard J, Gallinari P (2019) Learning dynamical systems from partial observations. arXiv preprint arXiv:1902.11136

  • Bhattacharya K, Hosseini B, Kovachki NB, Stuart AM (2020) Model reduction and neural networks for parametric PDEs. arXiv preprint arXiv:2005.03180

  • Blum MG, Tran VC (2010) HIV with contact tracing: a case study in approximate Bayesian computation. Biostatistics 11(4):644–660

  • Broomhead DS, King GP (1986) Extracting qualitative dynamics from experimental data. Physica D 20(2–3):217–236

  • Brunton SL, Proctor JL, Kutz JN (2016) Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc Natl Acad Sci 113(15):3932–3937


  • Brunton SL, Brunton BW, Proctor JL, Kaiser E, Kutz JN (2017) Chaos as an intermittently forced linear system. Nat Commun 8(1):1–9


  • Cauchemez S, Ferguson NM (2008) Likelihood-based estimation of continuous-time epidemic models from time-series data: application to measles transmission in London. J R Soc Interface 5(25):885–897


  • Champion KP, Brunton SL, Kutz JN (2019) Discovery of nonlinear multiscale systems: Sampling strategies and embeddings. SIAM J Appl Dyn Syst 18(1):312–333


  • Cramer EY, Ray EL, Lopez VK, Bracher J, Brennen A, Castro Rivadeneira AJ, Gerding A, Gneiting T, House KH, Huang Y et al (2022) Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States. Proc Natl Acad Sci 119(15):2113561119


  • Dukic V, Lopes HF, Polson NG (2012) Tracking epidemics with Google flu trends data and a state-space SEIR model. J Am Stat Assoc 107(500):1410–1426


  • Girardi P, Gaetan C (2023) An SEIR model with time-varying coefficients for analyzing the SARS-CoV-2 Epidemic. Risk Anal 43(1):144–155


  • González-García R, Rico-Martínez R, Kevrekidis IG (1998) Identification of distributed parameter systems: A neural net based approach. Comput Chem Eng 22:965–968


  • Goyal P, Benner P (2022) Discovery of nonlinear dynamical systems using a Runge-Kutta inspired dictionary-based sparse regression approach. Proc R Soc A 478(2262):20210883


  • Hashemi A, Schaeffer H, Shi R, Topcu U, Tran G, Ward R (2023) Generalization bounds for sparse random feature expansions. Appl Comput Harmon Anal 62:310–330


  • Ho LST, Xu J, Crawford FW, Minin VN, Suchard MA (2018) Birth/birth-death processes and their computable transition probabilities with biological applications. J Math Biol 76(4):911–944


  • Ho LST, Crawford FW, Suchard MA (2018) Direct likelihood-based inference for discretely observed stochastic compartmental models of infectious disease. Ann Appl Stat 12(3):1993–2021


  • Huke J (2006) Embedding nonlinear dynamical systems: A guide to Takens’ theorem

  • Jacot A, Simsek B, Spadaro F, Hongler C, Gabriel F (2020) Implicit regularization of random feature models. In: International Conference on Machine Learning, pp 4631–4640. PMLR

  • Juang J-N, Pappa RS (1985) An eigensystem realization algorithm for modal parameter identification and model reduction. J Guid Control Dyn 8(5):620–627


  • Kaiser E, Kutz JN, Brunton SL (2018) Sparse identification of nonlinear dynamics for model predictive control in the low-data limit. Proc R Soc A 474(2219):20180335


  • Kamb M, Kaiser E, Brunton SL, Kutz JN (2020) Time-delay observables for Koopman: Theory and applications. SIAM J Appl Dyn Syst 19(2):886–917


  • Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans Neural Netw 9(5):987–1000


  • Le Clainche S, Vega JM (2017) Higher order dynamic mode decomposition. SIAM J Appl Dyn Syst 16(2):882–925


  • Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K, Stuart A, Anandkumar A (2020) Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895

  • Lin AT, Eckhardt D, Martin R, Osher S, Wong AS (2022) Parameter inference of time series by delay embeddings and learning differentiable operators. arXiv preprint arXiv:2203.06269

  • Lu L, Jin P, Karniadakis GE (2019) DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193

  • Lu L, Meng X, Mao Z, Karniadakis GE (2021) DeepXDE: A deep learning library for solving differential equations. SIAM Rev 63(1):208–228


  • Lusch B, Kutz JN, Brunton SL (2018) Deep learning for universal linear embeddings of nonlinear dynamics. Nat Commun 9(1):1–10


  • Mangan NM, Kutz JN, Brunton SL, Proctor JL (2017) Model selection for dynamical systems via sparse regression and information criteria. Proc R Soc A: Math, Phys Eng Sci 473(2204):20170009


  • Narendra KS, Parthasarathy K (1992) Neural networks and dynamical systems. Int J Approx Reason 6(2):109–131


  • Nelsen NH, Stuart AM (2021) The random feature model for input-output maps between Banach spaces. SIAM J Sci Comput 43(5):3212–3243


  • Qin T, Wu K, Xiu D (2019) Data driven governing equations approximation using deep neural networks. J Comput Phys 395:620–635


  • Rahimi A, Recht B (2007) Random features for large-scale kernel machines. In: NIPS, vol 3, p 5

  • Rahimi A, Recht B (2008a) Uniform approximation of functions with random bases. In: 2008 46th Annual Allerton Conference on Communication, Control, and Computing, IEEE, pp 555–561

  • Rahimi A, Recht B (2008b) Weighted sums of random kitchen sinks: replacing minimization with randomization in learning. In: NIPS, pp 1313–1320

  • Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707


  • Richardson N, Schaeffer H, Tran G (2022) SRMD: Sparse random mode decomposition. arXiv preprint arXiv:2204.06108

  • Rojas DP, Dean NE, Yang Y, Kenah E, Quintero J, Tomasi S, Ramirez EL, Kelly Y, Castro C, Carrasquilla G et al (2016) The epidemiology and transmissibility of Zika virus in Girardot and San Andres island, Colombia, September 2015 to January 2016. Eurosurveillance 21(28):30283


  • Rudi A, Rosasco L (2017) Generalization properties of learning with random features. In: NIPS, pp 3215–3225

  • Rudy SH, Brunton SL, Proctor JL, Kutz JN (2017) Data-driven discovery of partial differential equations. Sci Adv 3(4):1602614


  • Saha E, Schaeffer H, Tran G (2022) HARFE: Hard-ridge random feature expansion. arXiv preprint arXiv:2202.02877

  • Schaeffer H (2017) Learning partial differential equations via data discovery and sparse optimization. Proc R Soc A: Math, Phys Eng Sci 473(2197):20160446


  • Schaeffer H, Tran G, Ward R (2018) Extracting sparse high-dimensional dynamics from limited data. SIAM J Appl Math 78(6):3279–3295


  • Sitzmann V, Martel J, Bergman A, Lindell D, Wetzstein G (2020) Implicit neural representations with periodic activation functions. Adv Neural Inf Process Syst 33:7462–7473


  • Smirnova A, deCamp L, Chowell G (2019) Forecasting epidemics through nonparametric estimation of time-dependent transmission rates using the SEIR model. Bull Math Biol 81:4343–4365


  • Su W-H, Chou C-S, Xiu D (2021) Deep learning of biological models from data: applications to ODE models. Bull Math Biol 83(3):1–19


  • Takens F (2006) Detecting strange attractors in turbulence. In: Dynamical Systems and Turbulence, Warwick 1980: Proceedings of a Symposium Held at the University of Warwick 1979/80, Springer, pp 366–381

  • Tran G, Ward R (2017) Exact recovery of chaotic systems from highly corrupted data. Multiscale Model Simul 15(3):1108–1129


  • Uribarri G, Mindlin GB (2022) Dynamical time series embeddings in recurrent neural networks. Chaos, Solitons Fractals 154:111612


  • Vlachas PR, Byeon W, Wan ZY, Sapsis TP, Koumoutsakos P (2018) Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks. Proc R Soc A: Math, Phys Eng Sci 474(2213):20170844


  • Weinan E (2017) A proposal on machine learning via dynamical systems. Commun Math Stat 1(5):1–11


  • Weinan E, Ma C, Wu L (2019) A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics. Sci China Math

  • Xie Y, Shi R, Schaeffer H, Ward R (2022) SHRIMP: Sparser random feature models via iterative magnitude pruning. In: Mathematical and Scientific Machine Learning, pp 303–318. PMLR

  • Yang Z, Bai Y, Mei S (2021) Exact gap between generalization error and uniform convergence in random feature models. In: International Conference on Machine Learning, pp 11704–11715. PMLR

  • Zou D, Wang L, Xu P, Chen J, Zhang W, Gu Q (2020) Epidemic model guided machine learning for COVID-19 forecasts in the United States. medRxiv



Funding

LSTH was supported by the Canada Research Chairs program, the NSERC Discovery Grant RGPIN-2018-05447, and the NSERC Discovery Launch Supplement DGECR-2018-00181. E. Saha and G. Tran were supported by the NSERC Discovery Grant and the NSERC Discovery Launch Supplement.

Author information



Corresponding author

Correspondence to Lam Si Tung Ho.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A Results on Simulated Data with \(2\%\) Noise


See Fig. 13.

Fig. 13

Results on simulated data with \(2\%\) noise added to the input data. First row: noisy training datasets covering days 0 to 81 and days 0 to 125 of a 180-day period (top left). The next two panels show the predicted values of the infectious variable I(t) over the following seven days using SPADE4 (blue), the SEIR model (magenta), and the S\(\mu \)EIR model (green) for those two training datasets, versus the ground truth (black). Second row: predictions of I(t) for the seven days around the peak of the wave using SPADE4 (blue), the SEIR model (magenta), and the S\(\mu \)EIR model (green) versus the ground truth (black) (Color figure online)



Cite this article

Saha, E., Ho, L.S.T. & Tran, G. SPADE4: Sparsity and Delay Embedding Based Forecasting of Epidemics. Bull Math Biol 85, 71 (2023).
