SPADE4: Sparsity and Delay Embedding Based Forecasting of Epidemics

Saha, Esha; Ho, Lam Si Tung; Tran, Giang

doi:10.1007/s11538-023-01174-z

Original Article
Published: 19 June 2023

Volume 85, article number 71 (2023)

Bulletin of Mathematical Biology

Abstract

Predicting the evolution of diseases is challenging, especially when data are scarce and incomplete. The most popular tools for modelling and predicting infectious disease epidemics are compartmental models. They stratify the population into compartments according to health status and model the dynamics of these compartments using dynamical systems. However, these predefined systems may not capture the true dynamics of the epidemic due to the complexity of disease transmission and human interactions. To overcome this drawback, we propose Sparsity and Delay Embedding based Forecasting (SPADE4) for predicting epidemics. SPADE4 predicts the future trajectory of an observable variable without knowledge of the other variables or the underlying system. We use a random feature model with sparse regression to handle the data scarcity issue and employ Takens' delay embedding theorem to capture the nature of the underlying system from the observed variable. We show that our approach outperforms compartmental models when applied to both simulated and real data.
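
The ingredients named in the abstract (a delay embedding of a single observed variable, a random feature map, and a sparse regression) can be illustrated with a minimal Python sketch. This is not the authors' SPADE4 implementation, which is in the repository listed under Data Availability. Everything specific here is an assumption made for illustration: the function names, the cosine feature map, the ISTA solver for the l1-penalised least squares problem, the unit time step, and the choice to learn the one-step increment rather than a derivative. The synthetic input comes from a simple SIR simulation of which only the infected fraction is observed.

```python
import numpy as np


def simulate_sir(beta=0.3, gamma=0.1, i0=0.01, days=120):
    """Forward-Euler SIR with unit time step; returns the infected fraction only."""
    s, i = 1.0 - i0, i0
    infected = []
    for _ in range(days):
        infected.append(i)
        new_inf, new_rec = beta * s * i, gamma * i
        s, i = s - new_inf, i + new_inf - new_rec
    return np.array(infected)


def delay_embed(y, p):
    """Row r is the delay vector [y[r+p-1], y[r+p-2], ..., y[r]] (newest first)."""
    n = len(y)
    return np.stack([y[p - 1 - j : n - j] for j in range(p)], axis=1)


def random_features(X, W, b):
    """Random cosine features cos(XW + b); W and b are drawn once and never trained."""
    return np.cos(X @ W + b)


def ista(A, z, lam, n_iter=2000):
    """Minimise 0.5*||A c - z||^2 + lam*||c||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = c - step * (A.T @ (A @ c - z))
        c = np.sign(g) * np.maximum(np.abs(g) - lam * step, 0.0)
    return c


def fit(y, p, n_features=200, lam=1e-4, seed=0):
    """Learn the one-step increment y[k+1] - y[k] from the delay vector at time k."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(p, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    X = delay_embed(y, p)[:-1]   # delay vectors that have a successor
    dy = y[p:] - y[p - 1 : -1]   # matching one-step increments
    c = ista(random_features(X, W, b), dy, lam)
    return W, b, c


def forecast(y, model, p, horizon):
    """Roll the learned increment map forward `horizon` steps from the end of y."""
    W, b, c = model
    history = list(y[-p:])
    for _ in range(horizon):
        x = np.array(history[-p:][::-1])[None, :]  # newest value first
        history.append(history[-1] + random_features(x, W, b)[0] @ c)
    return np.array(history[p:])


if __name__ == "__main__":
    y = simulate_sir()
    train, held_out = y[:60], y[60:67]
    model = fit(train, p=5)
    pred = forecast(train, model, p=5, horizon=7)
    print("nonzero coefficients:", np.count_nonzero(model[2]))
    print("mean absolute error:", np.mean(np.abs(pred - held_out)))
```

The embedding dimension `p`, the number of random features, and the penalty `lam` are hypothetical defaults, not values from the paper; in practice they would need tuning against held-out data.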

Data Availability

The code is available at https://github.com/esha-saha/spade4. The data sources are:
COVID-19 in Canada: https://health-infobase.canada.ca/covid-19/epidemiological-summary-covid-19-cases.html
Ebola in Guinea: https://www.kaggle.com/datasets/imdevskp/ebola-outbreak-20142016-complete-dataset
Zika virus in Girardot: Rojas et al. (2016)
Influenza A/H7N9 in China: https://datadryad.org/stash/dataset/doi:10.5061/dryad.2g43n

References

Althaus CL (2014) Estimating the reproduction number of Ebola virus (EBOV) during the 2014 outbreak in West Africa. PLoS currents 6
Ayed I, de Bézenac E, Pajot A, Brajard J, Gallinari P (2019) Learning dynamical systems from partial observations. arXiv preprint arXiv:1902.11136
Bhattacharya K, Hosseini B, Kovachki NB, Stuart AM (2020) Model reduction and neural networks for parametric PDEs. arXiv preprint arXiv:2005.03180
Blum MG, Tran VC (2010) HIV with contact tracing: a case study in approximate Bayesian computation. Biostatistics 11(4):644–660
Broomhead DS, King GP (1986) Extracting qualitative dynamics from experimental data. Physica D 20(2–3):217–236
Brunton SL, Proctor JL, Kutz JN (2016) Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc Natl Acad Sci 113(15):3932–3937
Brunton SL, Brunton BW, Proctor JL, Kaiser E, Kutz JN (2017) Chaos as an intermittently forced linear system. Nat Commun 8(1):1–9
Cauchemez S, Ferguson NM (2008) Likelihood-based estimation of continuous-time epidemic models from time-series data: application to measles transmission in London. J R Soc Interface 5(25):885–897
Champion KP, Brunton SL, Kutz JN (2019) Discovery of nonlinear multiscale systems: Sampling strategies and embeddings. SIAM J Appl Dyn Syst 18(1):312–333
Cramer EY, Ray EL, Lopez VK, Bracher J, Brennen A, Castro Rivadeneira AJ, Gerding A, Gneiting T, House KH, Huang Y et al (2022) Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States. Proc Natl Acad Sci 119(15):2113561119
Dukic V, Lopes HF, Polson NG (2012) Tracking epidemics with Google flu trends data and a state-space SEIR model. J Am Stat Assoc 107(500):1410–1426
Girardi P, Gaetan C (2023) An SEIR model with time-varying coefficients for analyzing the SARS-CoV-2 Epidemic. Risk Anal 43(1):144–155
González-García R, Rico-Martínez R, Kevrekidis IG (1998) Identification of distributed parameter systems: A neural net based approach. Comput Chem Eng 22:965–968
Goyal P, Benner P (2022) Discovery of nonlinear dynamical systems using a Runge-Kutta inspired dictionary-based sparse regression approach. Proc R Soc A 478(2262):20210883
Hashemi A, Schaeffer H, Shi R, Topcu U, Tran G, Ward R (2023) Generalization bounds for sparse random feature expansions. Appl Comput Harmon Anal 62:310–330
Ho LST, Xu J, Crawford FW, Minin VN, Suchard MA (2018) Birth/birth-death processes and their computable transition probabilities with biological applications. J Math Biol 76(4):911–944
Ho LST, Crawford FW, Suchard MA (2018) Direct likelihood-based inference for discretely observed stochastic compartmental models of infectious disease. Ann Appl Stat 12(3):1993–2021
Huke J (2006) Embedding nonlinear dynamical systems: A guide to Takens’ theorem
Jacot A, Simsek B, Spadaro F, Hongler C, Gabriel F (2020) Implicit regularization of random feature models. In: International Conference on Machine Learning, pp 4631–4640. PMLR
Juang J-N, Pappa RS (1985) An eigensystem realization algorithm for modal parameter identification and model reduction. J Guid Control Dyn 8(5):620–627
Kaiser E, Kutz JN, Brunton SL (2018) Sparse identification of nonlinear dynamics for model predictive control in the low-data limit. Proc R Soc A 474(2219):20180335
Kamb M, Kaiser E, Brunton SL, Kutz JN (2020) Time-delay observables for Koopman: Theory and applications. SIAM J Appl Dyn Syst 19(2):886–917
Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans Neural Netw 9(5):987–1000
Le Clainche S, Vega JM (2017) Higher order dynamic mode decomposition. SIAM J Appl Dyn Syst 16(2):882–925
Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K, Stuart A, Anandkumar A (2020) Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895
Lin AT, Eckhardt D, Martin R, Osher S, Wong AS (2022) Parameter inference of time series by delay embeddings and learning differentiable operators. arXiv preprint arXiv:2203.06269
Lu L, Jin P, Karniadakis GE (2019) Deeponet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193
Lu L, Meng X, Mao Z, Karniadakis GE (2021) Deepxde: A deep learning library for solving differential equations. SIAM Rev 63(1):208–228
Lusch B, Kutz JN, Brunton SL (2018) Deep learning for universal linear embeddings of nonlinear dynamics. Nat Commun 9(1):1–10
Mangan NM, Kutz JN, Brunton SL, Proctor JL (2017) Model selection for dynamical systems via sparse regression and information criteria. Proc R Soc A: Math, Phys Eng Sci 473(2204):20170009
Narendra KS, Parthasarathy K (1992) Neural networks and dynamical systems. Int J Approx Reason 6(2):109–131
Nelsen NH, Stuart AM (2021) The random feature model for input-output maps between Banach spaces. SIAM J Sci Comput 43(5):3212–3243
Qin T, Wu K, Xiu D (2019) Data driven governing equations approximation using deep neural networks. J Comput Phys 395:620–635
Rahimi A, Recht B (2007) Random features for large-scale kernel machines. In: NIPS, vol 3, p 5. Citeseer
Rahimi A, Recht B (2008a) Uniform approximation of functions with random bases. In: 2008 46th Annual Allerton Conference on Communication, Control, and Computing, IEEE, pp 555–561
Rahimi A, Recht B (2008b) Weighted sums of random kitchen sinks: replacing minimization with randomization in learning. In: NIPS, pp 1313–1320. Citeseer
Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707
Richardson N, Schaeffer H, Tran G (2022) SRMD: Sparse random mode decomposition. arXiv preprint arXiv:2204.06108
Rojas DP, Dean NE, Yang Y, Kenah E, Quintero J, Tomasi S, Ramirez EL, Kelly Y, Castro C, Carrasquilla G et al (2016) The epidemiology and transmissibility of Zika virus in Girardot and San Andres island, Colombia, September 2015 to January 2016. Eurosurveillance 21(28):30283
Rudi A, Rosasco L (2017) Generalization properties of learning with random features. In: NIPS, pp 3215–3225
Rudy SH, Brunton SL, Proctor JL, Kutz JN (2017) Data-driven discovery of partial differential equations. Sci Adv 3(4):1602614
Saha E, Schaeffer H, Tran G (2022) HARFE: Hard-ridge random feature expansion. arXiv preprint arXiv:2202.02877
Schaeffer H (2017) Learning partial differential equations via data discovery and sparse optimization. Proc R Soc A: Math, Phys Eng Sci 473(2197):20160446
Schaeffer H, Tran G, Ward R (2018) Extracting sparse high-dimensional dynamics from limited data. SIAM J Appl Math 78(6):3279–3295
Sitzmann V, Martel J, Bergman A, Lindell D, Wetzstein G (2020) Implicit neural representations with periodic activation functions. Adv Neural Inf Process Syst 33:7462–7473
Smirnova A, deCamp L, Chowell G (2019) Forecasting epidemics through nonparametric estimation of time-dependent transmission rates using the SEIR model. Bull Math Biol 81:4343–4365
Su W-H, Chou C-S, Xiu D (2021) Deep learning of biological models from data: applications to ODE models. Bull Math Biol 83(3):1–19
Takens F (2006) Detecting strange attractors in turbulence. In: Dynamical Systems and Turbulence, Warwick 1980: Proceedings of a Symposium Held at the University of Warwick 1979/80, Springer, pp 366–381
Tran G, Ward R (2017) Exact recovery of chaotic systems from highly corrupted data. Multiscale Model Simul 15(3):1108–1129
Uribarri G, Mindlin GB (2022) Dynamical time series embeddings in recurrent neural networks. Chaos, Solitons Fractals 154:111612
Vlachas PR, Byeon W, Wan ZY, Sapsis TP, Koumoutsakos P (2018) Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks. Proc R Soc A: Math, Phys Eng Sci 474(2213):20170844
Weinan E (2017) A proposal on machine learning via dynamical systems. Commun Math Stat 1(5):1–11
Weinan E, Ma C, Wu L (2019) A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics. Sci China Math
Xie Y, Shi R, Schaeffer H, Ward R (2022) Shrimp: Sparser random feature models via iterative magnitude pruning. In: Mathematical and Scientific Machine Learning, pp 303–318. PMLR
Yang Z, Bai Y, Mei S (2021) Exact gap between generalization error and uniform convergence in random feature models. In: International Conference on Machine Learning, pp 11704–11715. PMLR
Zou D, Wang L, Xu P, Chen J, Zhang W, Gu Q (2020) Epidemic model guided machine learning for COVID-19 forecasts in the United States. MedRxiv

Acknowledgements

L. S. T. Ho was supported by the Canada Research Chairs program, the NSERC Discovery Grant RGPIN-2018-05447, and the NSERC Discovery Launch Supplement DGECR-2018-00181. E. Saha and G. Tran were supported by the NSERC Discovery Grant and the NSERC Discovery Launch Supplement.

Author information

Authors and Affiliations

Department of Applied Mathematics, University of Waterloo, Waterloo, Canada
Esha Saha & Giang Tran
Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
Lam Si Tung Ho

Corresponding author

Correspondence to Lam Si Tung Ho.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Results on Simulated Data with 2% Noise

See Fig. 13.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Saha, E., Ho, L.S.T. & Tran, G. SPADE4: Sparsity and Delay Embedding Based Forecasting of Epidemics. Bull Math Biol 85, 71 (2023). https://doi.org/10.1007/s11538-023-01174-z

Received: 23 January 2023
Accepted: 27 May 2023
Published: 19 June 2023
DOI: https://doi.org/10.1007/s11538-023-01174-z
