
Artificial Neural Network Prediction of COVID-19 Daily Infection Count

  • Original Article
  • Bulletin of Mathematical Biology

Abstract

This study treats COVID-19 testing as a nonlinear sampling problem, aiming to uncover how the true infection count in the population depends on COVID-19 testing metrics such as testing volume and positivity rate. Employing an artificial neural network, we explore the relationship among daily confirmed case counts, testing data, population statistics, and the actual daily case count. The trained artificial neural network is tested in in-sample, out-of-sample, and several hypothetical scenarios. A substantial focus of this paper is the estimation of the daily true case count, which serves as the output set of our training process. To achieve this, we implement a regularized backcasting technique that utilizes death counts and the infection fatality ratio (IFR), because death statistics and serological surveys (which provide the IFR) are more reliable COVID-19 data sources. Addressing the impact of factors such as age distribution, vaccination, and emerging variants on the IFR time series is a pivotal aspect of our analysis. We expect our study to improve understanding of the true impact of the COVID-19 pandemic and thereby benefit mitigation strategies.
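
As a minimal illustration of the backcasting idea (not the regularized method developed in this paper), the sketch below shifts the daily death series back by a fixed infection-to-death delay and divides by the IFR. The function name, the fixed delay, and the constant IFR in the example are assumptions made only for illustration.

```python
def backcast_infections(deaths, ifr, delay_days):
    """Estimate daily infections from daily deaths (illustrative sketch).

    deaths[t]  : daily death count on day t
    ifr[t]     : assumed infection fatality ratio for infections on day t
    delay_days : assumed fixed infection-to-death delay (hypothetical)

    Returns a list of estimates. Entries are None where the matching
    deaths are not yet observed or the IFR is not positive.
    """
    n = len(deaths)
    infections = []
    for t in range(n):
        if t + delay_days < n and ifr[t] > 0:
            # Deaths on day t + delay are attributed to infections on day t.
            infections.append(deaths[t + delay_days] / ifr[t])
        else:
            infections.append(None)
    return infections


# Made-up numbers, not data from the study.
daily_deaths = [0, 0, 1, 2, 4, 6, 5, 3]
estimated = backcast_infections(daily_deaths, ifr=[0.01] * 8, delay_days=3)
```

A fixed delay ignores the spread of the infection-to-death time distribution, which is one reason a smoothed or regularized estimate is generally preferable in practice.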


Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

References

  • World Health Organization (2023) WHO Coronavirus (COVID-19) Dashboard. https://covid19.who.int/ Accessed 2023-06-10

  • Harvey WT, Carabelli AM, Jackson B, Gupta RK, Thomson EC, Harrison EM, Ludden C, Reeve R, Rambaut A, COVID-19 Genomics UK (COG-UK) Consortium (2021) SARS-CoV-2 variants, spike mutations and immune escape. Nature Reviews Microbiology 19(7):409–424

  • Wu SL, Mertens AN, Crider YS, Nguyen A, Pokpongkiat NN, Djajadi S, Seth A, Hsiang MS, Colford JM Jr, Reingold A (2020) Substantial underestimation of sars-cov-2 infection in the united states. Nature communications 11(1):4507


  • COVID-19 Forecasting Team (2022) Variation in the COVID-19 infection-fatality ratio by age, time, and geography during the pre-vaccine era: a systematic analysis. The Lancet 399(10334):1469–1488. https://doi.org/10.1016/S0140-6736(21)02867-1

  • Brazeau NF, Verity R, Jenks S, Fu H, Whittaker C, Winskill P, Dorigatti I, Walker PG, Riley S, Schnekenberg RP (2022) Estimating the covid-19 infection fatality ratio accounting for seroreversion using statistical modelling. Communications medicine 2(1):54


  • Meyerowitz-Katz G, Merone L (2020) A systematic review and meta-analysis of published research data on covid-19 infection fatality rates. International Journal of Infectious Diseases 101:138–148


  • Barber RM, Sorensen RJ, Pigott DM, Bisignano C, Carter A, Amlag JO, Collins JK, Abbafati C, Adolph C, Allorant A (2022) Estimating global, regional, and national daily and cumulative infections with sars-cov-2 through nov 14, 2021: a statistical analysis. The Lancet 399(10344):2351–2380


  • Hortaçsu A, Liu J, Schwieg T (2021) Estimating the fraction of unreported infections in epidemics with a known epicenter: An application to covid-19. Journal of Econometrics 220(1):106–129


  • Chen Z, Feng L, Lay HA Jr, Furati K, Khaliq A (2022) Seir model with unreported infected population and dynamic parameters for the spread of covid-19. Mathematics and computers in simulation 198:31–46


  • Albani V, Loria J, Massad E, Zubelli J (2021) Covid-19 underreporting and its impact on vaccination strategies. BMC Infectious Diseases 21:1–13


  • Tang S, Cao Y (2023) A phenomenological neural network powered by the national wastewater surveillance system for estimation of silent covid-19 infections. Science of The Total Environment 902:166024


  • Guo Q, He Z (2021) Prediction of the confirmed cases and deaths of global covid-19 using artificial intelligence. Environmental Science and Pollution Research 28:11672–11682


  • Vaid S, Cakan C, Bhandari M (2020) Using machine learning to estimate unobserved covid-19 infections in north america. The Journal of bone and joint surgery. American volume

  • Dairi A, Harrou F, Zeroual A, Hittawe MM, Sun Y (2021) Comparative study of machine learning methods for covid-19 transmission forecasting. Journal of Biomedical Informatics 118:103791


  • Kamalov F, Rajab K, Cherukuri AK, Elnagar A, Safaraliev M (2022) Deep learning for covid-19 forecasting: State-of-the-art review. Neurocomputing 511:142–154


  • Rahimi I, Chen F, Gandomi AH (2023) A review on covid-19 forecasting models. Neural Computing and Applications 35(33):23671–23681


  • He S, Peng Y, Sun K (2020) Seir modeling of the covid-19 and its dynamics. Nonlinear dynamics 101:1667–1680


  • Perc M, Gorišek Miksić N, Slavinec M, Stožer A (2020) Forecasting covid-19. Frontiers in physics 8:127


  • Namasudra S, Dhamodharavadhani S, Rathipriya R (2021) Nonlinear neural network based forecasting model for predicting covid-19 cases. Neural processing letters, 1–21

  • Dutta R, Das N, Majumder M, Jana B (2023) Aspect based sentiment analysis using multi-criteria decision-making and deep learning under covid-19 pandemic in india. CAAI Transactions on Intelligence Technology 8(1):219–234


  • Chimmula VKR, Zhang L (2020) Time series forecasting of covid-19 transmission in canada using lstm networks. Chaos, solitons & fractals 135:109864


  • Watson GL, Xiong D, Zhang L, Zoller JA, Shamshoian J, Sundin P, Bufford T, Rimoin AW, Suchard MA, Ramirez CM (2021) Pandemic velocity: Forecasting covid-19 in the us with a machine learning & bayesian time series compartmental model. PLoS computational biology 17(3):1008837


  • Kevrekidis GA, Rapti Z, Drossinos Y, Kevrekidis PG, Barmann MA, Chen Q-Y, Cuevas-Maraver J (2022) Backcasting covid-19: a physics-informed estimate for early case incidence. Royal Society Open Science 9(12):220329


  • Phipps SJ, Grafton RQ, Kompas T (2020) Robust estimates of the true (population) infection rate for covid-19: a backcasting approach. Royal Society Open Science 7(11):200909. https://doi.org/10.1098/rsos.200909



  • Sarría-Santamera A, Abdukadyrov N, Glushkova N, Russell Peck D, Colet P, Yeskendir A, Asúnsolo A, Ortega MA (2022) Towards an accurate estimation of covid-19 cases in kazakhstan: Back-casting and capture-recapture approaches. Medicina 58(2):253


  • Irons NJ, Raftery AE (2021) Estimating sars-cov-2 infections from deaths, confirmed cases, tests, and random surveys. Proceedings of the National Academy of Sciences 118(31):2103272118. https://doi.org/10.1073/pnas.2103272118


  • Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics 378:686–707


  • Center JHCR (2023) COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. https://github.com/CSSEGISandData/COVID-19 Accessed 2023-06-10

  • Kidger P, Lyons T (2020) Universal approximation with deep narrow networks. In: Conference on Learning Theory, pp. 2306–2327. PMLR

  • Maiorov V, Pinkus A (1999) Lower bounds for approximation by mlp neural networks. Neurocomputing 25(1–3):81–91


  • Zhai J, Dobson M, Li Y (2022) A deep learning method for solving fokker-planck equations. In: Mathematical and Scientific Machine Learning, pp. 568–597. PMLR

  • Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980

  • Cleveland WS, Devlin SJ (1988) Locally weighted regression: an approach to regression analysis by local fitting. Journal of the American statistical association 83(403):596–610


  • Flaxman S, Mishra S, Gandy A, Unwin H, Coupland H, Mellan T, Zhu H, Berah T, Eaton J, Perez Guzman P, et al (2020) Report 13: Estimating the number of infections and the impact of non-pharmaceutical interventions on covid-19 in 11 european countries

  • Miller AC, Hannah L, Futoma J, Foti NJ, Fox EB, D’Amour A, Sandler M, Saurous RA, Lewnard JA (2022) Statistical deconvolution for inference of infection time series. Epidemiology 33(4):470–479. https://doi.org/10.1097/EDE.0000000000001495


  • Jahja M, Chin A, Tibshirani RJ (2022) Real-Time Estimation of COVID-19 Infections: Deconvolution and Sensor Fusion. Statistical Science 37(2):207–228. https://doi.org/10.1214/22-STS856


  • Disease Control C (2023a) Prevention: COVID-19 Weekly Cases and Deaths per 100,000 Population by Age, Race/Ethnicity, and Sex. https://covid.cdc.gov/covid-data-tracker/#demographicsovertime Accessed 2023-06-10

  • Akima H (1970) A new method of interpolation and smooth curve fitting based on local procedures. Journal of the ACM (JACM) 17(4):589–602


  • Akima H (1974) A method of bivariate interpolation and smooth surface fitting based on local procedures. Communications of the ACM 17(1):18–20


  • Easton DM, Hirsch HR (2008) For prediction of elder survival by a gompertz model, number dead is preferable to number alive. Age 30:311–317


  • Disease Control C (2023b) Prevention: COVID-19 Vaccination Age and Sex Trends in the United States, National and Jurisdictional. https://data.cdc.gov/Vaccinations/COVID-19-Vaccination-Age-and-Sex-Trends-in-the-Uni/5i5k-6cmh Accessed 2023-06-10

  • Lewnard JA, Hong VX, Patel MM, Kahn R, Lipsitch M, Tartof SY (2022) Clinical outcomes associated with SARS-CoV-2 Omicron (B.1.1.529) variant and BA.1/BA.1.1 or BA.2 subvariant infection in Southern California. Nature medicine 28(9):1933–1943

  • Ulloa AC, Buchan SA, Daneman N, Brown KA (2022) Estimates of sars-cov-2 omicron variant severity in ontario, canada. Jama 327(13):1286–1288


  • Ward IL, Bermingham C, Ayoubkhani D, Gethings OJ, Pouwels KB, Yates T, Khunti K, Hippisley-Cox J, Banerjee A, Walker AS, et al (2022) Risk of covid-19 related deaths for SARS-CoV-2 omicron (B.1.1.529) compared with delta (B.1.617.2): retrospective cohort study. BMJ 378

  • Nyberg T, Ferguson NM, Nash SG, Webster HH, Flaxman S, Andrews N, Hinsley W, Bernal JL, Kall M, Bhatt S (2022) Comparative analysis of the risks of hospitalisation and death associated with SARS-CoV-2 omicron (B.1.1.529) and delta (B.1.617.2) variants in England: a cohort study. The Lancet 399(10332):1303–1312

  • Disease Control C (2023c) Prevention: COVID data tracker: Variant Proportion. https://covid.cdc.gov/covid-data-tracker/#variant-proportions Accessed 2023-06-10

  • Disease Control C (2023d) Prevention: Rates of COVID-19 Cases and Deaths by Vaccination Status. https://data.cdc.gov/Public-Health-Surveillance/Rates-of-COVID-19-Cases-or-Deaths-by-Age-Group-and/54ys-qyzm Accessed 2023-06-10

  • Scheiner S, Ukaj N, Hellmich C (2020) Mathematical modeling of covid-19 fatality trends: Death kinetics law versus infection-to-death delay rule. Chaos, Solitons & Fractals 136:109891


  • Feng Z, Xu D, Zhao H (2007) Epidemiological models with non-exponentially distributed disease stages and applications to disease control. Bulletin of mathematical biology 69(5):1511–1536


  • Ghosh S, Volpert V, Banerjee M (2022) An epidemic model with time-distributed recovery and death rates. Bulletin of Mathematical Biology 84(8):78


  • Shah S, Gwee SXW, Ng JQX, Lau N, Koh J, Pang J (2022) Wastewater surveillance to infer covid-19 transmission: A systematic review. Science of The Total Environment 804:150060


  • Daughton CG (2020) Wastewater surveillance for population-wide covid-19: The present and future. Science of the Total Environment 736:139631



Acknowledgements

We would like to thank REU students Ziyan Zhao for collecting vaccination data and Jessica Hu for collecting age group case data.

Author information


Contributions

Yao Li and Ning Jiang are partially supported by NSF DMS-1813246 and DMS-2108628. Charles Kolozsvary is partially supported by the REU part of NSF DMS-1813246 and NSF DMS-2108628.

Corresponding author

Correspondence to Yao Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Additional Data About COVID-19


This appendix presents figures showing the raw data, processed data, and intermediate results used to generate the training set. Data for selected states already appear in the main text. The figures cover:

  1. Time series of IFR for all 50 states plus Washington DC

  2. Time series of recovered true cases and undercounting factor for all 50 states plus Washington DC

  3. Raw and smoothed confirmed daily case count and daily death count for all 50 states plus Washington DC

  4. Time series of case rate per age group in all regions of the United States

  5. Time series of vaccination rate of all age groups for all 50 states plus Washington DC

  6. Incidence rate ratio of COVID-19 cases and deaths for vaccinated and unvaccinated groups

  7. Time series of testing volume for all 50 states plus Washington DC

A.1 Time Series of State IFR

The main text presents the IFR time series for 10 selected states. Figures 13 and 14 show the IFR time series for all 50 states plus Washington DC after accounting for age-group case rates, vaccination, and variants.

Fig. 13 Time series of state IFR for 24 states plus Washington DC (Color figure online)

Fig. 14 Time series of state IFR for 26 states (Color figure online)
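
One plausible way to see how age-group case rates enter a state-level IFR is a case-share-weighted average of age-specific IFRs. The sketch below is an assumption made for illustration, not the procedure used in the main text; the function name, the age brackets, and all numbers are hypothetical.

```python
def aggregate_ifr(case_share_by_age, ifr_by_age):
    """Case-share-weighted average of age-specific IFRs (illustrative).

    case_share_by_age : dict mapping age group -> share of infections
    ifr_by_age        : dict mapping age group -> assumed IFR of that group
    """
    total_share = sum(case_share_by_age.values())
    if total_share <= 0:
        raise ValueError("case shares must sum to a positive number")
    weighted = sum(
        case_share_by_age[group] * ifr_by_age[group]
        for group in case_share_by_age
    )
    # Normalizing lets the caller pass raw case counts instead of shares.
    return weighted / total_share


# Hypothetical shares and IFRs for a single day in a single state.
shares = {"0-49": 0.6, "50-64": 0.3, "65+": 0.1}
ifrs = {"0-49": 0.0005, "50-64": 0.005, "65+": 0.05}
state_ifr = aggregate_ifr(shares, ifrs)
```

Under this weighting, a shift of infections toward older age groups raises the aggregate IFR even when every age-specific IFR stays fixed.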

A.2 Time Series of State Recovered True Case

The main text shows the time series of the recovered true case count and the undercounting factor for 10 selected states. Figures 15 and 16 show these data for all 50 states plus Washington DC.

Fig. 15 Time series of recovered true case count for 24 states plus Washington DC (Color figure online)

Fig. 16 Time series of recovered true case count for 26 states and Washington DC (Color figure online)
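
For readers who want to reproduce a quantity like the undercounting factor, the sketch below assumes it is the ratio of the recovered true daily case count to the confirmed daily case count. That definition, the function name, and the numbers are assumptions for illustration.

```python
def undercounting_factor(true_cases, confirmed_cases):
    """Daily ratio of estimated true cases to confirmed cases.

    Returns None for days with no confirmed cases, where the ratio
    is undefined.
    """
    if len(true_cases) != len(confirmed_cases):
        raise ValueError("series must have the same length")
    factors = []
    for true_count, confirmed in zip(true_cases, confirmed_cases):
        factors.append(true_count / confirmed if confirmed > 0 else None)
    return factors


# Made-up example: 300 estimated infections against 100 confirmed cases
# gives a factor of 3 on the first day.
factors = undercounting_factor([300, 50, 10], [100, 25, 0])
```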

A.3 State Confirmed Case and Death

Figures 17 and 18 show the daily case count and \(100 \times \) daily death count of all 50 states plus Washington DC. The data come from the JHU COVID-19 database (Center 2023). Figures 19 and 20 show the processed daily case count and daily death count after addressing data dumps and holiday effects.

Fig. 17 Daily confirmed case count and \(100 \times \) daily death count for 24 states plus Washington DC. Raw data before processing (Color figure online)

Fig. 18 Daily confirmed case count and \(100 \times \) daily death count for 26 states. Raw data before processing (Color figure online)

Fig. 19 Daily confirmed case count and \(100 \times \) daily death count for 24 states plus Washington DC. Processed data after addressing weekday effects, holiday effects, and artificial data dumps from backlogs (Color figure online)

Fig. 20 Daily confirmed case count and \(100 \times \) daily death count for 26 states. Processed data after addressing weekday effects, holiday effects, and artificial data dumps from backlogs (Color figure online)
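
The kind of preprocessing described in this subsection can be sketched as follows. This is a simplified stand-in, not the processing used to produce Figs. 19 and 20: isolated spikes are replaced by a local median, and a centered 7-day moving average damps weekday reporting cycles. The function names, the window length, and the spike threshold are hypothetical choices.

```python
from statistics import median


def cap_data_dumps(counts, window=7, threshold=3.0):
    """Replace isolated spikes (possible backlog dumps) with the local median."""
    cleaned = list(counts)
    half = window // 2
    for t, value in enumerate(counts):
        lo, hi = max(0, t - half), min(len(counts), t + half + 1)
        neighbors = [counts[u] for u in range(lo, hi) if u != t]
        if not neighbors:
            continue
        local = median(neighbors)
        # Flag a day only when it is far above its neighbors.
        if local > 0 and value > threshold * local:
            cleaned[t] = local
    return cleaned


def centered_moving_average(counts, window=7):
    """Average over a centered window; the window shrinks at both ends."""
    half = window // 2
    smoothed = []
    for t in range(len(counts)):
        chunk = counts[max(0, t - half): t + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up series with one backlog dump on day 4.
raw = [100, 110, 90, 105, 2000, 95, 100, 108, 92, 101]
processed = centered_moving_average(cap_data_dumps(raw))
```

Replacing a spike with the local median discards the excess count. A more careful treatment would redistribute the backlog over the preceding days so that cumulative totals are preserved.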

A.4 Case Rate Per Age Group

Figure 21 shows the time series of the case rate of each age group in all 10 regions provided by the CDC (Disease Control 2023a). The HHS regions used by the CDC are listed in Table 1.

Table 1 List of states and districts in each CDC region

Fig. 21 Case rate of each age group in all 10 regions (Color figure online)

A.5 State Vaccination Rate

Figures 22 and 23 give the time series of the vaccination rate for each age group older than 18 years in all 50 states plus Washington DC. These data are obtained from the CDC (Disease Control 2023b).

Fig. 22 Time series of vaccination rate of each age group for 24 states plus Washington DC (Color figure online)

Fig. 23 Time series of vaccination rate of each age group for 26 states (Color figure online)

A.6 Incidence Rate Ratio (IRR) of Vaccinated and Unvaccinated Groups

The incidence rate ratio of COVID-19 infection and death for each age group is given in Fig. 24 (left and middle). These data are obtained from the CDC website (Disease Control 2023d). Death data for the younger age groups are not included because the vaccinated young groups had too few deaths, sometimes zero, in many weeks. The ratio of the IFR of the unvaccinated group to that of the vaccinated group for the three older age groups is shown in Fig. 24 (right).

Fig. 24 Left and Middle: Incidence rate ratio (IRR) of COVID-19 infection and death for each age group. Right: Ratio of IFR of unvaccinated group to vaccinated group (Color figure online)
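
The link between the two IRRs and the IFR ratio can be made explicit with a short sketch. It assumes each IRR is the unvaccinated rate divided by the vaccinated rate, and that the fatality ratio of a group is proportional to its death rate divided by its case rate with the same proportionality constant in both groups. Under those assumptions the IFR ratio reduces to the death IRR divided by the case IRR. The function name and all numbers are hypothetical.

```python
def ifr_ratio_from_irr(irr_case, irr_death):
    """Ratio of unvaccinated to vaccinated IFR implied by the two IRRs.

    irr_case  = case rate (unvaccinated) / case rate (vaccinated)
    irr_death = death rate (unvaccinated) / death rate (vaccinated)
    """
    if irr_case <= 0:
        raise ValueError("irr_case must be positive")
    return irr_death / irr_case


# Made-up weekly rates per 100,000 for one age group.
case_rate_unvax, case_rate_vax = 400.0, 100.0
death_rate_unvax, death_rate_vax = 12.0, 1.0

irr_case = case_rate_unvax / case_rate_vax      # 4.0
irr_death = death_rate_unvax / death_rate_vax   # 12.0
ratio = ifr_ratio_from_irr(irr_case, irr_death)

# Same quantity computed directly from the per-group fatality ratios.
direct = (death_rate_unvax / case_rate_unvax) / (death_rate_vax / case_rate_vax)
```

The division is undefined when the vaccinated death rate is zero, which is consistent with excluding age groups that have very few vaccinated deaths.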

A.7 State Testing Volume

Figures 25 and 26 give the time series of smoothed COVID-19 testing volume in all 50 states plus Washington DC. These data come from the Coronavirus Resource Center of Johns Hopkins University (Center 2023).

Fig. 25 Time series of COVID-19 testing volume for 24 states plus Washington DC. The blue and red lines are raw data and smoothed data, respectively (Color figure online)

Fig. 26 Time series of COVID-19 testing volume for 26 states. The blue and red lines are raw data and smoothed data, respectively (Color figure online)

A.8 Training Set Data Distribution

Figure 27 displays the distribution of the data in the training data set.

Fig. 27 Distribution of normalized data (Color figure online)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Jiang, N., Kolozsvary, C. & Li, Y. Artificial Neural Network Prediction of COVID-19 Daily Infection Count. Bull Math Biol 86, 49 (2024). https://doi.org/10.1007/s11538-024-01275-3


  • DOI: https://doi.org/10.1007/s11538-024-01275-3
