System Inference Via Field Inversion for the Spatio-Temporal Progression of Infectious Diseases: Studies of COVID-19 in Michigan and Mexico

Wang, Zhenlin; Carrasco-Teja, Mariana; Zhang, Xiaoxuan; Teichert, Gregory H.; Garikipati, Krishna

doi:10.1007/s11831-021-09643-1

Published: 01 October 2021

Volume 28, pages 4283–4295, (2021)

Archives of Computational Methods in Engineering

Zhenlin Wang¹,
Mariana Carrasco-Teja²,
Xiaoxuan Zhang¹,
Gregory H. Teichert¹ &
…
Krishna Garikipati ORCID: orcid.org/0000-0001-6697-0067^1,2,3

Abstract

We present an approach to studying and predicting the spatio-temporal progression of infectious diseases. We treat the problem by adopting a partial differential equation (PDE) version of the Susceptible, Infected, Recovered, Deceased (SIRD) compartmental model of epidemiology, which is achieved by replacing compartmental populations by their densities. Building on our recent work (Computat Mech 66:1177, 2020), we replace our earlier use of global polynomial basis functions with those having local support, as epitomized in the finite element method, for the spatial representation of the SIRD parameters. The time dependence is treated by inferring constant parameters over time intervals that coincide with the time step in semi-discrete numerical implementations. In combination, this amounts to a scheme of field inversion of the SIRD parameters over each time step. Applied to data over ten months of 2020 for the pandemic in the US state of Michigan and to all of Mexico, our system inference via field inversion infers spatio-temporally varying PDE SIRD parameters that replicate the progression of the pandemic with high accuracy. It also produces accurate predictions, when compared against data, for a three week period into 2021. Of note is the insight that is suggested on the spatio-temporal variation of infection, recovery and death rates, as well as patterns of the population’s mobility revealed by diffusivities of the compartments.

1 Introduction

Classical mathematical treatments of epidemiology, such as the Susceptible-Infected-Recovered-Deceased (SIRD) model [1], are ordinary differential equations (ODEs) defined by specifying the compartmental sub-population numbers over some geographical region. Spatial effects have typically been introduced by resolving smaller regions and treating them individually [2,3,4,5,6]. During long-lasting and widespread epidemics, such as the COVID-19 pandemic, the effects on the infection rate of imposing, and then lifting, mobility restrictions and social distancing mandates turn on the question of the temporally and spatially varying mobility of the population. At the finest resolution, this must be approached via agent-based models [7], using individuals’ mobility data. However, these data are not available for the entire population, and contact tracing campaigns face challenges of recruiting workers, access, technology, as well as socio-political resistance. Against these difficulties, an intriguing question to explore is whether simple reaction-diffusion models can detect the evidence of mobility in these data. Such an approach must start with a partial differential equation (PDE) version of the epidemiological models, which is easily defined by converting compartmental sub-populations to densities over sub-regions by normalizing with the corresponding areas. To address the mobility of the population, diffusion terms are introduced into the SIRD model, which is thereby transformed into a set of reaction-diffusion PDEs in two spatial dimensions [8,9,10,11].

The widespread availability of data in the public domain [12,13,14,15,16,17,18,19] has spurred significant interest among computational and data scientists, who have sought to test and refine their methods against these repositories. This has opened up the possibility that advances in computational and data science may contribute to the existing and rapidly expanding body of work in epidemiology, in inferring the dynamics of COVID-19 and making projections. We have similarly sought to build off our recent work in data-driven and machine learning approaches [20,21,22,23,24,25,26,27,28,29] and presented a class of system identification techniques for inference of ODE and PDE forms of the SIRD model, as well as Bayesian neural networks for representation and uncertainty quantification-guided prediction [11]. That work focused on the US state of Michigan. In this communication, we revise our approach for inference of the PDE SIRD model with temporally and spatially evolving parameters and diffusivities. Importantly, instead of global polynomial representations of PDE SIRD parameters over the spatial and temporal domains, we adopt field inversion over time intervals that coincide with the time steps of our underlying numerical implementation. This affords much greater accuracy over the global polynomial ansatz. Adjoint-based gradient optimization for field inversion of parameters at each time step replaces the use of stepwise regression-based system identification in our previous work. We find that the improved accuracy with respect to the data over the time interval of inference, as well as of the predictions, is worth the increased expense. 
We draw on abundant, high-quality, public-domain data [12,13,14,15,16,17,18,19] on the evolution of COVID-19 in both the State of Michigan, with a population of 9.98 million distributed over 83 counties and 250,493 km$^2$, and the country of Mexico, with a population of 126 million distributed over 32 geographical entities (31 states and Mexico City) and 1,972,550 km$^2$. The temporal resolution by days and spatial resolution by counties/states have allowed us to study the mobility in these data using our methods of system inference.

In Sect. 2 we review our previous work of system inference for the spatio-temporal SIRD model first, and then extend it by incorporating temporal and spatial parameters and diffusivities using a finite element representation. Our PDE SIRD model-constrained inference approach is presented in Sect. 3. Section 4 is on data preparation. The results for inference of classical SIRD parameters as well as the diffusivities, and forward prediction are presented in Sect. 5. Our conclusions appear in Sect. 6.

2 Compartmental Differential Equations Models of Infectious Disease Dynamics

We begin with the conventional SIRD compartmental epidemiology model. The population, taken to remain constant at N, is divided into four disjoint compartments with time-dependent sub-populations: S(t) for susceptible, I(t) for infected, R(t) for recovered and D(t) for deceased individuals. The governing system of ordinary differential equations (ODEs) is:

$$\begin{aligned} \frac{\text {d} S}{\text {d} t}&=-\frac{\beta }{N} SI+\gamma R \end{aligned}$$

(1)

$$\begin{aligned} \frac{\text {d}I}{\text {d}t}&=\frac{\beta }{N}SI-\mu I-\alpha I\end{aligned}$$

(2)

$$\begin{aligned} \frac{\text {d}R}{\text {d}t}&=\mu I-\gamma R\end{aligned}$$

(3)

$$\begin{aligned} \frac{\text {d}D}{\text {d}t}&=\alpha I\end{aligned}$$

(4)

$$\begin{aligned} N&= S(t) + I(t) + R(t) + D(t). \end{aligned}$$

(5)

This is the canonical form of the model where the sub-populations are assumed to be well-mixed so that spatial variations can be ignored over the domain of interest. Here $\beta$ is the infection rate, $\mu$ is the recovery rate, $\gamma$ is the rate of immunity loss, and $\alpha$ is the death rate.
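
As an illustration of Eqs. (1)-(5), the following minimal Python sketch integrates the ODE SIRD model with an explicit Euler scheme. The function names, the time-stepping scheme and all parameter values are assumptions made for illustration; they are not inferred from the data used in this work.

```python
def sird_rhs(state, beta, mu, gamma, alpha):
    """Right-hand side of Eqs. (1)-(4) for state = (S, I, R, D)."""
    S, I, R, D = state
    N = S + I + R + D  # Eq. (5): total population
    new_infections = beta / N * S * I
    dS = -new_infections + gamma * R
    dI = new_infections - mu * I - alpha * I
    dR = mu * I - gamma * R
    dD = alpha * I
    return (dS, dI, dR, dD)


def integrate_sird(state0, beta, mu, gamma, alpha, dt, n_steps):
    """Explicit (forward) Euler, chosen here for brevity rather than accuracy."""
    state = tuple(state0)
    history = [state]
    for _ in range(n_steps):
        rates = sird_rhs(state, beta, mu, gamma, alpha)
        state = tuple(x + dt * dx for x, dx in zip(state, rates))
        history.append(state)
    return history


# Illustrative values only (hypothetical, not taken from any data set).
history = integrate_sird(
    state0=(9990.0, 10.0, 0.0, 0.0),
    beta=0.3, mu=0.1, gamma=0.0, alpha=0.01,
    dt=0.1, n_steps=1000,
)
S_end, I_end, R_end, D_end = history[-1]
total_end = S_end + I_end + R_end + D_end
```

Because the right-hand sides of Eqs. (1)-(4) sum to zero, `total_end` should match the initial total up to round-off, which is a convenient sanity check on any implementation.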

We have extended the SIRD model to a system of partial differential equations (PDEs) in two spatial dimensions using the same compartments [11]. However, the population variables are now replaced with spatio-temporally varying densities, ${\widehat{S}}({\varvec{x}},t),{\widehat{I}}({\varvec{x}},t),{\widehat{R}}({\varvec{x}},t),{\widehat{D}}({\varvec{x}},t)$ defined as numbers per unit area.

$$\begin{aligned} \frac{\partial {\widehat{S}}}{\partial t}&={\mathcal {D}}_\text {S}\nabla ^2 {\widehat{S}}-\frac{\beta }{{\widehat{N}}} {\widehat{S}}{\widehat{I}}+\gamma {\widehat{R}} \end{aligned}$$

(6)

$$\begin{aligned} \frac{\partial {\widehat{I}}}{\partial t}&={\mathcal {D}}_\text {I}\nabla ^2 {\widehat{I}}+\frac{\beta }{{\widehat{N}}}{\widehat{S}}{\widehat{I}}-\mu {\widehat{I}}-\alpha {\widehat{I}} \end{aligned}$$

(7)

$$\begin{aligned} \frac{\partial {\widehat{R}}}{\partial t}&={\mathcal {D}}_\text {R}\nabla ^2 {\widehat{R}}+\mu {\widehat{I}}-\gamma {\widehat{R}} \end{aligned}$$

(8)

$$\begin{aligned} \frac{\partial {\widehat{D}}}{\partial t}&=\alpha {\widehat{I}} \end{aligned}$$

(9)

where ${\mathcal {D}}_\text {S}, {\mathcal {D}}_\text {I}, {\mathcal {D}}_\text {R}$ are diffusivities of the corresponding compartments, and represent the mobility of the sub-population via random walks. We define $\widehat{(\bullet )}=(\bullet )/\int _{\Omega }\text {d}A$ where $\Omega$ is the domain of study: either the lower peninsula of the State of Michigan, or the territory of the country of Mexico. Furthermore, the population constraint holds: $\int _{\Omega }{\widehat{N}}\text {d}A = \int _{\Omega }{\widehat{S}}(t)\text {d}A + \int _{\Omega }{\widehat{I}}(t)\text {d}A + \int _{\Omega }{\widehat{R}}(t)\text {d}A + \int _{\Omega }{\widehat{D}}(t)\text {d}A$. In the remainder of this communication, we only consider the PDE SIRD model, and, for the sake of readability, we dispense with the hats on the compartments.

We adopt the weak form, and specifically, the finite element framework for the above system of PDEs. For a generic, finite-dimensional field $u^h$, the problem is stated as follows: find $u^h\in {\mathscr {S}}^h \subset {\mathscr {S}}$, where ${\mathscr {S}}^h= \{ u^h \in {\mathscr {H}}^1(\Omega ) ~\vert ~u^h = ~{\bar{u}}\; \mathrm {on}\; \Gamma ^u\}$, such that $\forall ~w^h \in {\mathscr {V}}^h \subset {\mathscr {V}}$, where ${\mathscr {V}}^h= \{ w^h \in {\mathscr {H}}^1(\Omega )~\vert ~w^h = ~0 \;\mathrm {on}\; \Gamma ^u\}$, the finite-dimensional (Galerkin) weak form of the problem is satisfied. The variations $w^h$ and trial solutions $u^h$ are defined component-wise using a finite number of basis functions,

$$\begin{aligned} w^h = \sum _{a=1}^{n_\mathrm {b}} c^a N^a, \quad \qquad u^h = \sum _{a=1}^{n_\mathrm {b}} d^a N^a, \end{aligned}$$

(10)

where $n_\mathrm {b}$ is the dimensionality of the function spaces ${\mathscr {S}}^h$ and ${\mathscr {V}}^h$, and $N^a$ represents the basis functions. To obtain the Galerkin weak forms, we multiply each equation in strong form in (6-9) by a weighting function $w_\text {S}^h,w_\text {I}^h,w_\text {R}^h,w_\text {D}^h$, respectively, integrate by parts, apply boundary conditions appropriately, and use the Backward Euler method for time-discretization with $(\bullet )_n$ denoting a discretized quantity at time $t_n$ and $\Delta t$ being the time step. See Ref. [11] for details. This leads to:

$$\begin{aligned} \int _{\Omega }w^h_\text {S} \frac{S^h_{n} - S^h_{n-1}}{\Delta t} \text {d}s =&-\int _{\Omega } {\mathcal {D}}_\text {S}\nabla w^h_\text {S}\cdot \nabla S^h_n\text {d}s\nonumber \\&-\int _{\Omega }w^h_\text {S}\left( \frac{\beta }{N} S^h_nI^h_n-\gamma R^h_n \right) \text {d}s \end{aligned}$$

(11)

$$\begin{aligned} \int _{\Omega }w^h_\text {I}\frac{I^h_n - I^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}_\text {I}\nabla w^h_\text {I}\cdot \nabla I^h_n\text {d}s\nonumber \\&+\int _{\Omega }w^h_\text {I}\left( \frac{\beta }{N}S^h_nI^h_n-\mu I^h_n-\alpha I^h_n \right) \text {d}s \end{aligned}$$

(12)

$$\begin{aligned} \int _{\Omega }w^h_\text {R}\frac{R^h_n - R^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}_\text {R}\nabla w^h_\text {R}\cdot \nabla R^h_n\text {d}s \nonumber \\&+\int _{\Omega }w^h_\text {R}\left( \mu I^h_n-\gamma R^h_n\right) \text {d}s \end{aligned}$$

(13)

$$\begin{aligned} \int _{\Omega }w^h_\text {D} \frac{D^h_n - D^h_{n-1}}{\Delta t}\text {d}s =&\int _{\Omega }w^h_\text {D}\alpha I^h_n\text {d}s \end{aligned}$$

(14)

where the boundary terms vanish because we assume that the sub-populations do not leave the region, thus enforcing zero flux boundary conditions.
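
To make the role of the zero-flux boundary condition concrete, here is a minimal sketch of one ingredient of the scheme: a Backward Euler step for pure diffusion in one spatial dimension, discretized with piecewise-linear finite elements on a uniform mesh. The restriction to one dimension, the lumped mass matrix, the omission of reaction terms and all numerical values are simplifying assumptions for illustration; this is not the two-dimensional implementation used in this work.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are unused (set to 0)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def lumped_mass(n, h):
    """Row-sum (lumped) mass matrix of linear elements on a uniform mesh."""
    mass = [h] * n
    mass[0] = mass[-1] = h / 2.0
    return mass


def backward_euler_diffusion_step(u_old, diffusivity, h, dt):
    """Solve (M/dt + K) u_new = (M/dt) u_old with zero-flux boundaries."""
    n = len(u_old)
    k = diffusivity / h  # element stiffness coefficient
    mass = lumped_mass(n, h)
    lower = [0.0] + [-k] * (n - 1)
    upper = [-k] * (n - 1) + [0.0]
    diag = [m / dt + 2.0 * k for m in mass]
    diag[0] = mass[0] / dt + k    # boundary nodes belong to one element only
    diag[-1] = mass[-1] / dt + k
    rhs = [m / dt * u for m, u in zip(mass, u_old)]
    return solve_tridiagonal(lower, diag, upper, rhs)


def total_population(u, h):
    """Discrete integral of the density u over the domain."""
    return sum(m * v for m, v in zip(lumped_mass(len(u), h), u))


h, dt, diffusivity = 0.05, 0.01, 0.1  # hypothetical values
u0 = [1.0 if 8 <= i <= 12 else 0.0 for i in range(21)]
u = list(u0)
for _ in range(50):
    u = backward_euler_diffusion_step(u, diffusivity, h, dt)
total_before = total_population(u0, h)
total_after = total_population(u, h)
```

Since no boundary flux term appears in the discrete system, the integrated density is preserved by each step, while the initial bump spreads out.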

In our previous work [11], we represented the variation of the coefficients in time via a global polynomial basis. While the inferred model reproduced the trends, there was a notable error over time in the statewide sub-population estimates $S(t,\varvec{x}), I(t,\varvec{x}), R(t,\varvec{x}), D(t,\varvec{x})$ obtained by forward simulation with inferred quantities (see Figs. 14 and 15 in [11]).

Additionally, the highly complex geometries of the State of Michigan and of Mexico (see maps in Fig. 1), and the potentially highly nonuniform distributions of the coefficients in space, make it challenging to characterize them with simple basis functions. Global polynomials in space could not sufficiently resolve the emergence and disappearance of “hot spots” and “cold spots” [11]. In this communication, we allow the coefficients $\beta ,\gamma ,\mu ,\alpha , {\mathcal {D}}_\text {S},\dots ,{\mathcal {D}}_\text {R}$ of the PDE SIRD model to vary over space via finite-dimensional, locally supported representations, as we do for the primary variables $S(t,\varvec{x}), I(t,\varvec{x}), R(t,\varvec{x}), D(t,\varvec{x})$. Furthermore, we allow the coefficients to vary daily, leading to:

$$\begin{aligned} \beta ^h_n =&\sum _{a=1}^{n_p} \beta ^a_n N^a, \qquad \gamma ^h_n = \sum _{a=1}^{n_p} \gamma ^a_n N^a, \nonumber \\ \mu ^h_n =&\sum _{a=1}^{n_p} \mu ^a_n N^a \qquad \alpha ^h_n = \sum _{a=1}^{n_p} \alpha ^a_n N^a \end{aligned}$$

(15)

$$\begin{aligned} {\mathcal {D}}_{\text {S}_n}^h =&\sum _{a=1}^{n_p} {\mathcal {D}}_{\text {S}_n}^a N^a,\qquad {\mathcal {D}}_{\text {I}_n}^h = \sum _{a=1}^{n_p} {\mathcal {D}}_{\text {I}_n}^a N^a,\nonumber \\ {\mathcal {D}}_{\text {R}_n}^h =&\sum _{a=1}^{n_p} {\mathcal {D}}_{\text {R}_n}^a N^a \end{aligned}$$

(16)

where, as for the primary variables, the subscripts $(\bullet )_n$ denote the coefficients on day $n$. With this, the PDE SIRD equations become:

$$\begin{aligned} \int _{\Omega }w^h_\text {S} \frac{S^h_{n} - S^h_{n-1}}{\Delta t} \text {d}s =&-\int _{\Omega } {\mathcal {D}}^h_{\text {S}_n}\nabla w^h_\text {S}\cdot \nabla S^h_n\text {d}s \nonumber \\&-\int _{\Omega }w^h_\text {S}\left( \frac{\beta ^h_n}{N} S^h_nI^h_n -\gamma ^h_n R^h_n \right) \text {d}s \end{aligned}$$

(17)

$$\begin{aligned} \int _{\Omega }w^h_\text {I}\frac{I^h_n - I^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}^h_{\text {I}_n}\nabla w^h_\text {I}\cdot \nabla I^h_n\text {d}s\nonumber \\&+\int _{\Omega }w^h_\text {I}\left( \frac{\beta ^h_n}{N}S^h_nI^h_n -\mu ^h_n I^h_n-\alpha ^h_n I^h_n \right) \text {d}s \end{aligned}$$

(18)

$$\begin{aligned} \int _{\Omega }w^h_\text {R}\frac{R^h_n - R^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}^h_{\text {R}_n}\nabla w^h_\text {R}\cdot \nabla R^h_n\text {d}s \nonumber \\&+\int _{\Omega }w^h_\text {R}\left( \mu ^h_n I^h_n-\gamma ^h_n R^h_n\right) \text {d}s \end{aligned}$$

(19)

$$\begin{aligned} \int _{\Omega }w^h_\text {D} \frac{D^h_n - D^h_{n-1}}{\Delta t}\text {d}s&=\int _{\Omega }w^h_\text {D}\alpha ^h_n I^h_n\text {d}s \end{aligned}$$

(20)

where the parameters are interpolated from nodal variables as defined in Eqs. (15) and (16).
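
The interpolation in Eqs. (15) and (16) can be sketched in one spatial dimension with piecewise-linear "hat" basis functions. The mesh, the nodal values and the function name below are hypothetical and serve only to show how a field is evaluated from nodal coefficients and why the representation is locally supported.

```python
from bisect import bisect_right


def interpolate_p1(nodes, nodal_values, x):
    """Evaluate sum_a value^a N^a(x) for 1D piecewise-linear hat functions.

    nodes must be sorted in increasing order.
    """
    if not nodes[0] <= x <= nodes[-1]:
        raise ValueError("x lies outside the mesh")
    # Index e of the element [nodes[e], nodes[e + 1]] that contains x.
    e = min(bisect_right(nodes, x) - 1, len(nodes) - 2)
    x_left, x_right = nodes[e], nodes[e + 1]
    n_right = (x - x_left) / (x_right - x_left)  # basis function of node e + 1
    n_left = 1.0 - n_right                       # basis function of node e
    return nodal_values[e] * n_left + nodal_values[e + 1] * n_right


# Hypothetical non-uniform mesh and nodal infection rates on one day.
nodes = [0.0, 1.0, 2.5, 4.0]
beta_nodal = [0.30, 0.10, 0.25, 0.05]

beta_mid_first = interpolate_p1(nodes, beta_nodal, 0.5)
beta_mid_second = interpolate_p1(nodes, beta_nodal, 1.75)

# Local support: changing the last nodal value leaves the first element untouched.
beta_perturbed = beta_nodal[:-1] + [0.90]
beta_mid_first_perturbed = interpolate_p1(nodes, beta_perturbed, 0.5)
```

In contrast to a global polynomial, each nodal coefficient here influences the field only on the elements adjacent to its node, which is the property that lets the representation resolve localized features.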

3 System Inference by Field Inversion Using Adjoint-Based Gradient Optimization

The system inference problem is to invert for the quantities $\beta ^a_n,\gamma ^a_n,\mu ^a_n,\alpha ^a_n, {\mathcal {D}}^a_{\text {S}_n},\dots ,{\mathcal {D}}^a_{\text {R}_n}$. Since these quantities are interpolated via Eqs. (15) and (16) to be expressed as the corresponding fields $\beta ^h_n,\gamma ^h_n,\mu ^h_n,\alpha ^h_n, {\mathcal {D}}^h_{\text {S}_n},\dots ,{\mathcal {D}}^h_{\text {R}_n}$ in (17-20), the system inference problem is one of field inversion. It is stated in Eqs. (21-22) as:

Given (15–16), at each $t_n$:

$$\begin{aligned} \left( \beta ^a_n,\dots ,{\mathcal {D}}^a_{\text {R}_n}\right) _{a=1}^{n_p} = \underset{(\beta ^a_n,\dots ,{\mathcal {D}}^a_{\text {R}_n})_{a=1}^{n_p} }{\text {arg min}}\; \ell _{n},\quad \text {such that (17-20) hold,} \end{aligned}$$

(21)

where $\ell _{n}$ is the loss function defined as:

$$\begin{aligned} \ell _{n}=&\int _\Omega W_\text {S}\left( S^h_n-S_n^\text {d}\right) ^2+W_\text {I} \left( I^h_n-I_n^\text {d}\right) ^2+W_\text {R}\left( R^h_n-R_n^\text {d}\right) ^2\nonumber \\&+W_\text {D}\left( D^h_n-D_n^\text {d}\right) ^2\text {d}v \end{aligned}$$

(22)

where $(\bullet )^\text {d}$ denotes data for the corresponding quantity. Due to the large differences in the magnitudes of different sub-populations, we choose the weights $W_\text {S},\cdots , W_\text {D}$ to be:

$$\begin{aligned} W_\text {S}=&\frac{I_n^\text {d}}{\text {mean}\left( S_n^\text {d}\right) },\quad W_\text {I}=\frac{I_n^\text {d}}{\text {mean}\left( I_n^\text {d}\right) },\nonumber \\ W_\text {R}=&\frac{I_n^\text {d}}{\text {mean}\left( R_n^\text {d}\right) }, W_\text {D}=\frac{I_n^\text {d}}{\text {mean}\left( D_n^\text {d}\right) }. \end{aligned}$$

(23)

The weights normalize the sub-populations and prioritize regions with higher infected populations. These regions are of greater interest for studying the progression of the disease as they tend to have a higher population density and, therefore, larger infected populations.
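
A discrete analogue of the loss in Eq. (22) with the weights of Eq. (23) might look as follows. This sketch assumes nodal values and one quadrature weight (`areas`) per node in place of the integral over $\Omega$; the function name, the data layout and the numbers are hypothetical.

```python
from statistics import mean


def weighted_loss(model, data, areas):
    """Discrete analogue of Eqs. (22)-(23).

    model, data: dicts with keys "S", "I", "R", "D" mapping to nodal values.
    areas: quadrature weight of each node (assumed given).
    Note: mean(data[name]) must be nonzero, so compartments that are still
    empty (e.g. no deaths yet) would need special handling.
    """
    total = 0.0
    for name in ("S", "I", "R", "D"):
        scale = mean(data[name])  # normalizes the compartment's magnitude
        for k, area in enumerate(areas):
            weight = data["I"][k] / scale  # emphasizes high-infection nodes
            total += area * weight * (model[name][k] - data[name][k]) ** 2
    return total


# Two nodes: node 1 has four times the infected density of node 0.
data = {
    "S": [1000.0, 500.0],
    "I": [10.0, 40.0],
    "R": [5.0, 20.0],
    "D": [1.0, 4.0],
}
areas = [1.0, 1.0]

loss_exact = weighted_loss(data, data, areas)
# The same unit error in I, placed at the low- and high-infection node.
loss_cold = weighted_loss(dict(data, I=[11.0, 40.0]), data, areas)
loss_hot = weighted_loss(dict(data, I=[10.0, 41.0]), data, areas)
```

The same absolute error costs more at the node with the larger infected density (`loss_hot` exceeds `loss_cold`), which is the prioritization described above.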

This PDE-constrained optimization problem is solved iteratively, and requires the gradient of the PDE constraint, Eqs. (17-20), with respect to parameters. We adopt classical adjoint-based gradient optimization. This approach involves a single linear solution of the adjoint equation of the original PDE constraint at each iteration, followed by solution of the fields to be inverted: $(\beta ^h_n,\gamma ^h_n,\mu ^h_n,\alpha ^h_n, {\mathcal {D}}^h_{\text {S}_n},\dots ,{\mathcal {D}}^h_{\text {R}_n})$ and the updated forward solution $S^h_n, I^h_n, R^h_n, D^h_n$. In this work we use the L-BFGS-B optimization algorithm from SciPy [30] and the dolfin-adjoint software library [31] to compute the gradient.
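
The structure of an adjoint-based gradient can be seen in a deliberately small example: a single scalar unknown $u$ governed by one Backward Euler step of $\text{d}u/\text{d}t = p\,u$, with one parameter $p$ fitted to one data point. Everything below is an illustrative assumption: the scalar model, the names, and the use of plain gradient descent in place of L-BFGS-B. It does not use SciPy or dolfin-adjoint.

```python
def forward(p, u_prev, dt):
    """Solve the constraint residual(u, p) = u - u_prev - dt * p * u = 0 for u."""
    return u_prev / (1.0 - dt * p)


def loss(u, u_data):
    """Squared misfit between the forward solution and the data."""
    return (u - u_data) ** 2


def adjoint_gradient(p, u_prev, u_data, dt):
    """d(loss)/dp from one forward solve and one (scalar) adjoint solve."""
    u = forward(p, u_prev, dt)
    dres_du = 1.0 - dt * p     # derivative of the residual w.r.t. the state
    dres_dp = -dt * u          # derivative of the residual w.r.t. the parameter
    dloss_du = 2.0 * (u - u_data)
    lam = -dloss_du / dres_du  # adjoint equation: dres_du * lam = -dloss_du
    return lam * dres_dp


def fit_parameter(u_prev, u_data, dt, p0=0.0, step=0.25, n_iter=200):
    """Plain gradient descent, standing in for L-BFGS-B in this sketch."""
    p = p0
    for _ in range(n_iter):
        p -= step * adjoint_gradient(p, u_prev, u_data, dt)
    return p


# Hypothetical normalized data: the state grows from 1.0 to 1.05 in one step.
p_fit = fit_parameter(u_prev=1.0, u_data=1.05, dt=1.0)
u_fit = forward(p_fit, 1.0, 1.0)
```

In the field inversion problem the state and parameters are vectors of nodal values, so the scalar division in the adjoint equation becomes a linear solve with the transposed Jacobian of the residual; the cost of the gradient still does not grow with the number of parameters.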

4 Data Preparation on Maps of Michigan and Mexico

First, we constructed two-dimensional meshes for Michigan and Mexico that fully resolve the counties/states as shown in Fig. 1. The data are available as cumulative sub-population numbers $I^\text {d}_n, R^\text {d}_n, D^\text {d}_n$ at the county/state level. Note that an individual was considered recovered if they had not died 15 days after symptom onset. We adopted this definition based on the reporting of compartmental population data in the State of Michigan. Moreover, in Michigan, recovery data were reported at the state level, not by county, so the distribution of recovered cases across counties was approximated to be the same as the distribution of cumulative infected cases across counties. In Mexico, the reported data allowed us to calculate the recovered by entity (states and Mexico City) using this definition. We used a uniform density of each sub-population to compute $I^\text {d}_n, R^\text {d}_n, D^\text {d}_n$ within the county/state, and applied Gaussian filtering to smooth the discontinuities at the county/state boundaries. Note that the discrete Gaussian filter cannot be applied in a straightforward manner to unstructured meshes. Starting with a raw field $u_\text {raw}$ that represents any of the four sub-population densities, and $G(\varvec{x}_0,\varvec{x})=\frac{1}{2\pi \sigma ^2}e^{-\frac{||\varvec{x}-\varvec{x}_0||^2}{2\sigma ^2}}$ as the two-dimensional Gaussian distribution function centered at $\varvec{x}_0$ with standard deviation $\sigma$, which is related to the kernel size in the discrete Gaussian filter, we compute the normalized filtered solution, denoted by $u({\varvec{x}}_0)$, at each finite element node:

$$\begin{aligned} u(\varvec{x}_0)=\frac{1}{{\int _\Omega G(\varvec{x}_0,\varvec{x})\text {d}v}}{\int _\Omega G(\varvec{x}_0,\varvec{x})u_\text {raw}(\varvec{x})\text {d}v} \end{aligned}$$

(24)

The spatio-temporal evolution of these fields was used in the system inference problem as described in Sect. 3.
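
A discrete version of the normalized filter in Eq. (24) can be sketched for scattered nodes as below. It assumes that the kernel depends on the distance between the evaluation node $\varvec{x}_0$ and the integration point $\varvec{x}$, and it replaces the integrals by sums with one quadrature weight per node; the function name, node layout and values are hypothetical.

```python
import math


def gaussian_filter_nodes(coords, values, areas, sigma):
    """Normalized Gaussian smoothing of a nodal field on scattered nodes.

    coords: list of (x, y) node positions; values: raw nodal field;
    areas: quadrature weight of each node (a lumped stand-in for dv).
    """
    smoothed = []
    for x0, y0 in coords:
        numerator = 0.0
        denominator = 0.0
        for (x, y), u_raw, area in zip(coords, values, areas):
            dist2 = (x - x0) ** 2 + (y - y0) ** 2
            # The 1 / (2 * pi * sigma^2) prefactor cancels in the ratio.
            kernel = math.exp(-dist2 / (2.0 * sigma ** 2)) * area
            numerator += kernel * u_raw
            denominator += kernel
        smoothed.append(numerator / denominator)
    return smoothed


# Irregularly spaced nodes with a jump in the raw field, as at a county line.
coords = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (3.0, 0.0), (4.2, 0.0)]
areas = [1.0] * len(coords)
raw_step = [0.0, 0.0, 0.0, 1.0, 1.0]

smooth_step = gaussian_filter_nodes(coords, raw_step, areas, sigma=1.0)
smooth_const = gaussian_filter_nodes(coords, [3.0] * 5, areas, sigma=1.0)
```

Dividing by the integral of the kernel is what makes the filter usable on an unstructured mesh and near the domain boundary: a constant field is returned unchanged regardless of node spacing.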

5 Results

Figure 2 shows the sub-populations $S(\varvec{x},t), I(\varvec{x},t), R(\varvec{x},t), D(\varvec{x},t)$ in both Michigan and Mexico obtained by forward simulation with inferred quantities compared with data on December 29, 2020 ($t=281$ days). In the model, $t=0$ corresponds to March 23, 2020, the start of the lockdown in Michigan, though the figures show the simulations from $t=15$ to account for the lag introduced by the definition of recovered (see Sect. 4). The inferred model for Michigan accurately replicates the initial burst of disease and the following multiple waves around Detroit (please see the SI movie: michigan_prediction.mp4). It also captures the second burst in the southwest of Michigan around the city of Grand Rapids. The high burden of the disease in these, the largest and second largest cities, respectively, in Michigan, reflects well-known socio-economic challenges that are associated with Detroit in particular, and are seen more generally in other urban centers. Similarly, Mexico City, with the highest population density in Mexico ($6,200/\text {km}^2$ [18]), was the worst affected area in that country and dominated the evolution of the disease (see SI movie: mexico_prediction.mp4).
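
Since the results are reported in terms of the day index $t$, a small helper for converting between $t$ and calendar dates can be convenient. The sketch below takes $t=0$ as March 23, 2020, as stated above; the function and variable names are hypothetical.

```python
from datetime import date, timedelta

T0 = date(2020, 3, 23)  # t = 0 in the model, as stated in the text


def day_to_date(t):
    """Calendar date corresponding to model day t."""
    return T0 + timedelta(days=t)


def date_to_day(d):
    """Model day corresponding to calendar date d."""
    return (d - T0).days


# Day indices at which the inferred parameter fields are reported.
snapshot_days = [15, 70, 140, 210, 281]
snapshot_dates = [day_to_date(t).isoformat() for t in snapshot_days]
```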

The low error between the simulation and data leads to greater confidence in the inferred parameters. Figure 3 shows the inferred infection rate, death rate, recovery rate, and the reproduction number $r_0=\frac{\beta }{\mu }$ in Michigan’s lower peninsula at days $t=15, 70, 140, 210, 281$ (the time-resolved dynamics are shown in SI movie: michigan_parameter.mp4). The evolution of these inferred parameters reveals that the population’s infection rate, $\beta (t)$, declined from the initially higher values in highly infected areas (such as Detroit), and spread to the western parts of Michigan. The death rate was mostly stable after May 2020 ($t>69$), and remained low in the more highly infected areas. This can be attributed to the ramp-up of the public health campaign, hospitalizations and emergency response of the medical system, and prioritization of the more highly infected areas. The recovery rate around Detroit evolved in multiple stages: increasing $\rightarrow$ decreasing $\rightarrow$ increasing, which was consistent with the multiple waves reflected in the data on the recovered population in this region (SI movie: michigan_prediction.mp4). Note that the large heterogeneity of the parameters arises because, in the PDE SIRD model, the parameters are scaled by division with the population densities, and $\beta$ by the square of the population density. This affects their values. In particular, it should be borne in mind that the effective reproduction number reported here is “per unit population density”. Therefore, a high $r_0$ could be reflective of a low population density. Nevertheless, the actual effective reproduction number could be low in low density regions. Such scaling underlies the high $r_0$ reported in the northwestern part of Michigan’s lower peninsula.

At the finest resolution, the mobility of the population during disease evolution may be approached via agent-based models refined to resolve individuals. However, given the difficulties encountered in effective contact tracing, and its acceptance by the population [9, 32, 33], an intriguing question to explore is whether simple reaction-diffusion models can detect the evidence of mobility in these data. Figure 4 shows the inferred diffusivities of the susceptible, infected, and recovered sub-populations. Note that for field inversion, the population density data for each compartment were taken to be uniform within each county/state, since no finer-grained information was available, and were then subjected to Gaussian smoothing before inference. Thus the density gradients, which drive the inference of diffusivities, arise at the county/state scale more than they do at the intra-county/intra-state scale. Accordingly, the inferred diffusivities are meaningful on this scale. The lower peninsula of Michigan is about 446 km long from north to south and 314 km wide from east to west; these scales can help place the diffusivities in Fig. 4 in perspective. The mobility of the infected population was always high around the highly infected areas. In Michigan, this infected population gradually shifted to the southwestern part of the state from the initial burst around Detroit. This finding is consistent with the second burst around Grand Rapids during the evolution of the pandemic. The recovered population demonstrated a similar pattern of mobility, and was more active in the southern part of Michigan around the more highly infected regions. On the other hand, the susceptible population closely tracks the total population. Since the population at large has low mobility, the susceptible population’s mobility is low in high population density areas. See SI movie michigan_prediction.mp4 for these dynamics.

Figure 5 shows the inferred infection rate, death rate and recovery rate of the inferred model for Mexico. We can clearly see the spreading of the disease from Mexico City. Similar to the case of Michigan, infection rates, and to a lesser extent, death rates, were lower in Mexico City, which is the most densely populated region of the country, than in the surrounding cities. The recovery rate was high in Mexico City, in part due to the relatively greater resources of the medical system there. The infection and death rates tended to be stable for five months following March 23, 2020, and the recovery rate gradually increased in more areas. Notably, far from Mexico City, Baja California also displayed a high inferred rate of infection. We suspect this to be because it borders California, USA, and the international border restrictions did not contain the spread of the virus between the two regions. Unlike in Mexico City, the death rate there remained high, and the recovery rate did not increase to levels comparable to those in the capital, perhaps because of the looser restrictions in this popular tourist destination. The reproduction number $r_0$ at $t=15$ (April 7, 2020) was high only around Mexico City. By $t=70$ it increased near the other two most populated cities, Guadalajara in the West, and Monterrey in the Northeast. By $t=140$ (August 10, 2020) less populated areas saw a higher infection rate and reproduction number.

The diffusivities of the corresponding sub-populations of the inferred model for Mexico are shown in Fig. 6. Similar to Michigan, the mobility of the infected and recovered sub-populations is higher around the highly infected Mexico City. Mexico is about 3000 km long from north to south and 1900 km wide from east to west; these scales can help place the diffusivities in Fig. 6 in perspective. Unlike the case in Michigan, where there were multiple bursts in different cities, the mobilities of all sub-populations became stable after about 5 months from March 23, 2020. This may reflect differences in the proclivity toward domestic/local mobility of the populations of Michigan and Mexico, two regions with strongly contrasting social, economic and cultural characteristics.

Finally, taking the inferred parameters on the last day used for inference (Day 281), we predicted the evolution of sub-populations for three weeks (Days 282 to 303) using the inferred model. Figures 7 and 8 show the predicted spatio-temporal evolution of the infected population against the raw data for both Michigan and Mexico. The inferred models closely captured the evolution of the infected sub-populations, indicating that the dynamics of the disease tended to be steady in January 2021. The predictions of the recovered and deceased sub-populations are shown in Figs. 9 and 10 under "Appendix".

6 Conclusion

This communication builds upon our previous work [11] on system inference and machine learning from data to study the progression of COVID-19 across the state of Michigan. We extended the PDE SIRD model by allowing the infection rate, death rate and the recovery rate, as well as the diffusivities of the susceptible, infected, and recovered sub-populations to vary over space and time. Using field inversion to infer the parameters as finite-dimensional fields on time scales of a single day, we obtained models to predict the evolution of disease with high accuracy. This provides us with the ability to analyze the dynamics of the disease through the inferred parameters, and make accurate predictions within a reasonable time frame. In particular, we can detect the evidence of temporally and spatially varying mobility of the population through simple diffusion-reaction models instead of relying on agent-based models, which require individuals’ mobility data. The latter can prove challenging, technically as well as politically, to obtain.

As discussed in Sect. 5, our inferred models capture the geographical spread of infection, the number of deaths and the size of the recovered population starting from one highly infected area to its surrounding cities and eventually spreading to further areas. In particular, the higher infection and death rates in areas with low infection at later times suggest that more attention is needed in such locations. This may be due to a lack of medical services, or a lack of compliance with mitigation strategies. Our inferred models also reveal higher mobility surrounding the highly infected areas, suggesting the importance of quarantine and social distancing.

The spread of COVID-19 has exhibited large variations in space and time, and the data have shown that its reproduction is very dependent upon each regional population: its population density, culture and political environment (e.g. compliance with government mandates, resources, etc.). Our model introduces seven spatio-temporal parameters that, although they can lead to overfitting, are needed to resolve variations and make accurate and specific population predictions over short times of the order of two weeks. In such settings, health policy makers can make decisions and issue mandates by relying on two-week predictions in their specific populations. This is what our model was able to achieve.

Finite-dimensional representation allows the parameters to accurately capture the spatial dependence; however, the non-parametric representation makes the projection of these parameters beyond the data range extremely challenging. A prediction cannot be made with confidence if the dynamics of the disease reflected by these parameters are not stable. Of course, extrapolation is challenging in almost all data-driven methods. One possible alternative is to develop surrogate models of these parameters via time-dependent neural networks under the constraints of the SIRD model to learn the spatial variation in time, and thus to make reasonable predictions of the dynamics in the evolution of the disease, such as we have demonstrated previously [11]. Nevertheless, without including factors such as mobility restrictions or other mandates, only short-time predictions may be accurate.

References

Kermack WO, McKendrick AG (1927) A contribution to the mathematical theory of epidemics. Proc R Soc Lond Ser A 115:700–721
Eisenberg MC, Eisenberg JNS, D’Silva JP, Wells EV, Cherng S, Kao Y-H, Meza R (2015) Forecasting and uncertainty in modeling the 2014–2015 Ebola epidemic in West Africa
Eisenberg MC, Kujbida G, Tuite AR, Fisman DN, Tien JH (2013) Examining rainfall and cholera dynamics in Haiti using statistical and dynamic modeling approaches. Epidemics 5:197–207. https://doi.org/10.1016/j.epidem.2013.09.004
Wesolowski A, Eagle N, Tatem AJ, Smith DL, Noor AM, Snow RW, Buckee CO (2012) Quantifying the impact of human mobility on malaria. Science 338:267–270. https://doi.org/10.1126/science.1223467
Colizza V, Barrat A, Barthelemy M, Valleron AJ, Vespignani A (2007) Modeling the worldwide spread of pandemic influenza: baseline case and containment interventions. PLoS Med 4:e13. https://doi.org/10.1371/journal.pmed.0040013
Hethcote HW (2000) The mathematics of infectious diseases. SIAM Rev 42(4):599–653
Hunter E, Mac Namee B, Kelleher J (2017) A taxonomy for agent-based models in human infectious disease epidemiology. J Artif Soc Soc Simul 20(3):2. https://doi.org/10.18564/jasss.3414. URL http://jasss.soc.surrey.ac.uk/20/3/2.html
Viguerie A, Veneziani A, Lorenzo G, Baroli D, Aretz-Nellesen N, Patton A, Yankeelov TE, Reali A, Hughes TJR, Auricchio F (2020) Diffusion-reaction compartmental models formulated in a continuum mechanics framework: application to COVID-19, mathematical analysis, and numerical study. Comput Mech 66:1131–1152. https://doi.org/10.1007/s00466-020-01888-0
Zohdi TI (2020) An agent-based computational framework for simulation of global pandemic and social response on planet x. Comput Mech 66:1195–1209. https://doi.org/10.1007/s00466-020-01886-2
Chang L, Duan M, Sun G, Jin Z (2020) Cross-diffusion-induced patterns in an SIR epidemic model on complex networks. Chaos Interdiscip J Nonlinear Sci 30(1):013147. https://doi.org/10.1063/1.5135069
Wang Z, Zhang X, Teichert GH, Carrasco-Teja M, Garikipati K (2020) System inference for the spatio-temporal evolution of infectious diseases: Michigan in the time of COVID-19. Comput Mech 66(5):1153–1176
1Point3Acres.com. URL https://coronavirus.1point3acres.com/en
Yang T, Shen K, He S, Li E, Sun P, Chen P, Zuo L, Hu J, Mo Y, Zhang W, Zhang H, Chen J, Guo Y (2020) Covidnet: to bring data transparency in the era of covid-19
Johns Hopkins University of Medicine. COVID-19 dashboard by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU). URL https://coronavirus.jhu.edu/map.html
Michigan state coronavirus data. URL https://www.michigan.gov/coronavirus/
The New York Times. Coronavirus in the U.S.: latest map and case count—the New York Times. URL https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
The Institute for Health Metrics and Evaluation. COVID-19 Projections. URL https://covid19.healthdata.org/united-states-of-america
Inegi: Censo de población y vivienda. URL https://www.inegi.org.mx/programas/ccpv/2020/#Tabulados
Conacyt: Covid-19 méxico. URL https://datos.covid-19.conacyt.mx/#DownZCSV
Wang Z, Huan X, Garikipati K (2019) Variational system identification of the partial differential equations governing the physics of pattern-formation: inference under varying fidelity and noise. Comput Methods Appl Mech Eng 356:44–74. ISSN 0045-7825. https://doi.org/10.1016/j.cma.2019.07.007
Wang Z, Wu B, Garikipati K, Huan X (2020) A perspective on regression and Bayesian approaches for system identification of pattern formation dynamics. Theor Appl Mech Lett 10(3):188–194
Wang Z, Huan X, Garikipati K (2021) Variational system identification of the partial differential equations governing microstructure evolution in materials: inference over sparse and spatially unrelated data. Comput Methods Appl Mech Eng 377:113706. ISSN 0045-7825. https://doi.org/10.1016/j.cma.2021.113706. URL https://www.sciencedirect.com/science/article/pii/S0045782521000426
Wang Z, Martin B, Weickenmeier J, Garikipati K (2021) An inverse modelling study on the local volume changes during early morphoelastic growth of the fetal human brain. Brain Multiphys 2:100023. ISSN 2666-5220. https://doi.org/10.1016/j.brain.2021.100023. URL https://www.sciencedirect.com/science/article/pii/S2666522021000034
Wang Z, Estrada JB, Arruda EM, Garikipati K (2020) Discovery of deformation mechanisms and constitutive response of soft material surrogates of biological tissue by full-field characterization and data-driven variational system identification. J Mech Phys Solids. https://doi.org/10.1101/2020.10.13.337964. bioRxiv
Teichert GH, Garikipati K (2019) Machine learning materials physics: surrogate optimization and multi-fidelity algorithms predict precipitate morphology in an alternative to phase field dynamics. Comput Methods Appl Mech Eng 344:666–693
Teichert GH, Natarajan AR, Van der Ven A, Garikipati K (2019) Machine learning materials physics: integrable deep neural networks enable scale bridging by learning free energy functions. Comput Methods Appl Mech Eng 353:201–216. ISSN 0045-7825. https://doi.org/10.1016/j.cma.2019.05.019. URL http://www.sciencedirect.com/science/article/pii/S0045782519302889
Teichert GH, Natarajan AR, Van der Ven A, Garikipati K (2020) Scale bridging materials physics: active learning workflows and integrable deep neural networks for free energy function representations in alloys. Comput Methods Appl Mech Eng 371:113281
Zhang X, Garikipati K (2020) Machine learning materials physics: multi-resolution neural networks learn the free energy and nonlinear elastic response of evolving microstructures. Comput Methods Appl Mech Eng 372:113362. https://doi.org/10.1016/j.cma.2020.113362
Zhang X, Garikipati K (2021) Bayesian neural networks for weak solution of PDEs with uncertainty quantification
Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, van der Walt SJ, Brett M, Wilson J, Millman KJ, Mayorov N, Nelson ARJ, Jones E, Kern R, Larson E, Carey CJ, Polat İ, Feng Y, Moore EW, VanderPlas J, Laxalde D, Perktold J, Cimrman R, Henriksen I, Quintero EA, Harris CR, Archibald AM, Ribeiro AH, Pedregosa F, van Mulbregt P, SciPy 1.0 Contributors (2020) SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 17:261–272. https://doi.org/10.1038/s41592-019-0686-2
Mitusch SK, Funke SW, Dokken JS (2019) dolfin-adjoint 2018.1: automated adjoints for FEniCS and Firedrake. J Open Source Softw 4(38):1292. https://doi.org/10.21105/joss.01292
Chan EY, Saqib NU (2021) Privacy concerns can explain unwillingness to download and use contact tracing apps when COVID-19 concerns are high. Comput Human Behav 119:106718. ISSN 0747-5632. https://doi.org/10.1016/j.chb.2021.106718. URL https://www.sciencedirect.com/science/article/pii/S0747563221000406
Kretzschmar ME, Rozhnova G, Bootsma MCJ, van Boven M, van de Wijgert JHHM, Bonten MJM (2020) Impact of delays on effectiveness of contact tracing strategies for COVID-19: a modelling study. Lancet Public Health 5(8):e452–e459. ISSN 2468-2667. https://doi.org/10.1016/S2468-2667(20)30157-2. URL https://www.sciencedirect.com/science/article/pii/S2468266720301572

Acknowledgements

We acknowledge the support of the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR0011199002, “Artificial Intelligence guided multi-scale multi-physics framework for discovering complex emergent materials phenomena”.

Author information

Authors and Affiliations

Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA
Zhenlin Wang, Xiaoxuan Zhang, Gregory H. Teichert & Krishna Garikipati
Mathematics, University of Michigan, Ann Arbor, MI, USA
Mariana Carrasco-Teja & Krishna Garikipati
Michigan Institute for Computational Discovery and Engineering, University of Michigan, Ann Arbor, MI, USA
Krishna Garikipati

Authors

Zhenlin Wang
Mariana Carrasco-Teja
Xiaoxuan Zhang
Gregory H. Teichert
Krishna Garikipati

Corresponding author

Correspondence to Krishna Garikipati.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 3737 KB)

Supplementary material 2 (mp4 2311 KB)

Supplementary material 3 (mp4 7553 KB)

Supplementary material 4 (mp4 2881 KB)

A Appendix: Additional Results

See Figs. 9 and 10.

About this article

Cite this article

Wang, Z., Carrasco-Teja, M., Zhang, X. et al. System Inference Via Field Inversion for the Spatio-Temporal Progression of Infectious Diseases: Studies of COVID-19 in Michigan and Mexico. Arch Computat Methods Eng 28, 4283–4295 (2021). https://doi.org/10.1007/s11831-021-09643-1

Received: 29 April 2021
Accepted: 24 August 2021
Published: 01 October 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11831-021-09643-1
