1 Introduction

Classical mathematical treatments of epidemiology, such as the Susceptible-Infected-Recovered (SIRD) model [1], are ordinary differential equations (ODEs) defined by specifying the compartmental sub-population numbers over some geographical region. Spatial effects have typically been introduced by resolving smaller regions and treating them individually [2,3,4,5,6]. During long-lasting and widespread epidemics, such as the COVID-19 Pandemic, the effects on the infection rate of imposing–and then lifting–mobility restrictions and social distancing mandates revolve on the question of the time and spatially varying mobility of the population. At the finest resolution, this must be approached via agent-based models [7], using individuals’ mobility data. However, this data is not available for the entire population, and contact tracing campaigns face challenges of recruiting workers, access, technology, as well as socio-political resistance. Against these difficulties, an intriguing question to explore is whether simple reaction-diffusion models can detect the evidence of mobility in these data. Such an approach must start with a partial differential equation (PDE) version of the epidemiological models, which is easily defined by converting compartmental sub-populations to densities over sub regions by normalizing with the corresponding areas. To address the mobility of the population, diffusion terms are introduced to the SIRD model, which is transformed to a set of reaction-diffusion PDEs in two spatial dimensions [8,9,10,11].

The widespread availability of data in the public domain [12,13,14,15,16,17,18,19] has spurred significant interest among computational and data scientists, who have sought to test and refine their methods against these repositories. This has opened up the possibility that advances in computational and data science may contribute to the existing and rapidly expanding body of work in epidemiology, in inferring the dynamics of COVID-19 and making projections. We have similarly sought to build off our recent work in data-driven and machine learning approaches [20,21,22,23,24,25,26,27,28,29] and presented a class of system identification techniques for inference of ODE and PDE forms of the SIRD model, as well as Bayesian neural networks for representation and uncertainty quantification-guided prediction [11]. That work focused on the US state of Michigan. In this communication, we revise our approach for inference of the PDE SIRD model with temporally and spatially evolving parameters and diffusivities. Importantly, instead of global polynomial representations of PDE SIRD parameters over the spatial and temporal domains, we adopt field inversion over time intervals that coincide with the time steps of our underlying numerical implementation. This affords much greater accuracy over the global polynomial ansatz. Adjoint-based gradient optimization for field inversion of parameters at each time step replaces the use of stepwise regression-based system identification in our previous work. We find that the improved accuracy with respect to the data over the time interval of inference, as well as of the predictions, is worth the increased expense. We have brought abundant high-quality, public domain, data [12,13,14,15,16,17,18,19] on the evolution of COVID-19 in both, the State of Michigan, with a population of 9.98 million, distributed in 83 counties, over 250,493 km\(^2\), and the country of Mexico, with a population of 126 million, distributed in 32 geographical entities (31 states and Mexico City), on 1,972,550 km\(^2\). The temporal resolution by days and spatial resolution by counties/states have allowed us to study the mobility in these data using our methods of system inference.

In Sect. 2 we review our previous work of system inference for the spatio-temporal SIRD model first, and then extend it by incorporating temporal and spatial parameters and diffusivities using a finite element representation. Our PDE SIRD model-constrained inference approach is presented in Sect. 3. Section 4 is on data preparation. The results for inference of classical SIRD parameters as well as the diffusivities, and forward prediction are presented in Sect. 5. Our conclusions appear in Sect. 6.

2 Compartmental Differential Equations Models of Infectious Disease Dynamics

We begin with the conventional SIRD compartmental epidemiology model. The population, taken to remain constant at N, is divided into four disjoint compartments with time-dependent sub-populations: S(t) for susceptible, I(t) for infected, R(t) for recovered and D(t) for deceased individuals. The governing system of ordinary differential equations (ODEs) is:

$$\begin{aligned} \frac{\text {d} S}{\text {d} t}&=-\frac{\beta }{N} SI+\gamma R \end{aligned}$$
(1)
$$\begin{aligned} \frac{\text {d}I}{\text {d}t}&=\frac{\beta }{N}SI-\mu I-\alpha I\end{aligned}$$
(2)
$$\begin{aligned} \frac{\text {d}R}{\text {d}t}&=\mu I-\gamma R\end{aligned}$$
(3)
$$\begin{aligned} \frac{\text {d}D}{\text {d}t}&=\alpha I\end{aligned}$$
(4)
$$\begin{aligned} N&= S(t) + I(t) + R(t) + D(t). \end{aligned}$$
(5)

This is the canonical form of the model where the sub-populations are assumed to be well-mixed so that spatial variations can be ignored over the domain of interest. Here \(\beta\) is the infection rate, \(\mu\) is the recovery rate, \(\gamma\) is the rate of immunity loss, and \(\alpha\) is the death rate.

We have extended the SIRD model to a system of partial differential equations (PDEs) in two spatial dimensions using the same compartments [11]. However, the population variables are now replaced with spatio-temporally varying densities, \({\widehat{S}}({\varvec{x}},t),{\widehat{I}}({\varvec{x}},t),{\widehat{R}}({\varvec{x}},t),{\widehat{D}}({\varvec{x}},t)\) defined as numbers per unit area.

$$\begin{aligned} \frac{\partial {\widehat{S}}}{\partial t}&={\mathcal {D}}_\text {S}\nabla ^2 {\widehat{S}}-\frac{\beta }{{\widehat{N}}} {\widehat{S}}{\widehat{I}}+\gamma {\widehat{R}} \end{aligned}$$
(6)
$$\begin{aligned} \frac{\partial {\widehat{I}}}{\partial t}&={\mathcal {D}}_\text {I}\nabla ^2 {\widehat{I}}+\frac{\beta }{{\widehat{N}}}{\widehat{S}}{\widehat{I}}-\mu {\widehat{I}}-\alpha {\widehat{I}} \end{aligned}$$
(7)
$$\begin{aligned} \frac{\partial {\widehat{R}}}{\partial t}&={\mathcal {D}}_\text {R}\nabla ^2 {\widehat{R}}+\mu {\widehat{I}}-\gamma {\widehat{R}} \end{aligned}$$
(8)
$$\begin{aligned} \frac{\partial {\widehat{D}}}{\partial t}&=\alpha {\widehat{I}} \end{aligned}$$
(9)

where \({\mathcal {D}}_\text {S}, {\mathcal {D}}_\text {I}, {\mathcal {D}}_\text {R}\) are diffusivities of the corresponding compartments, and represent the mobility of the sub-population via random walks. We define \(\widehat{(\bullet )}=(\bullet )/\int _{\Omega }\text {d}A\) where \(\Omega\) is the domain of study: either the lower peninsula of the State of Michigan, or the territory of the country of Mexico. Furthermore the population constraint holds: \(\int _{\Omega }{\widehat{N}}\text {d}A = \int _{\Omega }{\widehat{S}}(t)\text {d}A + \int _{\Omega }{\widehat{I}}(t)\text {d}A + \int _{\Omega }{\widehat{R}}(t)\text {d}A + \int _{\Omega }{\widehat{D}}(t)\text {d}A\). In what follows of this communication, we only consider the PDE SIRD model, and, for the sake of readability, we dispense with the hats on the compartments.

We adopt the weak form, and specifically, the finite element framework for the above system of PDEs. For a generic, finite-dimensional field \(u^h\), the problem is stated as follows: find \(u^h\in {\mathscr {S}}^h \subset {\mathscr {S}}\), where \({\mathscr {S}}^h= \{ u^h \in {\mathscr {H}}^1(\Omega ) ~\vert ~u^h = ~{\bar{u}}\; \mathrm {on}\; \Gamma ^u\}\), such that \(\forall ~w^h \in {\mathscr {V}}^h \subset {\mathscr {V}}\), where \({\mathscr {V}}^h= \{ w^h \in {\mathscr {H}}^1(\Omega )~\vert ~w^h = ~0 \;\mathrm {on}\; \Gamma ^u\}\), the finite-dimensional (Galerkin) weak form of the problem is satisfied. The variations \(w^h\) and trial solutions \(u^h\) are defined component-wise using a finite number of basis functions,

$$\begin{aligned} w^h = \sum _{a=1}^{n_\mathrm {b}} c^a N^a, \quad \qquad u^h = \sum _{a=1}^{n_\mathrm {b}} d^a N^a, \end{aligned}$$
(10)

where \(n_\mathrm {b}\) is the dimensionality of the function spaces \({\mathscr {S}}^h\) and \({\mathscr {V}}^h\), and \(N^a\) represents the basis functions. To obtain the Galerkin weak forms, we multiply each equation in strong form in (6-9) by a weighting function \(w_\text {S}^h,w_\text {I}^h,w_\text {R}^h,w_\text {D}^h\), respectively, integrate by parts, apply boundary conditions appropriately, and use the Backward Euler method for time-discretization with \((\bullet )_n\) denoting a discretized quantity at time \(t_n\) and \(\Delta t\) being the time step. See Ref. [11] for details. This leads to:

$$\begin{aligned} \int _{\Omega }w^h_\text {S} \frac{S^h_{n} - S^h_{n-1}}{\Delta t} \text {d}s =&-\int _{\Omega } {\mathcal {D}}_\text {S}\nabla w^h_\text {S}\cdot \nabla S^h_n\text {d}s\nonumber \\&-\int _{\Omega }w^h_\text {S}\left( \frac{\beta }{N} S^h_nI^h_n+\gamma R^h_n \right) \text {d}s \end{aligned}$$
(11)
$$\begin{aligned} \int _{\Omega }w^h_\text {I}\frac{I^h_n - I^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}_\text {I}\nabla w^h_\text {I}\cdot \nabla I^h_n\text {d}s\nonumber \\&+\int _{\Omega }w^h_\text {I}\left( \frac{\beta }{N}S^h_nI^h_n-\mu I^h_n-\alpha I^h_n \right) \text {d}s \end{aligned}$$
(12)
$$\begin{aligned} \int _{\Omega }w^h_\text {R}\frac{R^h_n - R^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}_\text {R}\nabla w^h_\text {R}\cdot \nabla R^h_n\text {d}s \nonumber \\&+\int _{\Omega }w^h_\text {R}\left( \mu I^h_n-\gamma R^h_n\right) \text {d}s \end{aligned}$$
(13)
$$\begin{aligned} \int _{\Omega }w^h_\text {D} \frac{D_n - D_{n-1}}{\Delta t}\text {d}s =&\int _{\Omega }w^h_\text {D}\alpha I^h_n\text {d}s \end{aligned}$$
(14)

where the boundary terms vanish because we assume that the sub-populations do not leave the region, thus enforcing zero flux boundary conditions.

In our previous work [11], we have characterized the coefficients to vary via a global-in-time polynomial basis. While the inferred model reproduced the trends, there was a notable error over time of the statewide sub-population estimates \(S(t,\varvec{x}), I(t,\varvec{x}), R(t,\varvec{x}), D(t,\varvec{x})\) obtained by forward simulation with inferred quantities (See Figs. 14 and 15 in [11]).

Fig. 1
figure 1

Reconstructed maps of Michigan State and Mexico with finite element mesh

Additionally, the highly complex geometry of the State of Michigan, and of Mexico (See maps in Fig. 1), and potentially highly nonuniform distributions of the coefficients in space makes it challenging to characterize them with simple basis functions. Global polynomials in space could not sufficiently resolve the emergence and disappearance of “hot spots" and “cold spots" [11]. In this communication, we allow the coefficients \(\beta ,\gamma ,\mu ,\alpha , {\mathcal {D}}_\text {S},\dots ,{\mathcal {D}}_\text {R}\) of the PDE SIRD model to vary over space via finite-dimensional, locally supported representations as we do for the primary variables \(S(t,\varvec{x}), I(t,\varvec{x}), R(t,\varvec{x}), D(t,\varvec{x})\). Further more, we allow the coefficients to vary daily, leading to:

$$\begin{aligned} \beta ^h_n =&\sum _{a=1}^{n_p} \beta ^a_n N^a, \qquad \gamma ^h_n = \sum _{a=1}^{n_p} \gamma ^a_n N^a, \nonumber \\ \mu ^h_n =&\sum _{a=1}^{n_p} \mu ^a_n N^a \qquad \alpha ^h_n = \sum _{a=1}^{n_p} \alpha ^a_n N^a \end{aligned}$$
(15)
$$\begin{aligned} {\mathcal {D}}_{\text {S}_n}^h =&\sum _{a=1}^{n_p} {\mathcal {D}}_{\text {S}_n}^a N^a,\qquad {\mathcal {D}}_{\text {I}_n}^h = \sum _{a=1}^{n_p} {\mathcal {D}}_{\text {I}_n}^a N^a,\nonumber \\ {\mathcal {D}}_{\text {R}_n}^h =&\sum _{a=1}^{n_p} {\mathcal {D}}_{\text {R}_n}^a N^a \end{aligned}$$
(16)

where, as for the primary variables, the subscripts \((\bullet )n\) denote the coefficients on day n. With this, the PDE SIRD equations become:

$$\begin{aligned} \int _{\Omega }w^h_\text {S} \frac{S^h_{n} - S^h_{n-1}}{\Delta t} \text {d}s =&-\int _{\Omega } {\mathcal {D}}^h_{\text {S}_n}\nabla w^h_\text {S}\cdot \nabla S^h_n\text {d}s \nonumber \\&-\int _{\Omega }w^h_\text {S}\left( \frac{\beta ^h_n}{N} S^h_nI^h_n +\gamma ^h_n R^h_n \right) \text {d}s \end{aligned}$$
(17)
$$\begin{aligned} \int _{\Omega }w^h_\text {I}\frac{I^h_n - I^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}^h_{\text {I}_n}\nabla w^h_\text {I}\cdot \nabla I^h_n\text {d}s\nonumber \\&+\int _{\Omega }w^h_\text {I}\left( \frac{\beta ^h_n}{N}S^h_nI^h_n -\mu ^h_n I^h_n-\alpha ^h_n I^h_n \right) \text {d}s \end{aligned}$$
(18)
$$\begin{aligned} \int _{\Omega }w^h_\text {R}\frac{R^h_n - R^h_{n-1}}{\Delta t}\text {d}s =&-\int _{\Omega }{\mathcal {D}}^h_{\text {R}_n}\nabla w^h_\text {R}\cdot \nabla R^h_n\text {d}s \nonumber \\&+\int _{\Omega }w^h_\text {R}\left( \mu ^h_n I^h_n-\gamma ^h_n R^h_n\right) \text {d}s \end{aligned}$$
(19)
$$\begin{aligned} \int _{\Omega }w^h_\text {D} \frac{D_n - D_{n-1}}{\Delta t}\text {d}s&=\int _{\Omega }w^h_\text {D}\alpha ^h_n I^h_n\text {d}s \end{aligned}$$
(20)

where the parameters are interpolated from nodal variables as defined in Eqs. (15) and (16).

3 System Inference by Field Inversion Using Adjoint-Based Gradient Optimization

The system inference problem is to invert for the quantities \(\beta ^a_n,\gamma ^a_n,\mu ^a_n,\alpha ^a_n, {\mathcal {D}}^a_{\text {S}_n},\dots ,{\mathcal {D}}^a_{\text {R}_n}\). Since these quantities are interpolated via Eqs. (15) and (16) to be expressed as the corresponding fields \(\beta ^h_n,\gamma ^h_n,\mu ^h_n,\alpha ^h_n, {\mathcal {D}}^h_{\text {S}_n},\dots ,{\mathcal {D}}^h_{\text {R}_n}\) in (17-18), the system inference problems is one of field inversion. It is stated in Eqs. (21-22) as:

Given (1516), at each

$$\begin{aligned}& t_n:\quad \left( \beta ^a_n,\dots ,{\mathcal {D}}^a_{\text {R}_n}\right) _{a=1}^{n_p}\nonumber \\&\quad = \text {arg }\underset{(\beta ^a_n,\dots ,{\mathcal {D}}^a_{\text {R}_n}){a=1}^{n_p} }{\min }\text { } \ell _{i},\quad \text {such that} (17-20) \mathrm{hold} \end{aligned}$$
(21)

and \(\ell _{i}\) is the loss function defined:

$$\begin{aligned} \ell _{i}=&\int _\Omega W_\text {S}\left( S^h_n-S_n^\text {d}\right) ^2+W_\text {I} \left( I^h_n-I_i^\text {d}\right) ^2+W_\text {R}\left( R^h_n-R_n^\text {d}\right) ^2\nonumber \\&+W_\text {D}\left( D^h_n-D_n^\text {d}\right) ^2\text {d}v \end{aligned}$$
(22)

where \((\bullet )^\text {d}\) denotes data for the corresponding quantity. Due to the large differences in the magnitudes of different sub-populations, we choose the weights \(W_\text {S},\cdots , W_\text {D}\) to be:

$$\begin{aligned} W_\text {S}=&\frac{I_n^\text {d}}{\text {mean}\left( S_n^\text {d}\right) },\quad W_\text {I}=\frac{I_n^\text {d}}{\text {mean}\left( I_n^\text {d}\right) },\nonumber \\ W_\text {R}=&\frac{I_n^\text {d}}{\text {mean}\left( R_n^\text {d}\right) }, W_\text {D}=\frac{I_n^\text {d}}{\text {mean}\left( D_n^\text {d}\right) }. \end{aligned}$$
(23)

The weights normalize the sub-populations and prioritize regions with higher infected populations. These regions are of greater interest for studying the progression of the disease as they tend to have a higher population density and, therefore, infected populations.

This PDE-constrained optimization problem is solved iteratively, and requires the gradient of the PDE constraint, Eqs. (17-20), with respect to parameters. We adopt classical adjoint-based gradient optimization. This approach involves a single linear solution of the adjoint equation of the original PDE constraint at each iteration, followed by solution of the fields to be inverted: \((\beta ^h_n,\gamma ^h_n,\mu ^h_n,\alpha ^h_n, {\mathcal {D}}^h_{\text {S}_n},\dots ,{\mathcal {D}}^h_{\text {R}_n})\) and the updated forward solution \(S^h_n, I^h_n, R^h_n, D^h_n\). In this work we use the L-BFGS-B optimization algorithm from SciPy [30] and the dolfin-adjoint software library [31] to compute the gradient.

4 Data Preparation on Maps of Michigan and Mexico

First, we constructed two-dimensional meshes for Michigan and Mexico that fully resolve the counties/states as shown in Fig. 1. The data are available as cumulative sub-population numbers \(I^\text {d}_n, R^\text {d}_n, D^\text {d}_n\) at the county/state level. Note that an individual was considered recovered if they did not die 15 days after their symptoms onset. We adopted this definition based on reporting of compartmental population data in the State of Michigan. Moreover, in Michigan, recovery data was reported at the State level, not by county, so the distribution of recovered cases across counties was approximated to be the same as the distribution of cumulative infected cases across counties. In Mexico, the data reported allowed us to calculate the recovered by entity–states and Mexico City–using this definition. We used a uniform density of each sub-population to compute \(I^\text {d}_n, R^\text {d}_n, D^\text {d}_n\) within the county/state, and applied Gaussian filtering to smooth the discontinuities at the county/state boundaries. Note that the discrete Gaussian filter can not be applied in a straightforward manner to unstructured meshes. Starting with a field u that represents any of the four sub-population densities, and \(G(\varvec{x}_0,\varvec{x})=\frac{1}{2\pi \sigma ^2}e^{-\frac{||\varvec{x}||^2}{2\sigma ^2}}\) as the two dimensional Gaussian distribution function centered at 0 with standard deviation \(\sigma\), which is related to the kernel size in the discrete Gaussian filter, we scale the filtered solution denoted by \(u({\varvec{x}}_0)\) at each finite element node:

$$\begin{aligned} u(\varvec{x}_0)=\frac{1}{{\int _\Omega G(\varvec{x}_0,\varvec{x})\text {d}v}}{\int _\Omega G(\varvec{x}_0,\varvec{x})u_\text {raw}(\varvec{x})\text {d}v} \end{aligned}$$
(24)

The spatio-temporal evolution of these fields was used in the system inference problem as described in Sect. 3.

5 Results

Figure 2 shows the sub-populations \(S(\varvec{x},t), I(\varvec{x},t), R(\varvec{x},t), D(\varvec{x},t)\) in both Michigan and Mexico obtained by forward simulation with inferred quantities compared with data on December 29, 2020 (\(t=281\) days). In the model \(t=0\) corresponds to March 23, 2020, the start of the lockdown in Michigan, though the figures show the simulations from \(t=15\) to account for the lag introduced by the definition of recovered (See Sect. 4). The inferred model for Michigan accurately replicates the initial burst of disease and the following multiple waves around Detroit (please see the SI movie: michigan_prediction.mp4). It also captures the second burst in the southwest of Michigan around the city of Grand Rapids. The high burden of the disease in these, the largest and second largest cities, respectively, in Michigan, reflects well-known socio-economic challenges related to Detroit in particular, and more generally reflected in other urban centers. Similarly, Mexico City, with highest population density in Mexico (\(6,200/\text {km}^2\) [18]), was the worst affected area in that country and dominated the evolution of the disease (See SI movie: mexico_prediction.mp4).

Fig. 2
figure 2

Comparison of the data on distributions of the susceptible (S), infected (I), recovered (R) and deceased (D) sub-populations against forward PDE SIRD simulations with inferred quantities respectively. Data and simulation results are shown for Day 281 starting from March 23, 2020. The inset adjacent to the map of Mexico is a magnified view of the region surrounding Mexico City. The populations are shown per unit square kilometers. The time-resolved dynamics may be seen as movies in Supplementary Information: michigan_prediction.mp4 and mexico_prediction.mp4

The low error between the simulation and data leads to greater confidence in the inferred parameters. Fig. 3 shows the inferred infection rate, death rate, the recovery rate, and the reproduction number \(r_0=\frac{\beta }{\mu }\) in Michigan’s lower peninsula at days \(t=15, 70, 140, 210, 281\) (the time-resolved dynamics are shown in SI movie: michigan_parameter.mp4). The evolution of these inferred parameters reveals that the population’s infection rate, \(\beta (t)\), declined from the initially higher values in highly infected areas (such as Detroit), and spread to the western parts of Michigan. The death rate was mostly stable after May 2020 (\(t>69\)), and remained low in the more highly infected areas. This can be attributed to the ramp up of the public health campaign, hospitalizations and emergency response of the medical system, and prioritization to the more highly infected areas. The recovery rate around Detroit city evolved in multiple stages: increasing\(\rightarrow\) decreasing \(\rightarrow\) increasing, which was consistent with the multiple waves reflected in the data on the recovered population in this region (SI movie: michigan_prediction.mp4). Note that the large heterogeneity of the parameters is because in the PDE SIRD model, the parameters are scaled by division with the population densities, and \(\beta\) by the square of the population density. This affects their values. In particular, it should be borne in mind that the effective reproduction number reported here is “per unit population density”. Therefore, a high \(r_0\) could be reflective of a low population density. Nevertheless, the actual effective reproduction number could be low in low density regions. Such scaling underlies the high \(r_0\) reported in the northwestern part of Michigan’s lower peninsula.

Fig. 3
figure 3

Inferred infection rate (\(\beta\)), death rate(\(\alpha\)), recovery rate (\(\mu\)) and effective reproduction rate (\(r_0\)) over the lower peninsula of Michigan starting from the lockdown on March 23, 2020. The time-resolved dynamics may be seen as movies in Supplementary Information: michigan_parameter.mp4

At the finest resolution, the mobility of the population during disease evolution may be approached via agent-based models refined to resolve individuals. However, given the difficulties encountered in effective contact tracing, and its acceptance by the population [9, 32, 33], an intriguing question to explore is whether simple reaction-diffusion models can detect the evidence of mobility in these data. Figure 4 shows the inferred diffusivities of the susceptible, infected, and recovered sub-populations. Note that for field inversion, the population density data for each compartment was taken to be uniform within each county/state, since no finer grained information was available, and then subject to Gaussian smoothing before inference. Thus the density gradients, which drive the inference of diffusivities, arise at the counties/states scale more than they do at the intra-county/intra-state. Accordingly, the inferred diffusivities are meaningful on this scale. The lower Peninsula of Michigan is about 446 km long from north to south and 314 km wide from east to west–scales that can help place the diffusivities in Fig. 4 in perspective. The mobility of the infected population was always high around the highly infected areas. In Michigan, this infected population gradually shifted to the southwestern part of the state from the initial burst around Detroit. This finding is consistent with the second burst around Grand Rapids during the evolution of the pandemic. The recovered population demonstrated a similar pattern of mobility, and was more active in the southern part of Michigan around the more highly infected regions. On the other hand, the susceptible population closely tracks the total population. Since the population at large has low mobility, the susceptible population’s mobility is low in high population density areas. See SI movie michigan_prediction.mp4 for these dynamics.

Fig. 4
figure 4

Inferred diffusivities of the susceptible, infected, and recovered sub-populations in Michigan starting from lockdown, March 23, 2020. The lower Peninsula of Michigan is about 446 km long from north to south and 314 km wide from east to west–scales that can help place the diffusivities in perspective. Time-resolved dynamics maybe seen as movies in Supplementary Information: mexico_parameter.mp4

Figure 5 shows the inferred infection rate, death rate and the recovery rate of the inferred model for Mexico. We can clearly see the spreading of the disease from Mexico City. Similar to the case of Michigan, infection rates, and to a lesser extent, death rates, were relatively lower in Mexico City, which is the most densely populated region of the country, than that in the surrounding cities. The recovery rate was high in Mexico city, in part due to the relatively greater resources of the medical system there. The infection and death rates tended to be stable for five months following March 23, 2020, and the recovery rate gradually increased in more areas. Notably, far from the Mexico city, Baja California also displayed a high inferred rate of infection. We suspect this to be because it borders California, USA, and the international border restrictions did not contain the spread of the virus between the two regions. Unlike Mexico City, the death rate remained high, and the recovery rate did not increase to levels comparable to the capital, perhaps because of the looser restrictions in this popular tourist destination. The reproduction number \(r_0\) at \(t=15\) (April 7, 2020) was high only around Mexico City. By \(t=70\) it increased near the other two most populated cities, Guadalajara in the West, and Monterrey in the Northeast. By \(t=140\) (August 20, 2020) less populated areas saw a higher infection rate and reproduction number.

Fig. 5
figure 5

Inferred infection rate (\(\beta\)), death rate(\(\alpha\)), recovery rate (\(\mu\)), effective reproduction rate (\(r_0\)) in Mexico starting from lockdown, March 23, 2020. Time-resolved dynamics may be seen as movies in Supplementary Information: mexico_parameter.mp4

The diffusivities of the corresponding sub-populations of the inferred model for Mexico are shown in Fig. 6. Similar to Michigan, the mobility of infected and recovered sub-populations are higher around the highly infected Mexico City. Mexico is about 3000 km long from north to south and 1900 km wide from east to west–scales that can help place the diffusivities in Fig. 6 in perspective. Unlike the case in Michigan where there were multiple bursts in different cities, the mobilities of all sub-populations became stable after about 5 months from March 23, 2020. This may reflect differences in the proclivity toward domestic/local mobility of the populations of Michigan and Mexico–two regions with strongly contrasting social, economic and cultural characteristics.

Fig. 6
figure 6

Inferred diffusivities of the susceptible, infected, and recovered sub-populations at day 281 at Mexico starting from lockdown, March 23, 2020. Mexico is about 3000 km long from north to south and 1900 km wide from east to west–scales that can help place the diffusivities in perspective. The time-resolved dynamics may be seen as movies in Supplementary Information: mexico_parameter.mp4

Finally, taking the inferred parameters on the last day used for inference (Day 281), we predicted the evolution of sub-populations for three weeks (Days 282 to 303) using the inferred model. Figs. 7 and 8 show the predicted spatio-temporal evolution of the infected population against the raw data for both Michigan and Mexico. The inferred models captured closely the evolution of the infected-sub-populations, indicating that the dynamics of the disease tended to be steady in January 2021. The prediction of recovered and deceased sub-populations are shown in Figs. 9 and 10 under "Appendix".

Fig. 7
figure 7

Prediction of infected population for 3 weeks against the data for Michigan

Fig. 8
figure 8

Prediction of infected population for 3 weeks against the data for Mexico. The inset adjacent to the map of Mexico is a magnified view of the region around Mexico City

6 Conclusion

This communication builds upon our previous work [11] on system inference and machine learning from data to study the progression of COVID-19 across the state of Michigan. We extended the PDE SIRD model by allowing the infection rate, death rate and the recovery rate, as well as the diffusivities of the susceptible, infected, and recovered sub-populations to vary over space and time. Using field inversion to infer the parameters as finite-dimensional fields on time scales of a single day, we obtained models to predict the evolution of disease with high accuracy. This provides us with the ability to analyze the dynamics of the disease through the inferred parameters, and make accurate predictions within a reasonable time frame. Particularly, we can detect the evidence of time and spatially varying mobility of the population through the simple diffusion-reaction models instead of the relying on the agent-based models which require individual’s mobility data. The latter can prove challenging, technically as well as politically, to obtain.

As discussed in Sect. 5, our inferred models capture the geographical spread of infection, the number of deaths and the size of the recovered population starting from one highly infected area to its surrounding cities and eventually spreading to further areas. Particularly, the higher infection and death rates in areas with low infection at later times suggests that more attention is needed in such locations. This may be due to a lack of medical services, or a lack of compliance with mitigation strategies. Our inferred models also reveal higher mobility surrounding the highly infected areas suggesting the importance of quarantine and social distancing.

The spread of COVID-19 has exhibited large variations in space and time, and the data has shown that its reproduction is very dependent upon each regional population: its population densities, culture and political environments (e.g. compliance with government mandates, resources, etc.) Our model introduces seven spatio-temporal parameters that, although they can lead to overfitting, are needed to resolve variations and make accurate and specific population predictions over short times of the order of two weeks. In such settings, health policy makers can make decisions and issue mandates by relying on two week predictions in their specific populations. This is what our model was able to achieve.

Finite-dimensional representation allows the parameters to accurately capture the spatial dependence, however the non-parametric representation makes the projection of these parameters beyond the data range extremely challenging. A prediction cannot be made with confidence if the dynamics of the disease reflected by these parameters are not stable. Of course, extrapolation is challenging in almost all data-driven methods. One possible alternate is to develop surrogate models of these parameters via time dependent neural networks under the constraints of the SIRD model to learn the spatial variation in time, and thus to make reasonable prediction of the dynamics in the evolution the disease, such as we have demonstrated previously [11]. Nevertheless, without including factors such as mobility restrictions or other mandates, only short time predictions may be accurate.