Assessing the Spatio-temporal Spread of COVID-19 via Compartmental Models with Diffusion in Italy, USA, and Brazil

The outbreak of COVID-19 in 2020 has led to a surge in interest in the mathematical modeling of infectious diseases. Such models are usually defined as compartmental models, in which the population under study is divided into compartments based on qualitative characteristics, with different assumptions about the nature and rate of transfer across compartments. Though most commonly formulated as ordinary differential equation models, in which the compartments depend only on time, recent works have also focused on partial differential equation (PDE) models, incorporating the variation of an epidemic in space. Such research on PDE models within a Susceptible, Infected, Exposed, Recovered, and Deceased framework has led to promising results in reproducing COVID-19 contagion dynamics. In this paper, we assess the robustness of this modeling framework by considering different geometries over more extended periods than in other similar studies. We first validate our code by reproducing previously shown results for Lombardy, Italy. We then focus on the U.S. state of Georgia and on the Brazilian state of Rio de Janeiro, one of the most impacted areas in the world. Our results show good agreement with real-world epidemiological data in both time and space for all regions across major areas and across three different continents, suggesting that the modeling approach is both valid and robust.


Introduction
The COVID-19 pandemic has caused an extraordinary global disruption, both in terms of human lives and economic damage. To limit the spread of contagion, governments have taken unprecedented measures. Such actions have included, but not been limited to, quarantines, curfews, lockdowns, and domestic and international travel suspensions. While undoubtedly effective and considered necessary by many experts, these measures also carry a high cost in terms of economic impact and mental and physical health, among others. In part, such restrictions are motivated by the lack of reliable data and models on this disease's transmission in time and space, forcing cautious (and not always entirely rational) responses from the authorities and population. More than ever, these events demonstrate the need for tools to predict the spatio-temporal dynamics of contagion. It is important to highlight that such tools should be not only accurate, but, given the difference in the quality of available data in distinct regions of the world (or even within the same country), they should also be robust with respect to the data used for parameter identification. In this context, model validation -also against data possibly coming from different countries -should play an important role.
Such models typically follow a compartmental framework, in which the population under study is divided into compartments with assumptions about the nature and time rate of transfer from one compartment to another [6,41]. One of the simplest compartmental models is the SIR model proposed in 1927 by Kermack and McKendrick [23], where the population is divided into susceptible, infected, and recovered compartments. This basic SIR model can be extended in several ways by enriching the number of compartments, as, among others: the susceptible-infectioussusceptible (SIS) model for the common cold, in which infectious people become susceptible again once recovered; the susceptible-infected-recovered-deceased (SIRD), which distinguishes recovered from dead; the susceptible-exposedinfectious-recovered (SEIR) model, where infected carriers experience an exposed period before they become infectious. Several features may be included such as maternal immunity, vaccinations, effects of age, variable contact rates, quarantine measures [43], etc. The SIR based models also may be used along with other approaches as a mobility network model [9] or an agent-based computational framework [45].
The large majority of the proposed compartmental models are composed of a system of ordinary differential equations (ODEs) in time. Though such ODE models are simple to formulate, analyze, and solve numerically, they do not naturally account for individuals' movements from one region to another. To incorporate spatial variations, one may define regional compartments corresponding to different geographic areas and introduce coupling terms to account for movements across regions. Such approaches have also been applied to COVID-19 in, for example, [1,15,17]. ODE models may be used in combination with other approaches as well, such as with a neural network [27] or an agentbased computational framework [16,37,45]. While all these approaches are effective, they lack the ability to model spatial transmission at the continuous level, which may have some advantages.
To this end, one may instead resort to a compartment model based on partial differential equations (PDEs). Such models are decidedly less common than ODE ones, in particular, due to their increased difficulty and time involved in both implementation and numerical solution. However, such an additional cost is counterbalanced by the fact that PDE models naturally (and accurately) incorporate spatial information, allowing for a continuous description of spatio-temporal dynamics, with the potential to account for geographical features, population heterogeneity, and multiscale dynamics with relative ease. Indeed, recent research indicates that COVID-19 spreading presents multi-scale features that go from the virus and individual immune system scale to the collective behavior of a whole population [4]. For the scale of virus transmission among individuals, there are studies such as the potential infection zone produced by a cough [46] or the viral propagation in a built environment [29]. On a smaller scale, it is possible even to study the effects of the viral decontamination efficacy with UV irradiation [47]. On the other hand, it is also possible to simulate a global planetary scale pandemic as in [45]. Here, we are interested in studying the dynamics of the virus spread in specific geographic regions.
While quite uncommon, PDE models have been already used with success in the past to model epidemics [7,22,24]. For the specific case of COVID-19, PDE models were applied in [19,20,41,42], in which a compartmental SEIRD model (susceptible, exposed, infected, recovered, deceased) was used. A modified version of this model, with the addition of a quarantined compartment, was applied in [5]. Such modeling framework has shown promising results, being able to recreate observed data and make predictions with reasonable accuracy, but these successful applications have been so far obtained on a limited scale, with results restricted to single, specific regions. Therefore, despite the clearly high potential shown by this modeling approach, important questions remain regarding its robustness. In particular, it is still to be proven the model capability to guarantee reliable results for a wide range of areas, which may exhibit significantly different geographical, behavioral, and demographic characteristics. Furthermore, even if the model may demonstrate the ability to represent different regions accurately, it is unclear to what extent parameters must be specifically tuned to obtain accurate results for a given case. In fact, ideally, a robust model should be applicable to a wide range of settings, requiring relatively little specific parameter tuning to produce acceptable results for a given case.
Within this context, the current work aims at evaluating the robustness of the model used in [19,41,42] by studying in detail its application to a series of different cases. In order to accomplish this task, the model is applied to the following three cases: The Italian region of Lombardy, the U.S. state of Georgia, and the Brazilian state of Rio de Janeiro. These are not only the homelands of the authors, but they do represent regions of three distinct continents with very different features in terms of culture, population, and mobility, as well as government response to the pandemic. To deeply test its robustness to different region-specific characteristics, the model is applied across each case with limited parameter tuning and case-specific adjustments, and it is shown to be capable of producing consistently reliable results, in particular for predictions relative to short to mid-time spans, which are relevant for timely political decisions.
The remainder of this work is organized as follows: Sect. 2 introduces the governing equations and the relevant parameters, as well as the adopted terminology and notation. This section is also used to briefly discuss considerations regarding the numerical solution of the associated discrete problem. In the subsequent sections, the model is then tested on the cases of Lombardy, Georgia, and Rio de Janeiro, and for each case extensive discussion and analysis of results are provided. Finally, the paper is concluded by a summary of the main findings and suggestions for future research directions.

Governing Equations
We present the governing equations following the continuum mechanics framework first shown in [42], as opposed to more traditional notations common in mathematical and biological references. We consider a system that may be decomposed into N distinct populations: u 1 ( , t) , u 2 ( , t) , …, u N ( , t) . Let ∈ R 2 be a simply connected domain of interest and denote its boundary as = D ∩ N . Let [0, T] denote a generic time interval. The transient nonlinear diffusion-reaction system of equations written in compact vector notation then reads as follows: where s( , t) , e( , t) , i( , t) , r( , t) , and d( , t) denote the densities of the susceptible, exposed, infected, recovered, and deceased populations, respectively. We additionally denote the cumulative number of infected as c( , t) and the sum of the living population as n( , t) ; i.e., n( , t) = s( , t) + e( , t) + i( , t) + r( , t) . We then use the compact vector notation to represent these quantities as: The tensors , and , and the vector are defined according to the particular dynamics of the system in question. In most applications, = ( , ) , that is, diffusion is time-dependent, heterogeneous, and anisotropic; however, this is not strictly necessary. In addition to the boundary conditions (2), (3), we must also specify an initial condition ( , 0) = 0 . We define the total population U i (t) of a given compartment u i ( , t) as: To adequately define a model of COVID-19 contagion, we make the following assumptions, as discussed in [19,42]: -We consider only mortality due to COVID-19; -We do not consider new births; -We assume that a portion of exposed individuals will never develop symptoms (asymptomatic cases), and hence will move directly from the exposed compartment to the recovered compartment; -We assume that both pre-symptomatic and asymptomatic (the exposed compartment), as well as symptomatic (the infected compartment) individuals may spread the disease; -There is a latency period after exposure and before the development of symptoms; -Movement is proportional to the population size; i.e., we expect greater diffusion to occur in heavily populated regions; -There is no movement among the deceased population.
Under the above assumptions, the frequency-dependent system of equations reads: where i and e denote the contact rates between symptomatic and susceptible individuals and asymptomatic and susceptible individuals, respectively (units days −1 ), denotes the incubation period (units days −1 ), e corresponds to the asymptomatic recovery rate (units days −1 ), i the symptomatic recovery rate (units days −1 ), represents the mortality rate (units days −1 ), and s , e , i , r are the diffusion parameters of the different population groups as denoted by the subscripted letters (units km 2 persons −1 days −1 ). Note that all these parameters can be considered time and space-dependent to better represent what happens in real situations, where the epidemiology changes as the public health guidelines, lockdowns, and health response evolve. Therefore, it is possible to adjust the contact rate and diffusion parameters for each period and location. Connecting the COVID-19 available data to emerging technologies, like physics informed neural networks [35], data-driven inference techniques [44], or Bayesian calibration [20] can help to get insight into the relevant parameters and their spatio-temporal characteristics.
We may express the model (19)- (23) in the general form given by equation (1) by defining the tensors , , , and the vector in the following way: Assuming additionally that the region of interest is isolated, we then prescribe the following homogeneous Neumann boundary conditions: or simply ( ⋅ ∇ ) ⋅ = 0 . We acknowledge that in certain areas where traffic across regions is important, this assumption may not be completely realistic and other alternatives (such as a coupled ODE/PDE model on the boundary) may be more suitable. Such a consideration was mentioned briefly in [41]. We note that one may also consider the following model related to (5)-(9): The model (19)-(23) is known as the density-dependent formulation and was studied in [41,42]. The difference between the models can be seen in equations (5)-(9) and (19)-(23). In (5)-(9), the contact terms are normalized by the living population, whereas such a normalization does not occur in (19)- (23). In the frequency-dependent formulation (5)-(9), this normalization implies that the contagion is independent of population density, while this is not the case in the density-dependent formulation, as the name may suggest [32,38]. Both models may be valid and deliver accurate results, depending on the physical situation, and we will show computations performed with both formulations in the present work.
The other major difference between (5)- (9) and (19)-(23) is the presence of the Allee effect A. This term has been used extensively in other settings, with the form used above inspired directly by applications in cancer modeling [11,21]. The Allee term serves to reduce transmission in areas where the population density is below a certain threshold A, by bringing the population in the exposed compartment to the susceptible compartment in such regions. Consequently, in those areas, the population in compartments e and i tends to zero, which eventually cancels out the transfer term where the Allee term participates. In [41], this term served to accentuate the contagion in major urban regions while reducing it in less-populated areas, consistent with the observed physics. We briefly note also that as s, i, and e are all less than n by definition, we do not expect blowup of this term, even for very small n.
For the numerical solution of (5)-(9), (19)-(23), we discretize in space using a Galerkin finite element formulation. The resulting systems of equations are stiff, leading us to employ implicit methods for time integration. We apply the second-order Backward Differentiation Formula (BDF2) in all cases, which offers second-order accuracy while remaining unconditionally stable. We additionally make use of an adaptive mesh refinement and coarsening strategy (AMR/C), allowing us to resolve multiple scales. One may find more details about the adopted methods in [19,42]. All simulations have been performed using the libMesh, a C++ FEM open-source software library for parallel adaptive finite element applications [25]. libMesh also interfaces with external solver packages like PETSc [3] and Trilinos [39]. It is an excellent tool for programming the finite element method and can be used for one-, two-, and three-dimensional steady and transient simulations. This library provides native support for AMR/C, thus providing a natural environment for the present study. The main advantage of libMesh is the possibility of focusing on implementing the specific features of the modeling without worrying about adaptivity and code parallelization. Consequently, the effort to build a high performance computing code tends to be minimized.

Simulation of the COVID-19 Virus Spread in Some of the World's Hot-spots
In the following section, we demonstrate the robustness of the models (5)-(9), (19)-(23) by simulating the COVID-19 outbreak for three different geographic regions: Lombardy (Italy), the U.S. state of Georgia, and the state of Rio de Janeiro (Brazil). These regions cover three different continents, and each exhibits very different geographical features, population characteristics, economic, social, cultural factors, and government responses. By achieving good results for each case, we aim at showing that the modeling approach shown here is robust to general cases and does not require extensive modification or specific parameter tuning to achieve reasonable results for a given case. The three cases shown were chosen for a number of reasons. Notably, they represent the authors' home regions, so there is an obvious interest from this point-of-view. Beyond this point, however, the regions also have specific characteristics that make them remarkable case studies. Italy is one of the hardest-hit regions in Europe, and Lombardy represented the epicenter of the outbreak in Italy, as well as the first area in a Western country to exhibit community spread. During the first wave of the epidemic in Spring 2020, Lombardy was disproportionately impacted compared to other regions in Italy, in which the epidemic was less severe. It remains the hardest-hit region in Italy, even after the second wave of infections has been less geographically restricted. As Lombardy was studied specifically in [41,42], it also represents a natural choice for validating the new code and the adaptivemeshing strategy implemented in the present work.
The U.S. state of Georgia is a logical choice for several reasons. The state has 159 counties, second to only Texas among U.S. states. However, the average county size for Georgia is less than half of Texas, giving the state a good county-by-county spatial resolution, and allowing us to evaluate the spatial accuracy of the model in reasonable detail. Further, the state was the first among U.S. states to reopen in late April, and has since taken a large number of policy measures, in both relaxing and increasing restrictions, providing us a large number of decisions to incorporate and analyze in our model.
Rio de Janeiro is also a sensible choice for many of the same reasons as Georgia. At the time of writing, Brazil is the third-most severely impacted country in the world in terms of total cases, and the second one in terms of total deaths. Rio de Janeiro has the largest number of deaths per inhabitant within Brazil, making its simulation worthwhile on severity alone. Aside from this aspect, the area also exhibits interesting geographical features, notably its natural harbor and large population centers on either side. Additionally, there is good availability of mobility data, which we have incorporated into the diffusion parameters. The use of such data is logical for a model of this type, and its consideration represents an interesting and important component for this modeling approach. We note that, given the large differences in population distributions, densities, and political responses among the different regions, uniform values could not be used. Instead, the values were selected according to the unique local dynamics of each case.

Lombardy, Italy
In our first numerical test, we seek to reproduce, and possibly improve, the simulations of Lombardy, Italy, as shown in [41,42]. This model was already shown in [41] to predict the first wave of the COVID-19 outbreak in Lombardy with good accuracy. We note that for this simulation we use the density-dependent model with the Allee term (19)- (23). While using the same parameters as shown in [41,42], we incorporate two major improvements. First, we employ the adaptive mesh refinement and coarsening strategy shown in [19]. Second, rather than using Backward-Euler timestepping, we utilize a BDF2 scheme, as [18,42] suggest that such a scheme is better suited to the system of equations in question. Thus, the primary motivation of this simulation is the validation of the current code and algorithmic framework.
The AMR/C procedure uses a local error estimator to drive the refinement and coarsening procedure, considering the error of an element relative to its neighbor elements in the mesh. The elements are flagged depending on a refining ( r f ) and a coarsening ( c f ) fraction. The refinement level is limited by a maximum h-level ( h max ). In this case, for the AMR/C procedure, we set h max = 1 , r f = 0.95 , c f = 0.01 . We apply the adaptive mesh refinement every 4 time-steps. The original mesh has 10,987 linear triangular elements, and, after refinement, the minimum spatial resolution is about 1 kilometer. We initially refine the domain in one level, and the time-step is defined as t = 0.25 days.
In Fig. 1 we compare the results obtained here with those from [42]. We observe the same qualitative dynamics; the epidemic follows the same path, moving from the southern areas of the region northward into the major metropolitan areas, and the Milan area in particular. The changes to the meshing and time-stepping schemes result in somewhat lower contagion, particularly in the areas in and around the city of Milan in the western part of the region. Examining the progression of the mesh adaptation in Fig. 1, we see that the AMR/C algorithm successfully follows the transmission path, changing in time to adapt to the evolving geography of the epidemic in time. We briefly note that when using the current code with Backward-Euler and a fixed mesh (not shown), the results were identical to those obtained in [41]; thus, we can conclude that the differences between the solution depicted in the present work and that shown in [41] represent improvements resulting from the differences in meshing and time-stepping used in the present one. From this, we conclude that the current code and algorithmic framework is validated.

Georgia, USA
From now on we use the frequency-dependent model (5)-(9) together with adaptive mesh refinement and the BDF2 timestepping scheme. In the next test, we aim at reproducing the COVID-19 outbreak in the U.S. state of Georgia.

Model Construction
We have first obtained the map of the state of Georgia (GA) along with the county boundaries in shapefile format from https:// arc-garc. opend ata. arcgis. com/ datas ets/ dc207 13282 734a7 3abe9 90995 de404 97_ 68. To triangulate the GA region, we follow the steps given by [20]: -Load the GA map file in the freely available QGIS software [34]; -Coarse grain the outer boundary segments using the Simplify tool in QGIS. The original map has some very small length segments, which may create problems in triangulation or result in a very fine mesh; -Obtain the vertices using the Extract Vertices tool in QGIS and save the vertices layer using the save layer as option. Select As XY in the Graphical category and save the file in a .csv format; Fig. 1 Spread of the COVID-19 virus spread at Lombardy, Italy. Comparison between the results obtained in [41], in which the time integration was done using the Backward-Euler method with a fixed mesh, and the present work, in which we use the BDF2 method and AMR -Prepare a Gmsh input file using the vertices' file for triangulation.
In Fig. 2, we show the generated grid for the state of Georgia. The original mesh has 32,056 linear triangular elements, and after refinement, the minimum spatial resolution is about 2 kilometers. We initially refine the domain in one level. For the AMR/C procedure, we set h max = 1 , r f = 0.95 , c f = 0.01 . We apply the adaptive mesh refinement every 4 time-steps. The time-step is defined as t = 0.25 days. We define the initial infected population accordingly to the data provided by the Johns Hopkins University [12]. We define the beginning of the simulation on 25 March 2020 and simulate 250 days. The exposed population is more difficult to estimate, as they consist of asymptomatic and presymptomatic individuals. Here, we consider 10 times the number of infected [36]. The susceptible population is based on the estimation of the population of each county, given by the U.S. Census Bureau [40]. The populations are divided by the area of each county and distributed on the 159 areas of the mesh as people∕km 2 . In Figs. 3, 4, and 5 we show the initial conditions.
The parameters of the simulation are defined as: = 1∕8 day −1 , i = 1∕24 day −1 , e = 1∕6 day −1 , = 1∕160 day −1 . These values are based on available data from the literature regarding the mortality, incubation period, and recovery time for infected and asymptomatic patients [41].
The contact rate and diffusion are estimated based on the policies adopted in the Georgia state during the period of the simulation (https:// coron avirus. jhu. edu/ data/ state-timel ine/ new-deaths/ georg ia/). This leads to the time-dependent values as shown in Fig. 6.

Results
To compare our model results with the data available, we compare the number of deaths caused by the COVID-19. First, we show the values of the whole state. Then, we integrate the values of some counties and plot them in time (Figs. 7, 8). Figure 9 also shows some snapshots of the simulation with the populations and mesh at different time-steps.
Referring to the entire state in Fig. 10, we observe that the model accurately captures both qualitative and quantitative  dynamics of the epidemic with good accuracy. The difference in relative L 2 norm between the measured and simulated deaths is 5.75%, confirming the agreement. We note that a small discrepancy between the simulated and measured data occurs at the end of the simulated period; this is caused by a series of deaths being added retroactively to the data, with the actual date of these deaths being unknown.
Turning our attention to the individual counties, we see more variation in the results. We start by noting that the initial dynamics (first 60 days) is predicted well nearly everywhere, with a divergence between the simulated and measured deaths becoming more apparent as the epidemic progresses. This divergence generally indicates that such models' predictive power may decrease in time, as one may reasonably expect. Nonetheless, we do observe encouraging results over the entire simulated time frame, particularly in certain areas. We first assess counties comprising the Atlanta metropolitan area. The heavily populated counties of Dekalb, Gwinett, Clayton, and Cobb show very good agreement with the simulated data. The most populous county in Georgia, Fulton County, is over-predicted. On the fringe of the Atlanta metropolitan area, Forsyth, Clarke, and Cherokee counties were also over-predicted.
Outside of the Atlanta area, we found a general underestimation of the outbreak severity in the Bibb, Richmond, and Chatham counties. As the "first wave" of infections is well-simulated, this likely represents a shortcoming of the diffusive model. In particular, after the initial re-openings, additional source terms may need to be included in certain areas to model the effects of travelers arriving from elsewhere, as our model does not account for nonlocal movements. In contrast, Dougherty County, which was heavily impacted early in the epidemic, shows the opposite trend, and the outbreak during the later stages of the simulated period is less severe than as predicted by the model. This shows that the model is sensitive to the initial conditions, and spatially-variable parameters, in particular contact rates, may be needed to account for such phenomena properly.

Rio de Janeiro, Brazil
The last simulation aims at reproducing the COVID-19 outbreak in the state of Rio de Janeiro, Brazil. Rio de Janeiro has the second number of deaths and infected in Brazil, but the largest number of deaths/inhabitant as per January 2021 1

Model Construction
We have obtained the map of the state of Rio de Janeiro (RJ) along with the county boundaries in shapefile format from ftp:// geoftp. ibge. gov. br/ organ izacao_ do_ terri torio/ malhas_ terri toria is/ malhas_ munic ipais/ munic ipio_ 2017/ UFs/. To triangulate the RJ region, we follow the same steps explained in the previous section.
In Fig. 11, we show the grid generated for the state of Rio de Janeiro. We eliminated some islands to facilitate the simulation. The original mesh has 11,632 linear triangular elements, and after refinement, the minimum spatial resolution is about 500 meters. We initially refine the domain in one level. For the AMR/C procedure, we set h max = 1 , r f = 0.95 , c f = 0.01 . We apply the adaptive mesh refinement every 4 time-steps. The time-step is defined as t = 0.25 days.
We define the initial infected population accordingly to the data provided by the Brazilian Ministry of Health, collected by https:// covid 19br. wcota. me/. We define the beginning of the simulation on 25 March 2020 and simulate 180 days. However, since there is a delay until people develop symptoms, another delay in receiving the test results, and another one for the case being disclosed, we consider that the initial infected population is equal to the case numbers provided for each county on 1 April 2020. As in the case of Georgia, we initialize the exposed population as 10 times the number of infected [36]. The susceptible population is based on the estimation of the population of each county, given by the Brazilian Institute for Geography and Statistics 2 . The populations are divided by the area of each county and distributed on the 92 areas of the mesh as people∕km 2 .
In Figs. 12, 13, and 14 we show the initial conditions. The population of the state of Rio de Janeiro is concentrated near the capital and the metropolitan region.  The parameters of the simulation are defined as: These values are based on available data from the literature regarding the mortality, incubation period, and recovery time for infected and asymptomatic patients [41].
The contact rate is estimated based on the reproduction number estimation given in https:// perone. github. io/ covid 19ana lysis/. We define e = i = 0.215 days −1 , and we multiply this value by the time function given by Fig. 15b, which is a simplification of the behavior of the reproduction number between 25 March 2020 and 21 September 2020. The diffusion coefficient is based on the social distancing estimation given by https:// coron avirus. ufrj. br/ covid imetro/, which presents the perception of average weekly social isolation based on the movement of cell phones in the state of RJ. The diffusion behaves opposite to the social isolation, i.e., when one increases, the other one decreases. Therefore, we set s = e = r = 1 × 10 −3 and i = 1 × 10 −5 km 2 persons −1 days − 1 , and scale these values by a timedependent function (Fig. 15a) that tries to represent the social isolation during the 180 days of simulation. These values reflect that, as we include a population weighting in the diffusion, we must take take into account not only spatial dispersion, but also population concentrations within the given regions (these are reflected in the units).

Results
To compare the results of our model with the data available, we compare the number of deaths caused by the COVID-19. First, we show the values of the whole state (Fig. 16). Then, we integrate the values of some counties and plot them in time (Figs. 17 and 18). Figure 19 also shows some snapshots of the simulation with the populations and mesh at different time-steps.
As in the previous case in Georgia, most selected counties show good agreement with the available data. Notably, over the entire region, the relative difference in L 2 norm between the observed and simulated fatalities is only 10.06%. Over the first several weeks, the results are in good agreement nearly everywhere, and, in the areas around the city of Rio de Janeiro, we capture the dynamics very accurately over the entire considered time interval. In addition to the county of Rio de Janeiro, we observe particularly good agreement in the highly populated areas of São Gonçalo, Duque de Caxias, Belford Roxo, and Itaborai. In the areas of Petropolis, Volta Redonda, and Nova Igauçu, we obtain somewhat  15 Functions that multiplies the diffusion (a) and contact rate (b) in time for Rio de Janeiro less accurate, but still nonetheless reasonable results. In these latter cases, we observe good agreement over the initial phases of the dynamics, with some differences later in the time interval. These deviations are unsurprising, as one may naturally expect increased difficulty further in time.
We do observe some notable instances of overestimation in Niterói and underestimation in Cabo Frio and Campos dos Goytacazes. In the latter cases, there were no initial infected or exposed population in this simulation, making diffusion the only possible way for the virus to reach the areas. Given the large distances between these areas and the hotspots near the city center of Rio de Janeiro (157 km to Cabo Frio and 279 km to Campos dos Goytacazes), as well as large sparsely-populated areas between them, populationweighted diffusion was not able to properly represent these dynamics. The model best captures spatial dynamics over relatively short distances overpopulated regions. This represents a shortcoming of the model, and additional terms, such as source/sink terms, fractional diffusion operators, or bilaplacian diffusion terms, may be required to properly account for such nonlocal dynamics. In the case of Niterói, the overprediction may result from an overestimation of the initial exposed population or failure to take into account specific local policies. This is similar to what we observed in Georgia, in which certain particular regions had dynamics that deviated from the measured data. However, as in Georgia's case, we observe very good agreement when considering the totality of the region, with particular areas less wellrepresented due to the natural limitations of our modeling approach. In these instances, such cases provide clear directions in which we may improve the modeling framework. For all of our cases, and in particular both Rio de Janeiro and Georgia, our contact and diffusion parameters were selected according to region-wide measures. However, at the local county level, many counties enacted their own policies in addition to such blanket measures. These specialized policies were not considered in the current models, and their careful inclusion would likely result in increased accuracy at the local level.

Conclusions and Future Work
In this work, we have established the robustness of the SEIRD model introduced in [41] and further extended in [19,20,42] by examining both its frequency-and densitydependent formulations and applying it to three settings over different continents: the region of Lombardy, Italy (densitydependent), the U.S. state of Georgia, and the state of Rio de Janeiro, Brazil (frequency-dependent). All these regions have very different geographic characteristics, population density patterns, policy response measures, and cultural contexts generally; despite this, the models were able to adequately reproduce the spatio-temporal contagion dynamics with minimal parameter differences and very little tuning. This provides strong evidence towards the robustness of the model, as it can produce adequate results across different settings without requiring extensive parameter fitting and learning.
The results obtained showed good qualitative agreement generally across all regions; however, they show clear room for improvement. In the cases of Georgia and Rio de Janeiro, we find that the initial conditions have a large influence, even over the long-term dynamics of the system. This is particularly apparent in the case of Dougherty County in Georgia, where an early outbreak led to a long-term overestimation of contagion, and in Campos dos Goytacazes and Cabo Frio in Rio de Janeiro, where a relative lack of early cases led to an under-prediction of the long term contagion effects. This is likely due to the diffusion formulation limitations, which cannot correctly account for non-local mobility across different areas, such as returning travelers. Such effects could be incorporated in a number of ways by including more sophisticated source/sink terms, fractional diffusion operators, or bilaplacian terms. In Lombardy, we observe an under-prediction of contagion in less-populated regions; this appears to be characteristic of the density-dependent formulation, and the frequency-dependent models applied in Georgia and Rio de Janeiro do not suffer from this problem. Across all cases, the same parameter values were used everywhere. While this represents evidence towards robustness and is in some sense a positive aspect, results could, of course, be further improved by using different definitions in different regions, something explored briefly in [41] and in more detail in [20]. Such approaches may incorporate more sophisticated methods, including optimization, machine-learning techniques, and parameter fitting techniques, such as least-squares fitting methods in time and space [44]. Incorporating more physics into our models, such as sneezing dynamics, agent-based methods, Fig. 18 Comparison between simulation and real data of deaths in RJ (per county)-part 2 and the effects of UV radiation on viral spread could also make such models more useful as predictive tools [45][46][47]. These shortcomings represent natural directions for future work in this area. Nonetheless, we believe that our results clearly establish the viability and robustness of this PDE modeling framework, suggesting this as a suitable path for future research, and ultimately may perhaps prove useful to public health decision-makers.