Detecting space–time clusters of COVID-19 in Brazil: mortality, inequality, socioeconomic vulnerability, and the relative risk of the disease in Brazilian municipalities

The first case of COVID-19 in South America occurred in Brazil on February 25, 2020. By July 20, 2020, there were 2,118,646 confirmed cases and 80,120 confirmed deaths. To assist with the development of preventive measures and targeted interventions to combat the pandemic in Brazil, we present a geographic study to detect “active” and “emerging” space–time clusters of COVID-19. We document the relationship between the relative risk of COVID-19 and mortality, inequality, and socioeconomic vulnerability variables. We used the prospective space–time scan statistic to detect daily COVID-19 clusters and examine the relative risk between February 25–June 7, 2020, and February 25–July 20, 2020, in 5570 Brazilian municipalities. We applied a Generalized Linear Model (GLM) to assess whether mortality rate, GINI index, and social inequality are predictors of the relative risk of each cluster. We detected 7 “active” clusters in the first time period: one in the north, two in the northeast, two in the southeast, one in the south, and one in the capital of Brazil. In the second period, we found 9 clusters with RR > 1 located in all Brazilian regions. The GLM results showed a significant positive correlation between the predictor variables and the relative risk of COVID-19. Given the presence of spatial autocorrelation in the GLM residuals, we fitted a spatial lag model, which revealed that spatial effects, the GINI index, and the mortality rate were strong predictors of the increase in COVID-19 relative risk in Brazil. Our research can be used to improve COVID-19 response and planning in all Brazilian states. The results from this study are particularly salient to public health, as they can guide targeted intervention measures, lowering the magnitude and spread of COVID-19. They can also improve the allocation of resources such as tests and vaccines (when available) by informing key public health officials about the highest risk areas of COVID-19.


2005; Levine 2006; Robertson et al. 2010; Rogerson and Yamada 2008; Paez et al. 2020; Yamada et al. 2009), allowing the dynamics of infectious diseases to be estimated. The prospective space-time scan statistic (Kulldorff 1997) is a widely used cluster detection tool in disease surveillance, which can identify areas that are statistically significant hotspots of disease incidence in the most recent time period of the analysis (Allévius and Höhle 2019). The statistic determines whether the space-time patterns of COVID-19 cases exhibit statistically significant clustering. Cylindrical scanning windows of different spatial and temporal dimensions are computed to systematically scan the study area and time period for more observed than expected disease cases. The prospective version of the scan statistic differs from the retrospective version (Desjardins et al. 2018; Owusu et al. 2019; Whiteman et al. 2019) because it disregards historical clusters that existed before the most recent day of analysis (Kulldorff 2001).
There are many examples illustrating the use of the prospective space-time scan statistic. Chen et al. (2016) designed an online analytical tool for frontline public health workers to prospectively detect ongoing dengue fever transmission in each village of Tainan and Kaohsiung on a weekly basis. Tang et al. (2017) identified seasonal peaks and high-risk periods of measles in Guangxi during 2013-2014 and found patterns of transmission in space and time. Al-Ahmadi et al. (2019) provided an initial assessment of the potential environmental risk factors of MERS-CoV in Saudi Arabia between June 2012 and March 2019, applying Kulldorff's spatial scan statistics to cases reported in that period. It was the first study that aimed to analyze the spatiotemporal pattern and clustering of MERS-CoV in Saudi Arabia, and the results reinforce that secondary infections are a major challenge for the health-care system in the prevention and control of MERS-CoV outbreaks in Saudi Arabia. The prospective scan statistic has recently been used in a series of studies on COVID-19. Desjardins et al. (2020), Hohl et al. (2020a) and Hohl et al. (2020b) identified COVID-19 clusters and estimated relative risk throughout the USA at the county level. Masrur et al. (2020) conducted a spatiotemporal analysis using the prospective scan statistic in Bangladesh, suggesting that the country had experienced community-level transmission as early as March 2020. Alkhamis et al. (2020) and Gomes et al. (2020) used the same approach to identify clustering events that were still active (i.e., emerging clusters) in the State of Kuwait and Northern Brazil, respectively. The results from these studies are particularly salient to public health, as they can guide targeted intervention measures, lowering the magnitude and transmission of COVID-19. They can also improve resource allocation and justify continued social distancing and stay-at-home orders by informing key public health officials about the highest risk areas of COVID-19. The importance of the prospective approach is that it can be extended to analyze the characteristics of the population of municipalities within the clusters. As case data are updated, the analysis can be repeated to continuously monitor the evolution of COVID-19 outbreaks (Desjardins et al. 2020).
In Brazil, the State Health Secretariats (SHS) update the data daily and make them public, so our approach is well suited to facilitate daily COVID-19 surveillance in the country. The Ministry of Health reports daily confirmed cases and deaths, while also using a COVID-19 app to disseminate information (de Oliveira et al. 2020). Regarding COVID-19 surveillance in Brazil, a susceptible, exposed, infected, removed (SEIR) model was applied to several lockdown scenarios (Tarrataca et al. 2020), while Ribeiro and Bernardes (2020) estimated the number of underreported cases and deaths in Brazil. Some studies have examined the risk of COVID-19 transmission and health-care system capacity in Brazil. Costa et al. (2020) used demographic and mobility data and COVID-19 cases over 3 weeks from March 31 and performed a long-term analysis of epidemic outcomes using a stochastic metapopulation model. The authors found that the degree of heterogeneity and desynchronization of the epidemic curves in cities with large populations and in countryside regions suggests diverse mitigation scenarios and strategies to combat COVID-19. Castro et al. (2020) simulated the time it would take for hospitals to operate beyond their capacity in Brazil. Depending on the onset and the intensity of transmission, shortages of hospital beds, intensive care unit (ICU) beds, and ventilators could affect populations that depend on public health systems, which highlights issues with equity and ethics in service allocation. Considering cases of COVID-19 in March 2020, Coelho et al. (2020) calculated the probability of COVID-19 spread from São Paulo and Rio de Janeiro based on human mobility. In addition, they evaluated socioeconomic indices to identify vulnerable areas and concluded that North and Northeast Brazil are high risk and vulnerable to adverse health outcomes.
Utilizing a prospective space-time scan statistic, our objective is to detect new emerging clusters of COVID-19 across 5570 Brazilian municipalities, contrasting two temporal intervals, from February 25 to June 7, 2020, and from February 25 to July 20, 2020. When examining these time periods, we compute the evolution of the relative risk of the clusters in different regions and municipalities in Brazil and find associations with mortality rate, vulnerability, and social inequality.

COVID-19 data and geographic information
Brazil comprises 5570 municipalities in 26 states and the Federal District. With a population of approximately 210,147,125 people (IBGE 2019), Brazil is the sixth largest country in population and fifth in landmass, and it faces great inequalities and socioeconomic disparities. East coast states include approximately 70% of the population. The states of São Paulo and Rio de Janeiro have the highest population density (similar to Europe), while the states of the Amazon region have densities close to those of Canada and Australia (Somain 2014).
COVID-19 cases were retrieved from the Brazil.io project website (Brazil IO, 2020). This project compiles data from the daily COVID-19 case reports by municipality in the 27 federative units of Brazil. The data are available in a raw format and were tabulated into a format that SaTScan supports. The data cover February 25, 2020, to July 20, 2020. In Brazil, 2,075,657 cases of COVID-19 were confirmed between these two dates (Fig. 1).
In GIS, two layers of geographic data were used: (1) the location of the 5570 municipal seats for SaTScan cluster detection and (2) municipality polygons for visualizing clustering and relative risk results via choropleth maps. These layers were retrieved from the website of the Instituto Brasileiro de Geografia e Estatística (IBGE; in English, the Brazilian Institute of Geography and Statistics). Using the location of the municipal seat instead of the centroid of the municipality better reflects the population concentration. The data sources are shown in appendix Table 11.

Space-time cluster detection
The detection of active clusters is conducted using the prospective Poisson space-time scan statistic method (Kulldorff 2001). The statistic detects active clusters of disease on the most recent day of analysis (Jones et al. 2006). New data can be added to monitor active and emerging clusters and to identify areas that are no longer experiencing excess incidence (e.g., fewer observed than expected cases). The statistic systematically moves cylinders across the study area, each centered on the municipal seat of a Brazilian municipality. The base of the cylinder is the spatial scanning window, and the height represents the temporal scanning window; both are expanded until a maximum threshold is reached. The null hypothesis stipulates that the model reflects an inhomogeneous Poisson process with an intensity μ, which is proportional to the at-risk population. The alternative hypothesis is that the number of reported cases exceeds the number of expected cases derived from the null model. A maximum likelihood ratio test is used to evaluate the null and alternative hypotheses, which is defined in Eq. 1:

$$\frac{L(Z)}{L_0} = \frac{\left(\frac{n_Z}{\mu(Z)}\right)^{n_Z} \left(\frac{N - n_Z}{N - \mu(Z)}\right)^{N - n_Z}}{\left(\frac{N}{\mu(A)}\right)^{N}} \tag{1}$$

where $L(Z)$ is the likelihood function for cylinder $Z$ and $L_0$ is the likelihood function for the null hypothesis. Essentially, the number of observed disease cases in a cylinder, $n_Z$, is divided by the number of expected cases in the cylinder, $\mu(Z)$, and raised to the power of the observed $n_Z$; this is multiplied by the observed cases divided by the expected cases outside of the cylinder, raised to the power of the observed cases outside of the cylinder. The numerator is then divided by the quotient of the total number of observed cases for the entire study area, $N$, and the total number of expected cases across all time periods, $\mu(A)$, raised to the power of the total number of observed cases. The cylinder will have an elevated risk if the likelihood ratio is greater than 1 (i.e., $n_Z/\mu(Z) > (N - n_Z)/(N - \mu(Z))$). Furthermore, the cylinder with the highest likelihood ratio value is the most likely cluster.
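The likelihood ratio described above can be sketched in a few lines of Python. This is a minimal illustration rather than the SaTScan implementation: the function name and arguments are hypothetical, it works on the log scale for numerical stability, and it assumes the expected counts are calibrated so that the total expected cases equal the total observed cases N, in which case the denominator of the ratio equals 1.

```python
import math


def poisson_log_likelihood_ratio(n_z, mu_z, n_total):
    """Return log(L(Z)/L0) for one cylinder, or 0.0 if there is no excess risk.

    n_z     -- observed cases inside the cylinder
    mu_z    -- expected cases inside the cylinder under the null model
    n_total -- total observed cases N (assumed equal to total expected cases)
    """
    if not 0 < mu_z < n_total:
        raise ValueError("expected cases must lie strictly between 0 and N")
    if not 0 <= n_z <= n_total:
        raise ValueError("observed cases must lie between 0 and N")

    inside = n_z / mu_z                               # observed/expected inside
    outside = (n_total - n_z) / (n_total - mu_z)      # observed/expected outside
    if inside <= outside:
        return 0.0                                    # no elevated risk

    llr = n_z * math.log(inside)
    if n_z < n_total:                                 # skip the 0 * log(0) term
        llr += (n_total - n_z) * math.log(outside)
    return llr
```

A returned value above 0 corresponds to a likelihood ratio above 1; under this sketch, the cylinder with the largest value would be the most likely cluster.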
The majority of the literature pertaining to the space-time scan statistic (STSS) only reports the locations that belong to a significant space-time cluster. However, this approach assumes that the risk of infection is homogeneous throughout the cluster. In fact, some locations within a cluster may contain zero cases of a particular disease because of the scanning nature of the STSS. To reduce uncertainty by identifying the municipalities that are the highest risk locations in a cluster (rather than assuming the risk of disease is homogeneous throughout a cluster), we also report the relative risk of each areal unit belonging to a space-time cluster, which can provide additional evidence for targeted interventions. Relative risk quantifies the risk of becoming infected with a disease in one location compared to all other locations (Eq. 2):

$$RR = \frac{c/e}{(C - c)/(C - e)} \tag{2}$$

where $c$ is the number of observed cases in a municipality, $e$ is the number of expected cases in that municipality, and $C$ is the total number of observed cases in Brazil.

We set the maximum spatial and temporal search windows to 10% of the population at risk and 50% of the study period, respectively. Each cluster's duration is set to a minimum of 2 days, and a cluster must contain a minimum of five confirmed cases of COVID-19 (Desjardins et al. 2020). We use a prospective Poisson model to detect space-time clusters that are still occurring or active on June 7, 2020, and July 20, 2020 (Kulldorff 2001; Desjardins et al. 2020). We assume that COVID-19 cases follow a Poisson distribution under the null hypothesis, which states that the model reflects a constant risk. The alternative hypothesis states that the number of observed cases exceeds the number of expected cases derived from the null model. The expected cases are estimated by multiplying the population in the cylinder by the overall COVID-19 rate in the study area. A maximum likelihood ratio test is implemented to evaluate whether cylinders have an elevated risk of COVID-19. If the cylinder has a likelihood ratio > 1, then it has an elevated risk (i.e., the case rate within the cylinder is greater than the case rate outside of the cylinder, that is, in all other municipalities in Brazil). To derive statistical significance, 999 Monte Carlo simulations are computed for each cylinder. We report clusters at the p < 0.05 level and map the relative risk of each municipality. The relative risk is defined as the estimated risk of COVID-19 within a municipality divided by the risk outside of the municipality. We use the SaTScan software for space-time cluster detection of COVID-19 data and a commercial GIS software for the visualization of clusters and relative risk of the Brazilian municipalities.
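As a rough illustration of the quantities in this section, the sketch below computes expected cases under a constant-risk null, the relative risk of a single municipality, and a rank-based Monte Carlo p-value. All function names are hypothetical. The expected count assumes a single overall rate for the study area, and the p-value uses the common rank convention (1 + number of simulated statistics at least as large as the observed one) / (1 + number of simulations). SaTScan's internal calculations may differ in detail.

```python
import math


def expected_cases(population, total_cases, total_population):
    """Expected cases under constant risk: population times the overall rate."""
    return population * total_cases / total_population


def relative_risk(cases, expected, total_cases):
    """Risk inside a municipality divided by the risk outside of it."""
    if not 0 < expected < total_cases:
        raise ValueError("expected cases must lie strictly between 0 and C")
    outside = (total_cases - cases) / (total_cases - expected)
    if outside == 0:
        return math.inf          # every case falls inside the municipality
    return (cases / expected) / outside


def monte_carlo_p_value(observed_stat, simulated_stats):
    """Rank of the observed statistic among the simulated replicates."""
    exceed = sum(1 for s in simulated_stats if s >= observed_stat)
    return (1 + exceed) / (1 + len(simulated_stats))
```

With 999 replicates, the smallest p-value this rank convention can return is 1/1000.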

Non-spatial modeling
To examine the role of socioeconomic characteristics in the presence of COVID-19 clusters, we select three indicators reflecting population characteristics and COVID-19 mortality: the GINI index (IPEA 2015), the Brazilian Social Vulnerability Index (SVI) (Atlas Brasil 2013), and the COVID-19 mortality rate. The GINI coefficient has been applied in the area of health to measure disparities (Han et al. 2016). It is based on population income per municipality and ranges between 0 in the case of perfect equality and 1 in the case of perfect inequality. The SVI is an index that varies between 0 and 1 and summarizes three attributes: urban infrastructure, human capital, and income and labor. The closer to 1, the greater the social vulnerability of a municipality. These dimensions correspond to sets of variables that indicate that the standard of living of families is low, suggesting non-access and nonobservance of social variables. An SVI between 0 and 0.200 indicates very low social vulnerability; between 0.201 and 0.300, low social vulnerability; between 0.301 and 0.400, medium social vulnerability; between 0.401 and 0.500, high social vulnerability; and between 0.501 and 1, very high social vulnerability (Brazil 2015). The mortality rate was selected because it is an indicator of criticality: it is influenced by the structure of the population (sex and age), which is in turn conditioned by socioeconomic factors.
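The SVI bands listed above translate directly into a small lookup. The sketch below is illustrative only: the function name and the label strings are shorthand chosen here, and the thresholds are those quoted in the text.

```python
def classify_svi(svi):
    """Map a Social Vulnerability Index value in [0, 1] to its band."""
    if not 0.0 <= svi <= 1.0:
        raise ValueError("SVI must lie between 0 and 1")
    if svi <= 0.200:
        return "very low"
    if svi <= 0.300:
        return "low"
    if svi <= 0.400:
        return "medium"
    if svi <= 0.500:
        return "high"
    return "very high"
```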
To analyze the correlation between the RR and the selected independent variables, we used the RR value of each municipality located in the space-time clusters, from February 25, 2020, to July 20, 2020 (n = 3304). We evaluate the effect of socioeconomic variables and mortality rate on RR using a Generalized Linear Model (GLM) (Eq. 3):

$$Y = \beta_0 + \beta_1 a + \beta_2 b + \beta_3 c + \varepsilon \tag{3}$$

with $Y$ the relative risk, $\beta$ the regression coefficients, $a$ the GINI variable, $b$ the SVI variable, $c$ the mortality rate, and $\varepsilon$ the error term. Descriptive statistics for the variables used in the GLM are provided in appendix Table 12. The GLM is fitted in the R software (version 4.0.1).
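The text does not state the GLM family or link function. Assuming a Gaussian family with an identity link, the GLM coincides with ordinary least squares, which the following dependency-free sketch fits through the normal equations. The function name is hypothetical and this is not the authors' R code. For real data, an established statistics package would be preferable because the normal equations can be numerically fragile when predictors are strongly collinear.

```python
def fit_linear_model(rows, y):
    """Least-squares fit of y = b0 + b1*a + b2*b + b3*c.

    rows -- list of [gini, svi, mortality_rate] predictor values
    y    -- list of relative-risk values, one per row
    Returns the coefficients [b0, b1, b2, b3].
    """
    design = [[1.0] + [float(v) for v in r] for r in rows]   # add intercept
    k = len(design[0])

    # Augmented normal equations: (X'X | X'y)
    aug = [
        [sum(x[i] * x[j] for x in design) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(design, y))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]

    return [aug[i][k] for i in range(k)]
```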

Spatial modeling
We implement a Moran's I test on the GLM residuals to detect the presence of spatial autocorrelation (Anselin 1988; Anselin and Bera 1998) and to justify the use of the subsequent spatial modeling. First, we fit a Spatial Lag Model (Eq. 4) to estimate how the dependent variable Y in a municipality i is affected by its neighboring municipalities j:

$$Y = \lambda W Y + X\beta + \varepsilon \tag{4}$$

where $Y$ denotes the vector of the response variable, $X$ is the matrix of explanatory variables (GINI values, SVI values and mortality rate) with coefficient vector $\beta$, $\lambda$ is the autoregressive spatial coefficient (when $\lambda = 0$ the autocorrelation is null), $W$ is the spatial weights matrix, $WY$ expresses the spatial dependence in $Y$, and $\varepsilon$ is the error term. Second, we test the Spatial Error Model (Eq. 5), which controls for the spatial autocorrelation in the residuals, and thus in both dependent and independent variables:

$$Y = X\beta + \varepsilon, \qquad \varepsilon = \rho W \varepsilon + \xi \tag{5}$$

where $W\varepsilon$ is the error with spatial effect, $\rho$ is the measure of the autocorrelation of the errors of $Y$, and $\xi$ is the error component with constant and uncorrelated variance (white noise).
We applied the Lagrange Multiplier (LM) test to identify the model with the strongest explanatory power for the variable Y (Anselin 2005). This test estimates the LM-Lag for the dependency in relation to the original variables in the neighboring areas and the LM-Error in relation to the residuals in the neighboring areas. If a test is significant, a spatial regression may capture some of the spatial effects that affect the behavior of the Y variable; if both are significant, the best model should be selected according to the Akaike Information Criterion (AIC) (Anselin 1988, 1996; Anselin and Bera 1998). We used the Jarque-Bera test to examine the normality of the distribution of the errors (Anselin et al. 2010). When a spatial model is used, it is recommended to check whether the dependent variable of the target location is influenced by neighboring locations (Lesage and Pace 2009). In this context, it is possible to identify spillover effects among neighboring municipalities, for example by evaluating whether the COVID-19 relative risk of a municipality is positively or negatively related to the RR of neighboring municipalities. Here, we use GeoDa (Anselin 2005) to compute the spatial regression models.
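To make the spatial terms concrete, the sketch below row-standardizes a weights matrix, computes the spatial lag (the weighted average of neighboring values, the WY term in the lag model), and evaluates the global Moran's I statistic as it might be applied to regression residuals. The function names and the list-of-lists matrix layout are assumptions made for illustration. GeoDa performs these steps internally, along with the permutation-based inference that is omitted here.

```python
def row_standardize(weights):
    """Scale each row of a spatial weights matrix so that it sums to 1."""
    out = []
    for row in weights:
        total = sum(row)
        out.append([w / total for w in row] if total else [0.0] * len(row))
    return out


def spatial_lag(weights, values):
    """Weighted sum of neighboring values for every location (W times y)."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def morans_i(weights, values):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z mean-centered."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)        # sum of all weights
    denom = sum(d * d for d in z)
    if s0 == 0 or denom == 0:
        raise ValueError("weights or values carry no variation")
    cross = sum(zi * lag for zi, lag in zip(z, spatial_lag(weights, z)))
    return (n / s0) * cross / denom
```

Values near zero suggest little spatial autocorrelation, positive values suggest that similar residuals cluster together, and negative values suggest that neighboring residuals tend to differ.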

Emerging clusters, February 25-June 7, 2020
We detected 11 emerging space-time clusters of COVID-19 occurring in all Brazilian regions (p < 0.001) for the first period (February 25 to June 7, 2020). Among these clusters, three occurred exclusively in the north and northeast regions (Fig. 2). Seven clusters had a relative risk (RR) greater than 1 (i.e., more observed than expected cases). Cluster 1 (RR = 7.97) is located predominantly in the North region and the state of Tocantins (Center-West region) and includes 466 municipalities, with 293 municipalities showing a RR > 1. Cluster 2 (RR = 4.7) is found in the Northeast region and includes 584 municipalities, of which 180 have a RR > 1. Cluster 3 (RR = 4.15) is found in the Southeast region, including São Paulo city and 34 municipalities, of which 15 have a RR > 1. Cluster 4 (RR = 4.46) includes 274 municipalities, 48 of which have a RR > 1, and is also found in the Southeast region of Brazil, covering the states of Minas Gerais, Espírito Santo and Rio de Janeiro. Cluster 5 (RR = 5.05) includes 68 municipalities in the state of Bahia, in the Northeast region of Brazil, of which eight have a RR > 1. Cluster 7 includes only Brasília, the capital of Brazil, located in the Center-West region, with a RR of 4.39. Finally, Cluster 9 (RR = 4.24) includes 230 municipalities located in the states of Santa Catarina and Rio Grande do Sul, of which 51 have a RR > 1 (Table 1).
Table 2 shows the three municipalities with the highest RR for each emerging cluster of COVID-19 identified in Brazil from February 25 to June 7, 2020. We found the highest relative risks in the state of Amapá (cluster 1), in the North region of Brazil. Table 2 includes only three capitals (the state capitals São Paulo and Vitória, and the federal capital Brasília), which calls attention to the highest relative risks being found in countryside municipalities and some cities along the shoreline.

Emerging clusters, February 25-July 20, 2020
We detected nine emerging space-time clusters of COVID-19 with a RR > 1, which occurred in all Brazilian regions (p < 0.001) for this second period (147 days), while two more clusters were detected compared to the first period of analysis (Fig. 3). In addition, for this time series, the increase in COVID-19 cases in the countryside of Brazil is more evident (Fig. 3). In the first period, we observed 1658 municipalities within an emerging cluster; and in this new time series, we detected 3304 municipalities (an increase of almost 100%). This result shows that almost 60% of the Brazilian municipalities are within an emerging cluster with a RR > 1 (Table 4).
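The percentages quoted in this paragraph follow from the municipality counts and can be checked with a short calculation. The helper names below are hypothetical; the inputs are the figures reported in the text (1658 and 3304 municipalities within clusters, out of 5570 municipalities in total).

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


def percent_share(part, whole):
    """Share of whole represented by part, in percent."""
    return part / whole * 100


growth = percent_change(1658, 3304)   # municipalities within emerging clusters
share = percent_share(3304, 5570)     # share of all Brazilian municipalities
print(f"increase: {growth:.1f}%, share of municipalities: {share:.1f}%")
```

The first value is just under 100% and the second just under 60%, in line with the "almost 100%" and "almost 60%" stated above.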
The number of municipalities with a RR > 1 within the emerging clusters rose to 958 in the second period of analysis, an increase of 60.7% over the first period (Tables 3 and 4). Table 5 shows the three municipalities with the highest RR for each emerging cluster of COVID-19 identified in Brazil from February 25 to July 20, 2020. As observed in the first period of analysis, the municipalities with the
We observed a critical situation in the states of Amapá and Roraima, where all the municipalities have a RR > 1. In addition to Amapá and Roraima, the states of Amazonas (91.93%), Acre (86.36%), Pará (66.66%), Espírito Santo (62.82%), Ceará (60.32%) and Sergipe (56%) are in a critical situation, with more than 50% of their municipalities having a RR > 1 (Table 6). Some states intersect more than one cluster, such as the states of Maranhão (clusters 1 and 2), Mato Grosso (clusters 1 and 5), Tocantins (clusters and

Regression results
The GLM results showed a significant positive correlation between the predictor variables and the relative risk at the level of the municipalities belonging to the nine emerging clusters (Fig. 4).
The GLM results showed that all variables have a statistically significant impact on the RR of the municipalities located in the clusters. For each one-unit increase in the GINI variable, the relative risk increases by 2.02, while the corresponding increase is 0.82 for the SVI and 0.021 for the mortality rate. For comparison purposes, we fitted the same model to all municipalities in Brazil regardless of their cluster membership, and to all municipalities outside the clusters. When applied to all municipalities, the same predictor variables were significant; however, the estimates were substantially lower (closer to zero), with a decrease in the standard error. The decrease in the estimates was even more pronounced when the model was applied to the municipalities outside of the clusters.
We identified problems of multicollinearity and heteroscedasticity (Appendix Table 13), in addition to spatial dependence in the residuals, confirmed by the Moran's I statistic (Appendix Fig. 5). Although the Moran's I test showed a value close to zero (Table 7), it is still possible to identify spillovers, with high-high groups in the North region of Brazil and low-low groups mainly in the Northeast region. Table 8 also shows the result of the Lagrange Multiplier test, which recommended applying spatial regression for both the LM-Lag and LM-Error specifications.
Of the two spatial regressions performed, the spatial lag model (LM-Lag) provided the better fit based on the AIC values (Appendix Figs. 6, 7), and it also fit better than the GLM (Appendix Table 14). The Moran's I values for the LM-Lag and LM-Error residuals were 0.274 and -0.033, respectively. Based on these results, we chose the LM-Lag spatial regression model to assess how the variables GINI, SVI and mortality rate explain the variation in the RR of the municipalities located within the clusters. The results of the LM-Lag model showed that all variables analyzed have a statistically significant effect on RR, except for the SVI (Table 9). According to the model, the variables GINI and mortality rate have a significant effect (p < 0.001) on the relative risk of municipalities within the clusters. Table 10 shows the direct, indirect, and total effects. The GINI variable has the greatest positive direct (1.280) and indirect (1.414) effects, resulting in a total effect of 2.695. This indicates an increase in the RR within the municipalities due to inequality, but mainly due to the role of spillovers from neighboring municipalities. The same pattern was observed for the mortality rate for the direct (0.015), indirect (0.016), and total (0.032) effects. In the case of SVI, no significant effects on RR were observed (Table 10).
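The direct, indirect, and total effects reported for the spatial lag model can be illustrated with the standard impact decomposition of LeSage and Pace (2009), in which the effect of a covariate with coefficient β is spread through the spatial multiplier (I − λW)⁻¹. The sketch below approximates that multiplier with a truncated power series, which assumes a row-standardized W and |λ| < 1. The function name, the toy weights matrix, and the parameter values are hypothetical and are not the estimates from Table 10.

```python
def impact_measures(weights, lam, beta, terms=200):
    """Average direct, indirect and total effects of one covariate.

    Approximates (I - lam*W)^-1 by sum_{k=0..terms} lam^k W^k.
    """
    n = len(weights)
    multiplier = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in multiplier]        # current term lam^k W^k
    for _ in range(terms):
        power = [
            [lam * sum(power[i][m] * weights[m][j] for m in range(n))
             for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                multiplier[i][j] += power[i][j]

    direct = beta * sum(multiplier[i][i] for i in range(n)) / n    # own effect
    total = beta * sum(sum(row) for row in multiplier) / n         # all effects
    return direct, total - direct, total


# Toy example: four locations on a line, row-standardized weights.
toy_w = [[0, 1, 0, 0], [0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5], [0, 0, 1, 0]]
direct, indirect, total = impact_measures(toy_w, lam=0.5, beta=2.0)
```

With row-standardized weights the total effect reduces to β / (1 − λ), so the indirect (spillover) share grows as λ approaches 1.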

Sensitivity of scan parameters
In this paper, we selected spatial and temporal scan statistic parameters in line with previous work (Desjardins et al. 2020; Hohl et al. 2020a, b). We conducted a sensitivity analysis by parameterizing different spatial and temporal scanning windows: (1) 25% of the population at risk and 25% of the study period, (2) 10% and 50%, (3) 10% and 25%, (4) 5% and 50%, (5) 5% and 25%, and (6) 5% and 10%. In general, increasing the temporal scanning window results in clusters that start much earlier than with smaller search windows (e.g., 50% versus 10%), essentially capturing a longer temporal range of cases. As such, increasing the temporal cutoff too much may obscure smaller space-time clusters that happened more recently in Brazil. Spatially, the general location of the clusters is very similar when decreasing the spatial scanning window from 25% to 5%. Larger spatial scanning windows may encompass multiple clusters that were detected using smaller windows (e.g., 25% versus 5%). This is due to the relaxation of the population constraint, which allows the windows to capture more municipalities and results in very large clusters in rural regions of Brazil (e.g., the State of Amazonas). The spatial overlap of the clusters obtained with different scanning windows provides some confidence that the COVID-19 clusters are "stable" regardless of the parameter selection.
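The six parameter combinations can be written down as a small configuration grid, which makes it straightforward to loop over them when rerunning the scan. The structure and names below are illustrative and are not a SaTScan parameter file. The baseline of 10% of the population at risk and 50% of the study period is the setting used for the main analysis.

```python
from typing import NamedTuple


class ScanWindow(NamedTuple):
    max_population_pct: int      # maximum spatial window, % of population at risk
    max_study_period_pct: int    # maximum temporal window, % of study period


BASELINE = ScanWindow(10, 50)

SENSITIVITY_GRID = [
    ScanWindow(25, 25),
    ScanWindow(10, 50),
    ScanWindow(10, 25),
    ScanWindow(5, 50),
    ScanWindow(5, 25),
    ScanWindow(5, 10),
]


def alternatives(grid, baseline):
    """Return every configuration in the grid other than the baseline."""
    return [window for window in grid if window != baseline]
```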

Discussion
Comparing the first period (103 days) with the second (with an increase of 44 days), the results show a marked increase in COVID-19 in Brazil, demonstrated by the 61% increase in the number of municipalities with RR > 1 within the detected clusters. This increase is consistent with the study by Peixoto et al. (2020), who showed that risk regions are not only those closest to the capitals where the outbreak began, but also include interior cities. The choropleth maps presented in this paper (Figs. 2 and 3) are based on the calculations of the relative risk, which reflects the relationship between the total number of confirmed cases and the expected cases estimated from the population of the municipality. In the first time period (February 25-June 7), municipalities with RR > 1 were identified mainly in the North, in the Northeast, and close to the coast, associated with the main state capitals, such as São Paulo and Rio de Janeiro. In the second period (February 25 to July 20), there was an increase in the number of municipalities with RR > 1 in the North and in the coastal regions, as well as in the interior of the country. Costa et al. (2020) identified that some states had outbreaks that started mainly in their capitals, followed by epidemic waves that spread toward the interior, while other states had multiple initial outbreaks.
Inequality (GINI) and mortality rate have positive direct, indirect, and total effects on the relative risk detected in the Brazilian municipalities. This finding is related to the spread of COVID-19 to the countryside, where there is high social inequality, mainly in the municipalities of the North and Northeast regions (74.3% of the municipalities with RR > 1). The states of the South and Southeast regions have a lower concentration of income, while the Center-West, North, and Northeast regions have higher levels of inequality (Colombo and Ferreira 2019). Coelho et al. (2020) highlighted that these areas would suffer an increased spread of the disease in populations with greater socioeconomic vulnerability.
In discussing the political aspect of the virus, Smith and Judd (2020) consider it relevant "to reflect on who is most vulnerable in pandemics". This question is based on the argument that, although COVID-19 can affect the whole of society, its effects will be experienced in different ways depending on the level of equity that exists in each social reality. It is therefore essential to analyze the pandemic and the policies that emanate from it from the perspective not only of health, but also of social and economic determinants (Smith and Judd 2020).
The municipalities with the greatest inequality will likely be the regions with the highest incidence and death of COVID-19 in Brazil. Our results indicate that inequality is a significant variable that explains RR increase in the municipalities with spatial spillover effects. The inequality of income and the lack of access to services are sufficient to suggest that there is a disproportionate effect of COVID-19 among the most vulnerable in the country (Pires et al. 2020). In addition, there is a lack of protocols and measures aimed at the social protection of these populations in the atypical context of a pandemic; so when clusters with high risk show a high mortality rate, this can guide decisions for these municipalities.
The relationship of active clusters with indices that express inequality in the country may represent the beginning of a problematic scenario, especially in the most vulnerable municipalities. This may result from the difficulty of enforcing social isolation given the need to maintain employment and income, as well as from limited access to health care and basic sanitation (Pires et al. 2020). In the study carried out by Fiocruz (2020) on April 2, 2020, the most vulnerable regions of Brazil were identified in the North and Northeast. In our study, we found the highest relative risk in the North and Northeast in both time periods, corroborating Fiocruz (2020). Although the North region of Brazil has low levels of urbanization, river migration is related to the spread and progression of the disease in the municipalities of the State of Amazonas, unlike other urbanized regions (Aleixo et al. 2020).
Examining the adherence to social distancing guidelines requires a more detailed analysis, which was beyond the scope of this study. Amazonian communities such as Indigenous peoples and riverside populations are geographically isolated; however, they have been impacted by COVID-19 (ISA 2020). Therefore, actions need to account for the geographic, social, and cultural differences between these communities and those living in urban areas. We present an exploratory study that identifies associations between the relative risk of COVID-19 clusters and mortality, inequality, and socioeconomic vulnerability in Brazilian municipalities, but there are not enough elements to detail demographic particularities of the population.
An important correlation between clusters with high relative risk and mortality rates and inequality is observed, but the method is sensitive to the scale adopted (Chen et al. 2008); therefore, we hypothesize that the findings will be different and may be more severe regarding relative risk if finer-level data were available. The pandemic is still spreading in Brazil, and it is difficult to estimate the speed of transmission along the countryside, where small population municipalities are located. However, our research highlights the regions that are experiencing the highest risk of COVID-19, which is critical for improving public health decision-making. Preventive measures must be strengthened and adhered to, while the only strategy that has proved effective for the control of COVID-19 has been social distancing (de Oliveira et al. 2020).
Despite the strengths of this study, there are several limitations worth addressing in future studies. First, the cylindrical scanning windows of the prospective scan statistic may not capture the true shape of the COVID-19 outbreaks. However, this is an exploratory study, and cylindrical scanning windows are widely used and accepted in the SaTScan literature. We encourage more research on developing irregular search windows, similar to those of Tango and Takahashi (2005) and Wu and Grubesic (2010), but extended in time. Second, we do not provide daily surveillance of COVID-19 in Brazil, as this was beyond the scope of our research objectives. Rather, we provide two snapshots of the situation in South America's largest country. Future research can provide daily updates, similar to Hohl et al. (2020b). Third, there are inherent biases in the dataset that we used. As in many other countries, testing was not always accessible when the pandemic reached Brazil; as such, our findings may be affected by underreporting, although the country has since ramped up testing. Fourth, we used different covariates to explain the variation in relative risk (inside and outside clusters). One avenue worth modeling in future research is the persistence of a cluster over time, an approach suggested by He et al. (2017). This approach is worth pursuing once the study period is long enough. Finally, we did not adjust the p-values for multiple analyses; however, Kulldorff and Kleinman (2015) suggest that adjusting p-values should be avoided for long sequences of data, especially in a spatiotemporal context (i.e., it becomes increasingly difficult to reject the null hypothesis).
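To make the last point concrete, the following minimal sketch shows how a per-analysis significance threshold shrinks as the number of repeated analyses grows. It uses a Bonferroni correction purely as a simple illustration; this is an assumption for the example, not the adjustment discussed by Kulldorff and Kleinman (2015) and not part of our analysis. The function names, the p-value of 0.001, and the numbers of analyses are all hypothetical.

```python
def bonferroni_threshold(alpha: float, n_analyses: int) -> float:
    """Per-analysis significance threshold under a Bonferroni correction."""
    if n_analyses < 1:
        raise ValueError("n_analyses must be at least 1")
    return alpha / n_analyses


def is_significant(p_value: float, alpha: float, n_analyses: int) -> bool:
    """True if p_value passes the Bonferroni-adjusted threshold."""
    return p_value <= bonferroni_threshold(alpha, n_analyses)


# Hypothetical cluster p-value, evaluated against 1, 30, and 147 repeated
# (e.g., daily) analyses at an overall alpha of 0.05.
for n in (1, 30, 147):
    threshold = bonferroni_threshold(0.05, n)
    print(f"{n:>3} analyses: threshold = {threshold:.6f}, "
          f"p = 0.001 significant: {is_significant(0.001, 0.05, n)}")
```

In this illustration, the same hypothetical p-value passes the threshold after 30 analyses but fails it after 147, which is the sense in which a long sequence of analyses makes the null hypothesis increasingly difficult to reject.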

Conclusion
This research presented an analysis of the dynamics of the expansion of COVID-19 based on the number of daily cases by municipality, with the intent of identifying emerging space-time clusters active in Brazil in the first five months of the pandemic. We detected nine significant active clusters of COVID-19 within Brazil on July 20, 2020. This space-time approach to detecting emerging clusters allows decision-makers to identify statistically significant hotspots of COVID-19 cases. States are responsible for coordinating activities at the regional health level. These regions can use our results to optimize the coordination and organization of health care needs, specifically in relation to the poorest populations and those with the highest health-care demand. Our approach may also allow authorities to pay attention to municipalities that still have few or no cases, so that they can be prepared to face the burdens of COVID-19. In turn, this can improve the management of resources allocated to the States and Health Regions.

Funding Open Access funding provided by University of Eastern Finland (UEF) including Kuopio University Hospital. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Compliance with ethical standards
Conflict of interest The authors declare that they have no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.