The heterogeneous regional effect of mobility on Coronavirus spread

The Coronavirus (COVID-19) pandemic struck global society in 2020. The pandemic required the adoption of public policies to control the spread of the virus, notably mobility restrictions. Several studies show that these measures have been effective. Within the topic of Coronavirus spread, this original paper analyses the effect of mobility on Coronavirus spread in a heterogeneous regional context. A multiple dynamic regression model is used to control for sub-national disparities in the effect of mobility on the spread of the Coronavirus, as well as to measure that effect in the context of Spanish regions. The model includes other relevant explanatory factors, such as wind speed, sunshine hours, vaccinated population and social awareness. The paper also develops a new methodology to optimise the use of Google Trends data. The results reveal heterogeneity among regions, which has important implications for current and future pandemic containment strategies.


Introduction
On 11 March 2020, the World Health Organization (WHO) declared the emergency caused by the outbreak of a new Coronavirus to be an international pandemic. Previously, Asian countries such as China (particularly the Wuhan region), South Korea and Singapore had been severely affected by the Coronavirus, necessitating actions to contain its spread. By early March 2020, Italy (with the outbreak in Lombardy) was the only European country that had applied restrictions. Subsequently, most Western countries applied more severe restrictions to contain the spread. Spain, France, Germany, the UK, the USA, and many others applied new restrictions, the strictest being home confinement. These policies were extended throughout 2020 and 2021, changing continuously with the different waves and being relaxed gradually as vaccination progressed, until the last months of 2021. In addition, restrictions varied over time, including perimeter closures, time restrictions, capacity reductions and the closure of certain areas.
The aim of these policies was to increase social distancing and reduce the personal contact responsible for Coronavirus spread [1,2]. One consequence of this policy is the reduction in economic activity, as it is closely related to the spread of viral diseases (among them, Coronavirus) and to human mobility [2-5].
a e-mail: jm.amoedo@usc.es (corresponding author)
Various authors have analysed the mobility restrictions that most Western governments used to contain the pandemic. Studies follow three different lines that are closely related but use different methodologies and measures. The first line examines the effects of mobility restrictions on the evolution of Coronavirus spread, focussing on lockdown policies [2,4,6]. Such papers perform a two-step analysis, first examining the efficacy of mobility restrictions by measuring the changes in human mobility and then analysing the correlation between mobility and Coronavirus spread. The second line analyses the effect of mobility on Coronavirus spread without specifically evaluating the restrictions [1,7-9]. A third line pursues questions closely related to mobility, such as transport accessibility [10].
Socioeconomic, physical or biological studies of the factors that determine the spread of the illness have focussed on analysing individual countries [1,10-13], groups of countries [2,14-17], and to a lesser extent regions and cities [6-9]. The literature has shown the effect of specific factors (mobility, weather, vaccination process, social awareness, lockdown of areas with high concentrations of people or transport accessibility, among others) in specific geographic areas by analysing the average effect of each factor in each region. However, socioeconomic studies of regional scope should consider the possibility of heterogeneous behaviour in different regions in the presence of certain factors, as occurs with mobility restrictions.
Accordingly, the novelty of this paper is to control for sub-national disparities in the effect of mobility on the spread of the Coronavirus, as well as to measure that effect. Moreover, it includes the vaccinated population as a variable in the analysis.
The study focuses on Spanish regions for three main reasons: the high incidence of COVID-19 in Spain, the high level of decentralisation and thus regional heterogeneity, 2 and the high level of vaccination. Such knowledge is very important for adapting national and regional government actions to the reality of each territory.
The factors that explain the spread of Coronavirus can be grouped into four groups: mobility, which is closely related to social distancing; social awareness; environmental aspects; and other factors.
Concerning the first group of factors, different authors have analysed the impact of human mobility and economic activity on the spread of viral diseases before the Coronavirus pandemic [3,18] and in its wake [4,12,19]. Using datasets from France, Adda [3] observes a positive relationship between interregional trade and faster spread of a viral disease. The study shows how the globalisation process (expansion of transportation networks and growth in trade) can explain part of the rise in the transmission rate of viral diseases in the last quarter of the twentieth century. Epstein et al. [18] focus on restrictions on passenger flights and the spread of flu in the United States, showing how flight restrictions, combined with other restrictions, slow the spread but at an economic cost. Analysing the effect of mobility on Coronavirus spread, Kuo and Fu [12] demonstrate that reopening business activities accelerated the spread. Another interesting question, studied by Adda [3] and Kuo and Fu [12], involves the need for restrictions to contain the spread of viral diseases. These restrictions are closely related to the economy and to human mobility. Thus, the trade-off between public health and economic activity should be considered. Most restrictions limit mobility (people and trade) and personal contact (social distancing), closing public spaces and institutions with high concentrations of people (e.g., schools, sport stadiums, cultural events) or closing shops and businesses [2,12].
2 Heterogeneity is shown in different fields, such as population concentration, economic activity, population structure, communication routes and decentralisation of policies, among others. In addition, other fields are important, such as differences in weather factors or the decentralisation of healthcare (including the vaccination process).
The main conclusion of previous research on this topic is the need for public action, involving mobility restrictions, to contain the spread of any viral disease that constitutes a special potential danger for society. Several studies of mobility restrictions and lockdowns in different parts of the world support this conclusion in the case of the Coronavirus [1,2,4,6,9,12,19]. A considerable number of them analyse countries and regions strongly affected by the pandemic in its initial stages. China (specifically Wuhan) and Italy have received the most attention. Other studies reach the same conclusion using international datasets [2,20]. Nevertheless, imposing strict mobility constraints on the movement of people and freight within and beyond national boundaries in the modern and global era is detrimental to the economy and to business development [2].
With respect to the first country affected, China, various works on issues such as the Wuhan lockdown reach essentially the same conclusion: mobility restrictions are effective if the key aim is to contain Coronavirus spread [6,9]. Other papers stress the effect of mobility on the spatial and temporal spread of Coronavirus, showing how mobility generates different numbers of infections and deaths at different stages of the pandemic [8].
Research on Italy and China reaches very similar conclusions. Mobility plays a central role in the spread of Coronavirus in Italy's regions, as do other factors, such as temperature, air pollution and nearness to outbreak points [1]. Other authors consider transport accessibility the most significant factor explaining the Coronavirus pandemic [10]. In parallel, other works concur on the importance of mass transport in Coronavirus spread [2] and the need to reduce its use through stay-at-home policies [21]. Many of these aspects will be discussed below and some of them will be included in the empirical analysis.
As mentioned above, the Coronavirus pandemic has had a global impact, but it has struck some countries harder than others. Although China was the first country affected, it was not ultimately the hardest hit. The US, the UK, Italy, and Spain were the countries most affected during the first stage. Later, other countries such as Brazil or India were harder hit. The greater severity of the pandemic in these countries can be explained by social and institutional factors, such as a higher proportion of elderly people, employment in the service sector and globalisation [2], while the differences in timing can be explained by climatic seasons, among other factors. That study also shows the positive effect of mobility restrictions. It finds that reducing mobility in transit stations (TS), retail and recreation facilities (RR) and workplaces (WP) reduces pandemic severity (PS). Conversely, increased mobility in residences (RD) leads to a reduction in pandemic severity.
The analysis of the datasets of five large cities in the US shows striking differences among cities in the effect of mobility reduction on the decline in cases per capita. New York City is the most significant case, and lockdown policies again play an important role [7]. Bonaccorsi et al. [22] analyse the effects of mobility restrictions on economic activity (which is closely connected to human mobility) in Italian cities but neither establish nor measure the effect of these restrictions on the spread of the virus. Cartenì et al. [1] use mobility among Italian regions but obtain only an average effect at the national level. In sum, the regional analyses in the mentioned studies do not consider the possible heterogeneity of regions or measure the impact of mobility in each region.
There are two reasons for these omissions. The first reason involves both the use of cities, a different concept than the region, and the use of different measures of mobility and variables for each city. This approach prevents researchers from isolating the effect of mobility. The second reason is that the impact of mobility on Coronavirus spread is not measured in each region. This occurs because the studies either establish only the impact of mobility restrictions, not the relationship itself, or calculate the average impact for all regions.
Regarding the second group of factors, Milani [15] and Tiwari et al. [23] identify social awareness as another important factor in Coronavirus spread. As the pandemic advanced, public attention and fear of Coronavirus changed. Social awareness is significant in the fight against Coronavirus spread. Prior studies, such as Geoffard and Philipson [24], argue for the need for social awareness in the fight against Human Immunodeficiency Virus (HIV). A higher social awareness of the pandemic's personal and social consequences is usually associated with greater self-protection and respect for third parties' safety. Such awareness impacts pandemic severity negatively, reducing new cases and deaths. Several recent works have analysed this relationship in different countries and regions using Google Trends datasets with different terms and topics [14,15,25].
Google Trends enables measurement of the population's risk perception by selecting terms closely related to the pandemic. This indicator of social awareness may be more accurate than official new cases or deaths, which are an imperfect measure [15]. Fantazzini [14] analyses risk perception in 158 countries, showing that this indicator provides useful information for measuring social awareness using the topics "Coronavirus" and "pneumonic" as health categories.
Most existing research examines the impact of fear on the stock market and financial economy. This research demonstrates that an increase in Google searches with different terms related to COVID-19 (e.g. "Coronavirus", "corona", "World Health Organization", "virus", "COVID-19", "Symptom", "unemployment", "laid off") is significantly and negatively associated with variations in stock market prices [26-28]. This literature also uses Google Trends data to measure other issues, such as the level of self-protection against the virus [25] and public attention. These issues are closely related to communications from public health authorities [29].
A common and relevant aspect of the literature discussed in the preceding paragraphs is its focus on the early phases of the pandemic. In these initial phases, the lack of knowledge about the Coronavirus was remarkable; thus, the effects of social awareness should become less important over time and even disappear.
With respect to the third group of factors, several studies analyse weather conditions, such as temperature [30], humidity or precipitation [30-33], daylight hours [31-34], wind speed [31,32,35] or air pollution [1]. The weather conditions are deeply interrelated. Specifically, high temperature and low relative humidity generate a significant reduction in virus spread. The second factor is more relevant, because the distance travelled by the droplet cloud and its concentration remain significant at any temperature if the relative humidity remains high [30,31]. There is an exception to this, since aerosol particles increase in high-temperature and low-humidity environments, contributing to Coronavirus spread [32].
Other works include sunshine hours as an alternative factor to explain Coronavirus spread [31,34]. More sunshine hours increase the number of new cases and deaths [31]. Sagripanti and Lytle [34] indicate that Coronavirus aerosolised from infected patients and deposited on outdoor surfaces may remain infectious for a considerable time during the winter in many temperate zones. Specifically, 90% of Coronavirus particles are inactivated after 11-34 min of exposure to sunlight in most parts of the world during the summer. In contrast, the virus will remain infectious for at least one day in winter.
The last factor analysed is wind speed, which increases the number of cases and deaths [31]. Improper airflow and higher wind velocity can strongly increase the travelling distance of aerosol particles and droplets, the social distancing needed, and the risk of Coronavirus transmission [32,35]. One limit on the increase in virus spread generated by the wind is the wearing of face coverings (generally masks). Chea et al. [35] analyse the use of masks (specifically, the N95 mask) in Coronavirus spread, finding it highly effective.
Concerning the last group of factors, some studies analyse issues that are related to society and institutions, diagnostic capability, or the vaccination process. Among the socioeconomic and institutional factors related to Coronavirus spread, population density [1,2], elderly people, globalisation level, employment in the service and agricultural sectors, and education level [2] are common. Another relevant factor is the number of swabs performed, which is an important control variable [1] because it is closely related to the diagnosis of Coronavirus.
Finally, the vaccination process has been examined to a lesser extent due to its recent onset. Prior to the start of the vaccination process, two main groups of studies can be identified. First, some authors study issues such as the herd effect, showing the risk inherent in strategies that pursue it as their main goal [36] or the requirements necessary to achieve it [37]. Second, other works analyse the effect of the share of the population vaccinated against other diseases (vaccines used against similar respiratory viruses) in different countries, showing a significant reduction in Coronavirus spread [38]. Recently, different papers have directly studied the effect of the vaccination process on Coronavirus spread and mortality. These studies show the reduction in Coronavirus spread and mortality generated by the vaccination process (increase in the population vaccinated with one or two doses) due to increased immunisation. Some works highlight the need to combine the vaccination process and mobility restrictions to avoid Coronavirus spread [39,40].
The literature review suggests the significance of various factors in Coronavirus spread. After analysing these factors, empirical analyses were performed to test their impact in Spain from a regional perspective [39,40].

Materials
As stated above, one aim of this paper is to investigate the influence of mobility restrictions on Coronavirus (COVID-19) spread during the pandemic. The unit of regional analysis was NUTS 2 for Spain. The data used in the estimates were collected from the following sources: [...] March and April were omitted due to a change in the data collection methodology by the Gabinete de Prensa del Ministerio de Sanidad, Consumo y Bienestar Social del Gobierno de España [52]. 7 [...] 2020 and 15 November 2021 8 (described in Appendix A.1, Table 5).
7 Although the primary data source is cited here, a dataset cleaned and standardised by Merelo [45], available on his GitHub account, was used in the analyses.
A pooled dataset with 10,353 observations for 17 regions over a period of just over 22 months was created using daily data. The pooled data provided a more complete database and enabled control of individual heterogeneity and more accurate identification of the adjustment dynamics.

• Explanatory variables
Main objective: habitual mobility 9 was used to study mobility habits. This variable represents the percentage of population that leaves its area of residence 10 during working hours in each region of Spain. It serves as a proxy for variation in labour mobility and is based on aggregate data (total origin-destination flows) [46].
• Other control variables:
• Social awareness: captures the search interest of the term "Coronavirus" throughout the analysed period. Google Trends provides a standardised time series, in which the day with the highest relative number of searches takes the value of 100. The regional values are also standardised, assigning the value 100 to the region with the highest relative volume of searches throughout the period. One limitation is that the Google Trends API provides weekly or monthly rather than daily data for periods longer than 3 months. This limitation makes it impossible to obtain daily panel data for series longer than 3 months, greatly reducing the utility of this data source.
8 The time period used has subsequently been shortened, due to the reduced time effect of this variable (noted above), to the period from 15 March 2020 to 31 October 2021.
9 Habitual mobility estimates the percentage of people who leave their area of mobility during working hours (10:00 AM to 4:00 PM) for more than 2 hours on the same day.
10 Area of residence is the area in which cell phones in Spain are located for the longest time between 00:01 AM and 06:00 AM for at least 60 days. The sample includes about 80% of the mobile market. The 3200 mobility areas identified are defined as population clusters of 5,000-50,000 inhabitants. In a depopulated area, the mobility area would be the sum of several small or very small municipalities (up to 5000 inhabitants). However, in cities, these areas could be districts or even parts of districts.
In view of the foregoing, this paper develops a new methodology to overcome this limitation by combining three processes. First, quarterly regional data are panelised using the repository of Coello [47]. Second, the data are connected by a simple chain index [48]. Finally, the data are standardised using the procedure proposed by Narita and Yin [49] (see Appendix A.2 for more information).
This represents a relevant methodological contribution, which provides valuable tools for future studies that use Google Trends as an information source. The process developed here to obtain a daily data pool for periods longer than 3 months enables the use of Google searches in many fields of knowledge for long periods, extending the study by Narita and Yin [49].
• Sun: measured as the average of the sunlight hours of all the meteorological stations collected in AEMET [44] for each of the regions.
• Wind speed: measured as the average of the wind speed of all meteorological stations collected in AEMET [44] for each of the regions.
• Vaccination: measured by the percentage of vaccinated population [45].
• Region: dummy variable for each of the 17 analysed regions.
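The chain-linking idea behind the daily social awareness series can be sketched as follows. This is a minimal illustration and not the procedure of Appendix A.2: it assumes that consecutive Google Trends windows share exactly one overlapping day and that the value on that day is non-zero. The function names `chain_link` and `rescale_to_100` and all sample values are hypothetical.

```python
def chain_link(windows):
    """Link consecutive windows that overlap on one day into one series.

    Each window is a list of daily values that were scaled independently.
    Assumption: the first day of a window is the same calendar day as the
    last day of the previous window.
    """
    linked = list(windows[0])
    for window in windows[1:]:
        if window[0] == 0:
            raise ValueError("overlap day has zero searches; cannot rescale")
        # Ratio that puts the new window on the scale of the linked series.
        factor = linked[-1] / window[0]
        # Skip the overlap day itself, which is already in `linked`.
        linked.extend(value * factor for value in window[1:])
    return linked


def rescale_to_100(series):
    """Standardise so that the maximum of the whole series equals 100."""
    peak = max(series)
    return [100 * value / peak for value in series]


# Hypothetical values for two overlapping windows.
first = [50, 100, 80]
second = [40, 20, 10]  # second[0] is the same calendar day as first[-1]
daily = rescale_to_100(chain_link([first, second]))
```

The final rescaling keeps the Google Trends convention described above, in which the day with the highest relative number of searches takes the value 100.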

Methods
The goal of this study is to estimate the effect of mobility on the spread of the Coronavirus in a geographical area of regional scope. Thus, the study aims to examine, not only to control for, the magnitude of possible regional heterogeneity. This goal requires parametric estimation. Among the feasible alternatives, linear parametric estimation is the most appropriate because it is easy to understand and interpret, and because it approximates more complex nonlinear specifications well. The analysis starts from a pooled specification, Eq. (1), where the subscript i refers to the region (i = 1, ..., 17) and j indexes the variables (j = 1, 2, 3, 4). The Least Squares (LS) procedure is then performed. To confirm the appropriateness of the proposed equation, one must first test the hypothesis of heterogeneous effects of mobility on the spread of Coronavirus among regions. This test is performed using the Wald Test [50] to determine whether the parameters α_i are homogeneous across regions. The null hypothesis of homogeneity is rejected, confirming that the effects of mobility differ among regions.
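As a reading aid, a form of the pooled specification consistent with the definitions in the text can be sketched as follows. It is an assumption for illustration, not the authors' Eq. (1): the symbols $y_{it}$ (new cases per 1,000,000 inhabitants), $M_{it}$ (habitual mobility), $X_{j,it}$ (taken here to be the four control variables), the lag lengths $\ell_j$, the intercept $c$ and the single autoregressive lag are all hypothetical. Only $\alpha_i$, $i$ and $j$ come from the text, and the 14-day lag on mobility is the one reported in the results.

```latex
% Hypothetical sketch of a dynamic pooled specification
y_{it} = c + \rho\, y_{i,t-1} + \alpha_i\, M_{i,t-14}
       + \sum_{j=1}^{4} \beta_j\, X_{j,i,t-\ell_j} + \varepsilon_{it},
\qquad i = 1,\dots,17 .

% Null hypothesis of a homogeneous mobility effect across regions
H_0:\ \alpha_1 = \alpha_2 = \dots = \alpha_{17}

% Wald statistic for the 16 linear restrictions R\theta = 0, where
% \theta stacks all coefficients and \widehat{V} is their estimated covariance
W = (R\hat\theta)^{\top}\bigl(R\,\widehat{V}\,R^{\top}\bigr)^{-1}(R\hat\theta)
    \ \xrightarrow{d}\ \chi^2_{16}
```

Under this sketch, a value of $W$ above the chosen critical value of the $\chi^2_{16}$ distribution rejects $H_0$, which is the outcome reported in the text.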
Since the lagged dependent variable is correlated with the errors, even if there are no autocorrelation problems, the estimators used in static models would be inconsistent. Therefore, it is necessary to resort to estimators based on the Panel Generalized Method of Moments (GMM) [51]. Because this is a daily sample, the Durbin-Watson Test is used to detect possible autocorrelation problems. The result of the alternative estimation, the Panel Generalized Method of Moments (GMM), is presented in column (2).
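The Durbin-Watson statistic mentioned above can be computed from a series of residuals as in the following minimal sketch. The function name `durbin_watson` and the residual values are hypothetical and are not taken from the estimations in this paper.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: the sum of squared first differences of
    the residuals divided by their sum of squares. Values near 2 suggest
    no first-order autocorrelation, values near 0 positive autocorrelation
    and values near 4 negative autocorrelation."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals with alternating signs (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0]
# Hypothetical residuals that drift slowly (positive autocorrelation).
drifting = [1.0, 1.1, 1.2, 1.1, 1.0]

dw_alternating = durbin_watson(alternating)
dw_drifting = durbin_watson(drifting)
```

In a pooled daily sample, the differences would typically be taken within each region, so that the last residual of one region is not compared with the first residual of the next.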

Results
The results of estimating Eq. (1) using different methods are presented in Table 1.
The variable mobility habits was estimated independently for each region, and globally for all the regions (variables Mobility Habits), in both models to obtain each regional effect. Thus, the estimated coefficients for each region must be added to the overall effect. To avoid multicollinearity, one region (specifically Andalusia) has been omitted, with its estimated effect corresponding to that of the global variable. The other variables were estimated jointly for all regions.
In both estimations, mobility habits (habitual mobility) is positively related (in most regions) to Coronavirus spread after 14 days. For example, in Aragon, an increase in mobility habits of one percentage point (mean of 14.49%; see Appendix A.1, Table 4) induces a statistically significant increase of 4.454 cases per 1,000,000 inhabitants (mean of 100 in the analysed period) in Coronavirus spread 14 days later. In other words, if mobility habits decrease by one percentage point in Andalusia, the reported cases of COVID-19 in Andalusia can be expected to decrease by 6 people (over the total population) (all regional data are presented in Appendix A.1, Table 8). These results are similar to the findings of Cartenì et al. [1]. One major conclusion from the estimated coefficients of the effect of mobility habits on Coronavirus spread is the regional heterogeneity of these effects. Map 1 presents this regional heterogeneity in the effect of mobility. The effects in the regions of the Cantabrian coast (northern and north-western Iberian Peninsula), including Galicia (2892), Asturias (2929) and Cantabria [...]. The two archipelagos (Canary and Balearic Islands) have similar situations (2290 and 3326, respectively), with very low effects of mobility, mainly in the former (Fig. 1).
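The way a regional effect is read from the estimates, and converted from cases per 1,000,000 inhabitants into an absolute number of cases, can be sketched as follows. The function names, the split of the coefficient into a global and a regional part, and the population figure are hypothetical; only the total of 4.454 cases per 1,000,000 inhabitants is taken from the text.

```python
def regional_effect(global_coef, regional_coef=0.0):
    """Total mobility effect for a region: the global coefficient plus the
    region-specific coefficient (0 for the omitted reference region)."""
    return global_coef + regional_coef


def expected_case_change(effect_per_million, population, delta_pp):
    """Expected change in daily reported cases, 14 days later, for a change
    of `delta_pp` percentage points in habitual mobility."""
    return effect_per_million * (population / 1_000_000) * delta_pp


# Hypothetical split, chosen only so that the total equals the 4.454 cases
# per 1,000,000 inhabitants quoted in the text for Aragon.
total = regional_effect(global_coef=3.0, regional_coef=1.454)

# Hypothetical population of 2,000,000; mobility falls by one point.
change = expected_case_change(total, population=2_000_000, delta_pp=-1)
```

With these illustrative numbers, a one-point fall in habitual mobility corresponds to roughly nine fewer reported cases per day two weeks later.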
If social awareness increases by one percentage point, the severity of the pandemic decreases by 0.129 cases (in the dynamic estimation) 10 days later. These values establish an inverse relationship, as expected and in line with other studies [15], although it is not statistically significant. This could be due to the loss of effect over an extended period such as the one under discussion.
An increase in Sun of one daily sunlight hour induces a statistically significant decline of 0.815 cases per 1,000,000 inhabitants 14 days later. If wind speed increases by 1 km/h, the severity of the pandemic decreases by 0.262 cases per 1,000,000 inhabitants 14 days later, although this effect is not significant.
If the vaccinated population increases by one percentage point, new cases 44 days later decrease significantly, by 0.524 cases per 1,000,000 inhabitants.
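Each effect above relates new cases on a given day to the value of an explanatory variable a fixed number of days earlier. A minimal sketch of how such lagged pairs could be built for one region is shown below; the function name `lagged_pairs`, the sample values and the 2-day lag (used for brevity instead of the 10 or 14 days in the text) are hypothetical.

```python
def lagged_pairs(cases, regressor, lag):
    """Pair each day's cases with the regressor value `lag` days earlier.

    Both lists hold one region's daily values in chronological order.
    The first `lag` days have no lagged value and are dropped.
    """
    if len(cases) != len(regressor):
        raise ValueError("series must have the same length")
    return [(cases[t], regressor[t - lag]) for t in range(lag, len(cases))]


# Hypothetical 5-day series for one region.
cases = [10, 12, 15, 11, 9]
mobility = [14.0, 14.5, 15.0, 13.5, 13.0]
pairs = lagged_pairs(cases, mobility, lag=2)
```

Building the pairs region by region avoids matching the cases of one region with the lagged mobility of another when the data are pooled.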

Conclusions
This paper analyses the heterogeneous spread of Coronavirus in Spanish regions. To study the relationship between mobility and Coronavirus spread, mobility was measured by the number of people who leave their geographic area, understood as the surrounding area inhabited by 5000 people. This mobility is closely related to the lockdowns decreed by Spanish regions at different stages of the pandemic, measures that attempt to contain the pandemic and reduce cases. This research found that these measures do not have the same effect in all regions, as the relationships between nearby areas differ greatly.
The analysis shows the heterogeneous effect of Coronavirus on Spanish regions. This paper confirms that daily new Coronavirus cases are directly related to mobility habits 14 days earlier. Other issues, such as social awareness, weather factors and vaccinated population, are also relevant in explaining Coronavirus spread. This paper also considers the percentage of vaccinated people as a relevant factor in explaining Coronavirus spread. The inclusion of this key aspect represents a step beyond previous studies. Specifically, the high significance of the percentage of vaccinated people in reducing the number of new infections is confirmed. It should be noted that Spain is among the countries with the highest vaccination rates.
The policy implications of these heterogeneous effects suggest applying different measures at the regional level. Thus, establishing mobility restrictions in the NUTS 2 regions of central Spain would have a stronger effect. During the first stage of restrictions (March-June 2020), the same measures were applied throughout Spain. Afterwards, different levels of de-escalation were applied based on other criteria, such as population size. Later, other restrictive policies (such as time restrictions, capacity restrictions or geographic closures) were applied with the same objective. Subsequently, the measures were primarily decided and applied at the regional level, allowing governments to tailor their policies better to regional specificities.
The main limitations of this research involve the availability and quality of the data. More granular regional data were needed for mobility areas.
Finally, analysing the implications of heterogeneous context for restrictive policies at different regional and local levels requires further investigation. Future research could also extend this study to develop new modelling that includes more territorial variables to control for spatial heterogeneity.
Acknowledgements The authors are grateful for the funding of the Fundación Segundo Gil Dávila to José Manuel Amoedo.
Funding Information Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.