1 Introduction

Spatial agglomeration is a central aspect of human life and of the geographic space in which most economic and social exchanges take place (Bairoch 1988). The size and shape of this geographic space have key implications for policy design, as they affect the regular patterns of mobility and interactions of people, goods, and ideas. This functional reality is weakly captured by the usual political–administrative units. Functional territories, as we call them in this study, represent a complex socio-spatial picture of overlapping markets between “areas or locational entities which have more interaction or connection with each other than with outside areas” (Brown and Holmes 1971, p 57), and with high frequency of economic and social interactions between their inhabitants, organizations, and firms (Berdegué et al. 2011).Footnote 1

The delimitation of these functional territories is a decisive factor in several important issues in urban and regional economics, such as proper identification of places for place-based development approaches (Pike et al. 2011; Berdegué et al. 2014a), estimating agglomeration effects and human capital externalities (Dingel et al. 2018), and measuring spillovers from urban to rural areas (Berdegué et al. 2015). In general, the size and shape of these spatial units may have an important effect on a wide range of economic geographic estimations (Briant et al. 2010). However, despite the importance of delimiting these areas to identify the economic effects and spatial scope of public policies appropriately, delimitation methods have not advanced at the same pace as the availability of new information and big data sources.

Since the 1950s, interactions between spatial units have been measured mainly using labor-commuting flows (Klove 1952) and analyzed with a wide variety of methodological procedures, such as cluster analysis (Tolbert and Killian 1987), threshold methods (Coombes et al. 1986), network-based methods (Kropp and Schwengler 2016), and, more recently, an evolutionary approach (Casado-Díaz et al. 2017), among others (Duranton 2015). Recently, however, some scholars have proposed delineating functional areas (usually metropolitan areas or cities) by using satellite images of nighttime luminosity (Bosker et al. 2018; CAF 2018; Dingel et al. 2018; Henderson et al. 2003; Vogel et al. 2018). Owing to their increasing availability, nighttime satellite images have been used for different purposes, including monitoring urban extension and expansion (Cheng et al. 2016; Goldblatt 2016), and estimating local economic activity (Donaldson and Storeygard 2016; Henderson et al. 2012).

In this context, we present a simple and easily replicable approach to identify functional territories in Chile, Colombia, and Mexico by combining the two most common empirical approaches to delineate functional areas: nighttime satellite images and clustering using labor-commuting data. Specifically, we propose a two-step method. The first step uses nighttime satellite images and applies a supervised procedure, common in the literature on remote sensing, to identify the boundaries of urbanized areas or urban continuums (conurbations), which can be contained within one or several different administrative areas. The administrative units that are overlapped by the same urbanized continuum are then aggregated into a single new area for the second step. The second step calculates a dissimilarity index using bidirectional labor-commuting flows between the areas resulting from the first step and then applies a standard clustering approach to delineate the definitive functional territories.

As a result of these two steps, the functional territories we identify include not only urban or metropolitan areas, but also, in general, administrative areas that share frequent economic and social interactions that are partly captured by the existence of common labor markets. Moreover, these functional territories include rural areas together with small urban areas that share strong connections, subjects that do not usually receive much attention in the literature, but that are important for rural economic development studies (Berdegué et al. 2015; Berdegué and Soloaga 2018; Soto et al. 2018).

We present the application of our method in Chile, Colombia, and Mexico, with data on labor commuting from censuses in 2002, 2005, and 2010, respectively. These three Latin American countries have similar economic conditions, but important geographic, demographic, and institutional differences, which influence the number and size of the functional territories obtained using our method. Our method differs considerably from standard approaches using only labor-commuting flows.

The rest of this paper is organized as follows. Section two reviews the literature on the delimitation of functional spatial units. Section three describes the empirical approach. Section four summarizes our main results. Section five summarizes what we consider the gains of our analysis. Finally, section six presents our concluding remarks.

2 Literature review

The literature on the definition and identification of functional spatial units in general acknowledges that the objective is to distinguish “…locational entities which have more interaction or connection with each other than with outside areas” (Brown and Holmes 1971, p 57). Based on the nature of the data and the purpose of the analysis, many empirical constructions refer to this type of spatial configuration, although it does not usually match administrative areas. Examples are as follows: commuting zones (Tolbert and Sizer 1996), functional regions (Berry 1968; Brown and Holmes 1971; Florida et al. 2008), functional economic areas (Jones 2016; Fox and Kumar 1965), functional urban areas (OECD 2002), labor market areas (Tolbert and Killian 1987; Tolbert and Sizer 1996), travel-to-work areas (Coombes and Openshaw 1982; Coombes et al. 1986), and functional territories (Berdegué et al. 2011, 2015; Fergusson et al. 2018), among others (Green 2007; Rozenfeld et al. 2011; Banai and Wakolbinger 2011).

The objective of maximizing interactions inside the defined area while minimizing interactions with outside areas ensures that the resulting spatial units are self-contained (Coombes et al. 1986). This aim is reinforced when a functional spatial unit fulfills the basic restrictions of coherence, partition, and contiguity (Casado-Díaz and Coombes 2011). Coherence states that the boundaries established for a functional spatial unit must be recognizable and correspond to a political or administrative configuration.Footnote 2 Partition indicates that a spatial unit must belong to only one functional area. Contiguity is necessary to avoid fragmented spatial units. These considerations may have important econometric implications when the subject of identification is place specific, such as the identification of agglomeration effects (Henderson et al. 2018), spatial sorting of workers in cities (Dingel et al. 2018), and market access effects (Bosker et al. 2018).Footnote 3

The distance decay nature of commuting flows has the advantage that greater flows are usually found between neighboring spatial units (Casado-Díaz et al. 2000; Simini et al. 2012), which frequently makes it possible to achieve self-containment without much supervision. This characteristic, together with the fact that commuting flows cover a broad socio-economic range of individuals and are informative of other social, cultural, and political linkages encompassed within the concept of functional territory (Berdegué et al. 2011), makes them the most common source of data for the delimitation of functional areas.

Notwithstanding the predominance of commuting flows to delineate functional spatial units, other types of data are also explored in the literature, such as land prices (Bode 2008), travel time and transportation networks (Weiss et al. 2018), mobile phone data (González et al. 2008), job applications (Manning and Petrongolo 2017), shopping data (Andersen 2002), gridded population census (Rozenfeld et al. 2011), global geographical population distribution estimatesFootnote 4 (Henderson et al. 2018), and satellite images (Bosker et al. 2018; Goldblatt et al. 2016; Imhoff et al. 1997; Small et al. 2005).

Although functional units are usually sensitive to the method and data used (Bosker et al. 2018; Rubiera-Morollón and Viñuela 2012), they have for many decades played an important role for scholars and policymakers in developed countries (Coombes and Openshaw 1982; Coombes et al. 1986; Klove 1952; OECD 2002; Tolbert and Killian 1987; Tolbert and Sizer 1996). By contrast, attempts to incorporate the concept of functional areas into political and research agendas are still incipient in developing countries, with some recent exceptions (Berdegué et al. 2011; Bosker et al. 2018; CAF 2018; Casado-Díaz et al. 2017; Dingel et al. 2018; Henderson et al. 2018). However, particularly in Latin America, most of these studies rely on outdated censuses, and therefore, the lack of data is a limiting factor even for identifying metropolitan areas or conurbations.

Besides labor-commuting flows, remote sensing data also allow researchers to define contiguous areas with a high degree of economic interaction or self-containment. The rapid expansion of cities and the increasing quality of images captured by satellites have allowed researchers of urban studies to obtain very precise estimations of: the extent of urban areas (Goldblatt et al. 2016), global high-resolution estimations of population (Dobson et al. 2000), accessibility to cities (Weiss et al. 2018), characterization of land use (Gao et al. 2017), and other uses (Ma et al. 2017). All these tools have been receiving increasing attention in the literature in applied economics (Burgess et al. 2012; CAF 2018; Costinot et al. 2016; Donaldson and Storeygard 2016; Henderson et al. 2017). However, the application of satellite images to construct functional spatial units is still in progress, with a few recent exceptions (Bosker et al. 2018; Dingel et al. 2018; Henderson et al. 2018; Vogel et al. 2018).Footnote 5 The present work contributes to this knowledge by offering a simple and flexible method applied simultaneously to three countries: Chile, Colombia, and Mexico.

3 Empirical approach

3.1 Data

This section describes the sources and data used to delimit functional territories. For each country, we use the latest census available with information on labor-commuting flows at the municipality level (2002 for Chile, 2005 for Colombia, and 2010 for Mexico), together with satellite nighttime luminosity images obtained from the Defense Meteorological Satellite Program Operational Linescan System (DMSP-OLS) of the United States Air Force.

3.1.1 Commuting flows

The commuting flows matrix at the municipality level considers 2446 municipalities in Mexico, 1124 in Colombia, and 346 in Chile. Commuting flows vary substantially among countries, as shown in Table 1. The average number of commuters in the municipality of origin and destination is larger in Chile (origin: 5285; destination: 5347) than in Mexico (origin: 3270; destination: 2840) and Colombia (origin: 1001; destination: 952). In addition, Chile has more dispersion in the number of commuters (origin: 13,963; destination: 23,190) than Mexico (origin: 14,092; destination: 15,623) and Colombia (origin: 5300; destination: 7143).Footnote 6 Meanwhile, in Mexico, on average, 16.1% of the workforce in the municipality of origin commutes to other municipalities, compared with about 6.7% in Chile and 5.3% in Colombia. These values show an important dispersion, especially in the case of Mexico. At the same time, commuting can be as large as 77.3% of a municipality’s workforce in Mexico, 63.8% in Chile, and 52.2% in Colombia.

Table 1 Statistics of commuting flows

Conversely, on average, 9.5% of the workforce of a municipality in Mexico comprises workers who commute from other municipalities. This share is 7.0% in both Colombia and Chile.

3.1.2 Night light data

We use the average visible, stable lights, and cloud-free coverage composite. This information comes from a satellite that follows a sun-synchronous orbit at an altitude of approximately 830 km and covers any point on Earth once or twice a day depending on the latitude.Footnote 7 The composite images measure the light intensity from sites with persistent lighting, such as cities, towns, or flares (the flares are cleaned afterwards), whereas ephemeral events, such as forest fires, are discarded (Lowe 2014).

Although the dataset is available yearly starting from 1992, we use only the 2013 composite image, as it was the latest available year at the time of undertaking the research. The stable satellite night light images are composed of pixels of approximately 1 km² each, every one with a light intensity value (digital number) that varies from 0 (when the pixel is basically unlit) to 63 (when the pixel is saturated by light, usually in dense and rich areas) (Henderson et al. 2012). Note that these night light data may overestimate urban boundaries and may be affected by different stages of economic development, climate, and geological differences when performing cross-country comparisons (Henderson et al. 2003).

3.2 Methodology

Our methodological approach consists of two main steps. We start with municipalities as basic units. The first step uses a municipality-polygon map that is overlapped with a geo-referenced nighttime luminosity image. We try different intensity thresholds (cut-offs) that result in light continuums of different sizes. When a light continuum extends over more than one municipality, we group these municipalities and redefine them as a new single spatial unit. Incoming and outgoing commuting flows are then recalculated considering that these municipalities now form a new single spatial unit.

The second step calculates a dissimilarity coefficient for each pair of spatial units, defined as one minus the ratio between the bidirectional commuting flow and the minimum labor force in the two areas. For each country, the commuting matrix contains commuting flows for all municipalities whose urban areas are fully contained within their boundaries, as well as for those new spatial units in which a light continuum overlaps two or more administrative areas following the first step. After computing the matrix, we apply a hierarchical clustering procedure (average linkage) using different thresholds for the dissimilarity coefficient. These two steps are explained in more detail below.

3.2.1 Identifying functional territories using night light data

For this step, we choose to use a supervised method because it is easier to replicate and less computationally demanding.Footnote 8 The use of supervised algorithms to identify urban areas has been widely applied in remote sensing (Goldblatt et al. 2018; Imhoff et al. 1997; Ma et al. 2017; Small et al. 2005), and within similar recent applications in economics (Ellis and Roberts 2015). The first step identifies the location and boundaries of urban settlements using stable satellite night light images. For all countries, the light threshold is selected according to the correspondence between night light satellite images and the urban areas observed through Google Earth imagery. This is described in Fig. 1 with an example of Mexico, showing two different cut-off thresholds of light intensity in the yellow and green lines, and the urban area covered by that lit area in each case.

Fig. 1 Different night lights intensity thresholds

Figure 1 describes how municipalities are merged in the first step. For this purpose, we overlap a map of political–administrative boundaries at the level of municipalities (red lines) and test different thresholds to detect, in each case, the remaining lit areas (grouped pixels) or light continuums that extend beyond these boundaries. We then merge the municipalities that contain a part of the same light continuum into a single functional area. In the second step, we consider this group of municipalities as a single spatial unit.

This procedure has several caveats. For example, although light continuums cover the whole areas of many municipalities, especially in the larger metropolitan areas, in many cases they overlap only a part of the municipal area. In these cases, one could argue that peripheral municipalities have a large rural part (unlit in the satellite images) that we might not want to aggregate, as doing so might lead to overestimating the size of cities. Nonetheless, municipal discretization might not be too harmful in our case, since the essence of our approach is not to measure the extent of urban areas, as is the focus of, for example, Henderson et al. (2003) and Vargas (2017), but rather to construct functional territories (i.e., urban and rural areas with higher levels of interaction relative to other areas).Footnote 9

An additional reason to carry out a municipal discretization lies in the fact that commuting data and many other economic and demographic statistics are collected only at municipal level (and apparently will be so for years to come), in most Latin American countries. Consequently, we must preserve municipalities as the basic units to be clustered into larger analytical regions, if we wish to implement the commuting clustering procedure in the second step, or if we aim for our resulting functional areas to have public policy implications.Footnote 10

As an alternative midpoint, we could restrict the merging of municipalities to cases in which the light continuum overlaps the centroid of the municipality’s polygon, as many coverage location models do (Alexandris and Giannikos 2010; Wei 2015). However, we choose not to do so, because in peripheral municipalities, the coordinates of the centroid seldom match or correlate precisely with the geometry of an urban continuum or with the most densely populated places in the municipality.

Having discussed the implications of municipal discretization, we return to Fig. 1, where contiguous lit areas are shown for light intensity thresholds of 35 and 50 (green and yellow lines, respectively). The figure shows that for a light threshold of 35, the urban area spreads to include the merged polygons of four municipalities: Leon, Silao, Guanajuato, and San Francisco del Rincón. With a light intensity threshold of 50, only the urban areas of Leon and Silao are merged into a functional area, whereas Guanajuato and San Francisco del Rincón remain alone. Once all metropolitan areas and conurbations have been identified for the whole country, larger areas are constructed by collapsing all municipalities that share a common lit area into a single spatial unit.
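The merging rule just described can be sketched as a connected-components problem: municipalities that touch the same lit area end up in the same spatial unit. The snippet below is only an illustration under stated assumptions. The function name, the lit-area identifiers (`lit_1`, etc.), and the input format (a mapping from each lit area to the municipalities it overlaps, presumably produced beforehand by a GIS overlay) are hypothetical and not part of the original procedure. The municipality names reproduce the Fig. 1 example.

```python
from collections import defaultdict

def merge_by_light_continuum(municipalities, overlaps):
    """Group municipalities that are overlapped by the same lit area (union-find)."""
    parent = {m: m for m in municipalities}

    def find(m):
        # Follow parent links to the representative, halving the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for members in overlaps.values():
        if not members:
            continue
        root = find(members[0])
        for other in members[1:]:
            parent[find(other)] = root  # join the two groups

    groups = defaultdict(list)
    for m in municipalities:
        groups[find(m)].append(m)
    return sorted(sorted(g) for g in groups.values())

municipalities = ["Leon", "Silao", "Guanajuato", "San Francisco del Rincon"]

# Hypothetical overlay results for the two thresholds discussed in the text.
overlaps_35 = {"lit_1": ["Leon", "Silao", "Guanajuato", "San Francisco del Rincon"]}
overlaps_50 = {
    "lit_1": ["Leon", "Silao"],
    "lit_2": ["Guanajuato"],
    "lit_3": ["San Francisco del Rincon"],
}

units_35 = merge_by_light_continuum(municipalities, overlaps_35)
units_50 = merge_by_light_continuum(municipalities, overlaps_50)
```

With the threshold of 35 all four municipalities collapse into one unit, whereas with 50 only Leon and Silao are merged.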

Occasionally, a single municipality may contain small portions of two or more lit areas. To deal with this issue, we compute the share of each lit area as a percentage of the total municipality-polygon area and allocate the municipality to the lit area with the largest share. Figure 2 illustrates this with the example of Irapuato, a Mexican municipality with two lit areas. The largest one comprises 15.8% of the total municipal area, whereas the other lit area represents only 1.2%. The same figure shows a case in which the largest lit area of a given municipality (in this example, Irapuato) crosses the boundary of one of its neighbors (in this case, Salamanca), but this overlap is unimportant when compared to that of the other lit area in the neighbor (0.03% versus 12.2%). In cases like these, the municipalities are not merged.
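As a rough illustration of this allocation rule, the sketch below assigns each municipality to the lit area that covers the largest share of its polygon and treats two municipalities as merged only if they are allocated to the same lit area. The percentages are those reported for Irapuato and Salamanca; the lit-area labels and function name are hypothetical, and whether Irapuato's smaller lit area coincides with any other one is not stated in the text, so it is given its own label here.

```python
def allocate_municipality(lit_shares):
    """Return the lit area covering the largest share of the municipal polygon."""
    return max(lit_shares, key=lit_shares.get)

# Share of each lit area in the total municipal area, in percent.
irapuato = {"lit_irapuato": 15.8, "lit_other": 1.2}
salamanca = {"lit_irapuato": 0.03, "lit_salamanca": 12.2}

irapuato_allocation = allocate_municipality(irapuato)
salamanca_allocation = allocate_municipality(salamanca)

# The two municipalities are merged only if both point to the same lit area.
merged = irapuato_allocation == salamanca_allocation
```

Here the 0.03% overlap is outweighed by Salamanca's own lit area, so the two municipalities stay separate.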

Fig. 2 Night lights and municipal boundaries

3.2.2 Identifying functional territories using census commuting data

After identifying the municipalities with single urban cores and those arranged into a single group of municipalities that share a light continuum, we follow Tolbert and Killian (1987) to compute a symmetrical dissimilarity matrix D, where each cell of the matrix is defined as \(D_{ij} = 1 - P_{ij}\), with

$$P_{ij} = \frac{f_{ij} + f_{ji}}{\min\left(f_{i}, f_{j}\right)},$$

where \(D_{ij}\) is the dissimilarity score between area \(i\) and area \(j\), and \(f_{ij}\) and \(f_{ji}\) represent the number of people who reside in \(i\) and work in \(j\), and the number of people who reside in \(j\) and work in \(i\), respectively (regardless of whether these two areas are single municipalities or groups of municipalities merged into a single spatial unit). The sum of these two flows in the numerator represents the total degree of interconnection between the two areas (the bidirectional commuting flow) rather than the directionality of the relation between them. The denominator, in turn, is specified as the minimum of the labor forces of the two areas \(i\) and \(j\) (\(f_{i}\) and \(f_{j}\)), because it highlights the interrelationships that involve small areas, which are often hidden when one uses the largest workforce or the sum of the workforces of the two areas.
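To make the formula concrete, the following sketch computes \(D_{ij} = 1 - P_{ij}\) for every pair of areas. All numbers are invented for illustration, and the data layout (a dictionary of directed flows and a dictionary of resident labor forces) is an assumption, not the format of the census files.

```python
from itertools import combinations

def dissimilarity_matrix(flows, labor_force):
    """Return {frozenset({i, j}): D_ij} with D_ij = 1 - (f_ij + f_ji) / min(f_i, f_j)."""
    D = {}
    for i, j in combinations(sorted(labor_force), 2):
        bidirectional = flows.get((i, j), 0) + flows.get((j, i), 0)
        P = bidirectional / min(labor_force[i], labor_force[j])
        D[frozenset((i, j))] = 1 - P  # symmetric by construction
    return D

# Hypothetical inputs: directed commuting flows and resident labor forces.
flows = {("A", "B"): 300, ("B", "A"): 100, ("A", "C"): 50}
labor_force = {"A": 10_000, "B": 2_000, "C": 5_000}

D = dissimilarity_matrix(flows, labor_force)
```

For the pair A and B, the 400 bidirectional commuters are divided by the smaller labor force (2,000), giving \(P = 0.2\) and \(D = 0.8\). Dividing by the larger labor force instead would give \(P = 0.04\), which shows how the minimum keeps links involving small areas visible.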

The dimensions of this dissimilarity matrix D depend on the number of municipalities grouped or paired following the light intensity parameters applied in the first step. We reduce the matrix dimensions from its original municipality-to-municipalityFootnote 11 form for the entire country by adding the commuting flows and the resident labor force of the municipalities that are merged in the first step into a single spatial unit. Thus, for instance, if municipality A and municipality B share a light continuum and make up a single area AB in the first step, the commuting flow from AB to municipality C is simply the sum of the commuters from A to C and the commuters from B to C. Likewise, AB’s resident labor force is the sum of the resident labor forces of A and B.
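The aggregation of A and B into AB can be sketched as follows. The flow and labor force values are invented, and the names are hypothetical. One point is an explicit assumption of this sketch rather than something stated in the text: flows between A and B are dropped once both belong to AB, since they no longer cross a unit boundary.

```python
from collections import defaultdict

def aggregate_units(flows, labor_force, unit_of):
    """Sum commuting flows and resident labor force over merged spatial units.

    flows: {(origin, destination): commuters} between municipalities.
    labor_force: {municipality: resident labor force}.
    unit_of: {municipality: spatial unit assigned in the first step}.
    """
    unit_flows = defaultdict(int)
    unit_labor = defaultdict(int)
    for municipality, workers in labor_force.items():
        unit_labor[unit_of[municipality]] += workers
    for (origin, destination), commuters in flows.items():
        o, d = unit_of[origin], unit_of[destination]
        if o != d:  # assumption: flows internal to a merged unit are discarded
            unit_flows[(o, d)] += commuters
    return dict(unit_flows), dict(unit_labor)

# Hypothetical example: A and B share a light continuum, C stands alone.
flows = {("A", "C"): 50, ("B", "C"): 30, ("A", "B"): 20, ("C", "A"): 10}
labor_force = {"A": 1_000, "B": 500, "C": 800}
unit_of = {"A": "AB", "B": "AB", "C": "C"}

unit_flows, unit_labor = aggregate_units(flows, labor_force, unit_of)
```

The flow from AB to C becomes 50 + 30 = 80 commuters, and AB's resident labor force becomes 1,000 + 500 = 1,500.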

After computing the dissimilarity matrix, we apply a hierarchical cluster procedureFootnote 12 to agglomerate all spatial units that share a considerable level of commuting flows into the resulting functional regions. In doing so, a challenge arises. According to Kropp and Schwengler (2016), although “…bidirectional commuting flows are the most suitable basis for the delineation of Labour Market Areas” (p 431), there is a lack of theoretical arguments to support the (otherwise arbitrary) choice of a threshold value to determine which spatial units merge or do not merge into functional areas.
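The role of the threshold in an average-linkage procedure can be illustrated with a deliberately naive implementation: clusters are merged pairwise, closest first, until the smallest average dissimilarity between any two clusters exceeds the cut-off. This is a sketch for intuition only, not the software used in the study. The dissimilarity values and the cut-off of 0.9 are invented, and merging at values equal to the threshold is an assumption. In practice a library routine, such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"`, would normally be preferred.

```python
from itertools import combinations

def average_linkage_clusters(units, dissimilarity, threshold):
    """Agglomerate units until the closest pair of clusters is farther apart than threshold.

    dissimilarity maps frozenset({a, b}) -> D_ab for every pair of units.
    """
    clusters = [frozenset([u]) for u in units]

    def avg_distance(c1, c2):
        # Average linkage: mean dissimilarity over all cross-cluster pairs.
        total = sum(dissimilarity[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (c1, c2), best = min(
            (((x, y), avg_distance(x, y)) for x, y in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break  # no remaining pair is similar enough to merge
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)

# Hypothetical dissimilarity scores between four spatial units.
pairs = {
    ("A", "B"): 0.80, ("A", "C"): 0.90, ("B", "C"): 0.85,
    ("A", "D"): 1.00, ("B", "D"): 1.00, ("C", "D"): 0.99,
}
dissimilarity = {frozenset(k): v for k, v in pairs.items()}

territories = average_linkage_clusters(["A", "B", "C", "D"], dissimilarity, 0.9)
```

In this toy case A and B merge first (0.80), C then joins them because its average dissimilarity to the pair is 0.875, and D remains alone because its average dissimilarity to the merged cluster is above the cut-off. A slightly lower cut-off of 0.85 would leave C on its own, which illustrates how sensitive the outcome is to the threshold.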

Just as the choice of a light intensity (digital number) threshold in the first step proves to be fundamental, picking the commuting (dissimilarity score) threshold is not inconsequential, either. As Duque et al. (2007) point out, statistical inference based on regions is strongly affected by aggregation problems, such as the ecological fallacy (Robinson 1950), the modifiable areal unit problem (MAUP) (Openshaw 1977), or spatial aggregation bias (Viñuela et al. 2014). These problems are known to have led researchers to wrong conclusions, particularly when the elements grouped within the same region are highly divergent and the distribution of their attributes is asymmetrical.

Note that the latter condition is particularly relevant in our case, because our regionalization method clusters initial areal units (municipalities) that share frequent interactions, but disregards whether these areas are homogeneous or similar regarding any demographic or economic feature. Although it may be argued that frequent interactions could induce homogeneity among the areas in the long run, the outcome of our method most likely consists of functional regions whose elements are essentially divergent in many ways. This very difference encourages the daily displacement of people who seek job opportunities, amenities, or services that are not available in the municipality where they reside. Considering this, the conclusions of any applied research based on our results are strongly determined by the way municipalities are arranged into functional regions, and this arrangement, in turn, depends on the commuting threshold we establish.

Against this background, to test the robustness of our results and reduce the risk of spatial aggregation bias, we use a wide range of threshold values to analyze marginal changes in the composition of the functional areas. Most importantly, we focus on the distribution of the dissimilarity coefficients to set threshold values following standard statistical significance levels, such as \({\text{threshold}} = \mu + 1.96\sigma\), where µ and σ represent the average and standard deviation, respectively (see Sect. 4).
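A data-driven threshold of this form is straightforward to compute from the observed scores. The helper below is a minimal sketch. The function name and the sample values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say whether the sample or population formula is applied.

```python
import statistics

def commuting_threshold(scores, z=1.96):
    """Return mu + z * sigma for a collection of pairwise scores."""
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # assumption: population standard deviation
    return mu + z * sigma

# Hypothetical pairwise scores, used only to show the call.
example_scores = [1, 2, 3, 4, 5]
example_threshold = commuting_threshold(example_scores)
```

Because the mean and standard deviation are computed separately for each country, the resulting cut-off adapts to country-specific commuting levels, unlike a fixed percentage.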

4 Results: the cases of Chile, Colombia, and Mexico

This section describes the most important results for each country. Figures 3, 4 and 5 show the functional territories identified in the three countries. The functional territories that group two or more municipalities are colored. Mexico has the highest number of functional territories determined in the empirical exercise (Fig. 3), owing to its high spatial fragmentation and population. There is a contiguity restriction, which means that in addition to commuting rates and light intensity thresholds, only municipalities that share a border are grouped. From a total of 2458 municipalities in Mexico, 1534 functional territories are formed, of which 356 group two or more municipalities. The functional territories that group the most municipalities are not necessarily the most populated areas. The functional territory grouping the most municipalities is the territory of Mexico (88 municipalities with 22,145,904 inhabitants), followed by Puebla (53 municipalities with 2,890,331 inhabitants), Oaxaca (27 municipalities with 648,081 inhabitants), Monterrey (21 municipalities with 4,158,719 inhabitants), Orizaba (14 municipalities with 451,259 inhabitants), Mérida (14 municipalities with 1,031,818 inhabitants), and Guadalajara (13 municipalities with 4,576,976 inhabitants) (see “Description of Territories” in Additional file 1 for a more detailed depiction of the functional territories in the most densely populated areas of each country).

Fig. 3 Functional territories in Mexico. Territories that group two or more municipalities are colored. The territory that groups the most municipalities is Mexico, with 88

Fig. 4 Functional territories in Colombia. Territories that group two or more municipalities are colored. The territory that groups the most municipalities is Bogotá, with 23

Fig. 5 Functional territories in Chile. Territories that group two or more municipalities are colored. The territory that groups the most municipalities is Santiago, with 48

Meanwhile, Colombia has an extremely complex topography, especially in the Andean region (where most of the population lives). Owing to these geographic conditions, but also to its political and institutional history (Acemoglu et al. 2012, 2013; Fergusson et al. 2017), the country has historically lacked a good transportation network, and connections between regions, cities, and the main ports on the Pacific and Atlantic coasts remain difficult. This explains the emergence of several cities that play a dominant regional role but that are relatively isolated from each other in what has been called a “System of islands” (Samad 2012).

Under these conditions, it is not surprising to find that most of the functional territories are made up of only one or a few municipalities, especially in areas far away from the main cities. These geographic conditions may well explain why we find some fragmented functional territories, where one or many of the municipalities that make up the region are not spatially contiguous. For instance, the travelling time (effective distance) from a municipality to a non-contiguous municipality can be lower than the travelling time to the municipality in between, thereby theoretically augmenting commuting flows, if the altitude gap between them is lower or if the road between them is in a better condition. Figure 4 shows the functional territories in Colombia. From a total of 1126 municipalities, 370 are grouped in territories of two or more municipalities. The largest one in the country is Bogotá (23 municipalities with 8,025,826 inhabitants), followed by Medellín (17 municipalities with 3,622,659 inhabitants), Barranquilla (14 municipalities with 1,974,143 inhabitants), Sogamoso (11 municipalities with 176,024 inhabitants), Cali (10 municipalities with 2,807,908 inhabitants), and Tunja (9 municipalities with 212,793 inhabitants), inter alia.

By contrast, Chile is a highly spatially concentrated country. The capital region of Santiago at the center of the country concentrates more than 40% of the total population in 2% of the national territory.Footnote 13 The urban system is shaped by approximately 200 urban areas and more than 36,000 dispersed localities with fewer than 3000 inhabitants at the latest available census of 2002. Urban primacy is high and only three cities have more than 300,000 inhabitants, namely, Santiago, Valparaiso, and Concepción, all of them concentrated in the dense center-south of the country. Meanwhile, there are isolated urban and rural areas in the north of Chile with a high concentration of mining activities. Large mining centers are also captured by night lights, which leads us to drop many lit areas in the north. Therefore, given this geography, we expect to find functional areas with greater spatial extension in the center of the country and near big cities, but also important fragmentation in the most remote areas in the extreme north and south of the country. Figure 5 displays the functional territories identified in Chile. From a total of 346 municipalities in Chile, 135 functional territories are created. The largest functional territory in terms of population and number of municipalities is Santiago (48 municipalities with 5,944,318 inhabitants), followed by Concepción (15 municipalities with 1,104,630 inhabitants), Valparaiso (10 municipalities with 889,999 inhabitants), and Rancagua (9 municipalities).

4.1 Results and robustness check

In this subsection, we show the sensitivity of the proposed method to alternative cut-offs for light intensity and commuting rates in each country. Table 2 reports this sensitivity analysis for Mexico, Colombia, and Chile in terms of: (a) the changes in the number of functional territories; (b) the percentage of municipalities grouped in functional territories; and (c) the percentage of population grouped in these functional territories.

Table 2 Sensitivity analysis

We present the results using 12, 22, 35, and 50,Footnote 14 as our reference digital number values. The last row of each panel in Table 2 shows the resulting functional territories without considering the night lights, that is, when functional territories are identified only by commuting rates. The different commuting rate thresholds are presented in the columns. Because geographic differences between countries lead to differences in average commuting rates, the commuting rate threshold chosen for each country is a similarity score of μ + 1.96σ. However, for comparison purposes, we also present the results with 1%, 2%, and 10% commuting rate thresholds.

Considering that the first step of the method is the union of municipalities sharing a lit area, the commuting rate impacts the results after the light intensity threshold has been chosen. Thus, when the light intensity threshold is high (low), a small (large) number of municipalities is grouped in the first step. Consequently, the more municipalities are grouped in the first step, the fewer are grouped in the second step, regardless of the commuting rate threshold. Furthermore, the higher the commuting rate threshold, the fewer municipalities are grouped in the second step, and therefore, there is greater spatial fragmentation (more functional territories with only one municipality).
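The order of the two steps can be sketched with a simple union-find structure. This is a deliberately simplified illustration, not the clustering procedure of this study: the second step here merges any pair whose commuting rate reaches the threshold in a single pass, and all names and data are hypothetical.

```python
def find(parent, x):
    """Return the representative of x's group (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the groups containing a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def two_step_grouping(municipalities, lit_areas, commuting, threshold):
    """Group municipalities by shared lit area, then by commuting rate."""
    parent = {m: m for m in municipalities}
    # Step 1: municipalities sharing a lit area are joined.
    for members in lit_areas:
        members = sorted(members)
        for other in members[1:]:
            union(parent, members[0], other)
    # Step 2 (simplified): join pairs whose commuting rate meets the threshold.
    for (a, b), rate in commuting.items():
        if rate >= threshold:
            union(parent, a, b)
    groups = {}
    for m in municipalities:
        groups.setdefault(find(parent, m), set()).add(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: one lit area spanning A and B, two commuting links.
municipalities = ["A", "B", "C", "D", "E"]
lit_areas = [{"A", "B"}]
commuting = {("B", "C"): 0.12, ("D", "E"): 0.03}
low = two_step_grouping(municipalities, lit_areas, commuting, threshold=0.02)
high = two_step_grouping(municipalities, lit_areas, commuting, threshold=0.10)
```

With the lower threshold the toy example yields two territories; with the higher threshold, D and E remain single-municipality territories, which mirrors the fragmentation described above.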

We illustrate these results for the case of Mexico (Table 2a). The number of functional territories increases considerably when a higher commuting rate threshold is set. With a low light intensity threshold of 12, the number of functional territories increases from 603 to 1507 when the commuting rate threshold rises from 1 to 10%. When no lights are used, the number of functional territories increases almost 2.8 times, from 738 with a commuting rate threshold of 1% to 2042 with a 10% threshold. This increase means that fewer municipalities are grouped with a commuting rate threshold of 10% (28% of municipalities are grouped into functional territories) than with a threshold of 1% (89% of municipalities are grouped into functional territories). Incorporating night light information into the procedure increases the proportion of municipalities that are grouped into functional territories: with a 10% commuting rate threshold, the proportion increases from 28% with no lights to 34% with a light intensity of 50, and to 48% with a light intensity of 12. However, the sensitivity of these results decreases when using a lower commuting rate threshold. The same pattern is observed for the population grouped into functional territories.

Figure 6 describes the sensitivity of the number of functional territories to different commuting rate and night light intensity thresholds. As stated in the preceding paragraph, it is clear for the three countries that when the commuting rate threshold decreases, the number of functional territories also decreases. In addition, the lower the light intensity threshold (i.e., the greater the lit area), the larger the number of functional territories. However, these differences are reduced significantly when a low commuting rate threshold is selected.

Fig. 6 Sensitivity analysis by dissimilarity thresholds and night light intensity

The vertical red line in each graph shows the commuting rate threshold chosen for each country for the exercise. At these values, the light intensity thresholds make a difference in the number of conurbated areas identified in the first step. In particular, the size of big agglomerations within countries remains stable.

To illustrate this point, Table 3 reports statistics on the commuting rate of conurbated areas by light intensity threshold and country. The term “conurbated” describes the municipalities that are grouped into a functional territory under each of the light intensity thresholds presented in the table, whereas the term “not conurbated” describes municipalities that are not grouped with any other at that stage of the method. There is an important difference in the commuting rate between conurbated and not conurbated areas for all the different commuting rate thresholds. This difference is greatest for Chile, which, with a light intensity of 12, has a higher commuting rate in conurbated areas (2.6%) than Colombia (2%) and Mexico (0.6%). Nevertheless, with a light intensity threshold equal to or greater than 22, Colombia has a higher commuting rate for conurbated areas (4.1%) than Chile (3.9%) and Mexico (1.3%). With the highest light intensity threshold (50), this difference in the commuting rate of conurbated areas increases: Colombia has a 9.1% average “intra-urban commuting rate” versus 4.4% for Chile and 2.1% for Mexico. Although Mexico reaches a maximum commuting rate inside the territory of 63.3%, which is greater than in Colombia and Chile, its average is lower than in the other two countries. In addition, the dispersion or variability in the commuting rate in relation to the mean inside the conurbated area is greatest in Mexico and lowest in Colombia (Table 3).

Table 3 Statistics of commuting rates by light intensity thresholds

Although our choice of an appropriate light threshold for the delimitation of conurbations seems to be an arbitrary decision, some alternative measures can inform this process. Following Kropp and Schwengler (2016), we add the modularity measure to the statistics discussed above to show whether or not the different thresholds of light lead to clusters whose links offer a better description of the functional relationships of the municipalities than the expected link values if the network were random (Rapoport 1957; Watts 2003). This measure varies between 0 and 1, where Q = 0 indicates that clustering is no better than random division, while Q = 1 indicates that clustering is better than a situation in which all are grouped into a single municipal region.
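To make the modularity comparison concrete, the sketch below computes the standard Newman modularity Q of a partition for an undirected weighted network. It is an illustration only: it assumes that commuting flows have been symmetrized into undirected edge weights, which may differ from the specification used in this study, and the function name and toy network are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q for an undirected weighted graph.

    edges: dict mapping (u, v) -> weight, each undirected pair listed once
    community: dict mapping node -> community label
    Uses Q = sum over communities c of [L_c / m - (d_c / (2 m))^2], where
    m is the total edge weight, L_c the weight inside c, and d_c the summed
    weighted degree of the nodes in c.
    """
    m = sum(edges.values())
    internal = {}
    degree = {}
    for (u, v), w in edges.items():
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0.0) + w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / m - (d / (2.0 * m)) ** 2
        for c, d in degree.items()
    )


# Hypothetical network: two triangles joined by one bridge (C, D).
edges = {
    ("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0,
    ("D", "E"): 1.0, ("E", "F"): 1.0, ("D", "F"): 1.0,
    ("C", "D"): 1.0,
}
split = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
single = {node: 0 for node in split}
q_split = modularity(edges, split)
q_single = modularity(edges, single)
```

In this toy network, separating the two triangles gives Q = 5/14 (about 0.36), whereas placing every node in a single cluster gives Q = 0 under this definition.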

Thus, modularity provides complementary evidence that the conurbations are approximated correctly (see Table 4). For example, in the case of Chile, the political–administrative grouping has a structure with Q = 0.59, but the grouping of those municipalities in conurbations increases the modularity to Q = 0.7. In the case of Colombia, the modular structure of administrative units seems to be a better option, and for Mexico the modularity increases with the different light thresholds, which is in contrast to the political–administrative grouping (Q = 0.807), reaching the highest value with intensity of 12. For this case, intensity of 50 offers a more stable approximation for the size of large agglomerations.Footnote 15

Table 4 Modularity of method’s first step

One of the validation criteria of a functional delimitation is reached in the first stage of the method. Internal perfection is achieved, since night light data capture the extension of large metropolitan areas with a high flow of commuting. The second step accounts for interactions with low-density municipalities not captured in the first step. This step is needed because these municipalities do not project light intense enough to be considered conurbated areas; they are instead urban agglomerations with a more local outreach.

5 What next?

Functional territories are becoming an increasingly popular tool for public policy design, as they capture interdependencies between municipalities and approximate the location of the resident population’s daily activities. The scope of functional territories addresses important border-transcending issues, such as urban sprawl, uncoordinated land-use planning, environmental sustainability, and the supply of specific public utilities (Foster 2001; Yuill et al. 2008; OECD 2013).

Other works highlight the importance of functional-area scope for public policy in developed countries. Defining functional regions as spatial units has been useful for calculating the demand for public utilities, such as road networks and community college infrastructure in Canada (Munro et al. 2011). Moreover, municipally fragmented land-use governance has led to sharper, uncontrolled, and unplanned urban expansion of UK metropolitan areas, suggesting that space for regional concentration would encourage coordinated land-use planning (Carruthers 2003). These regions work as productive and innovative clusters by internalizing economic spillovers. Development policies should be conducted on a regional basis and not be assessed as rural versus urban.

Specifically, in the countries studied, our method contributes in various ways to previous definitions of functional territories. An earlier definition of functional territories in Colombia is provided by Duranton (DNP 2015), and it currently plays a pivotal role in a broad set of policies, including productivity-enhancing, connectivity, and land-use policies. His approach delineates metropolitan areas by iteratively aggregating municipalities into clusters according to their labor-commuting flows. While Duranton’s system of cities groups only 113 (10%) municipalities,Footnote 16 our method results in 370 (33%) municipalities grouped in functional regions, many of which are rural or mid-size municipalities whose interactions, nonetheless, are intense enough to cluster them in pairs or in small functional regions of up to five elements (21.4%). In this way, our method would help policymakers to recognize not only the interactions that take place inside metropolitan areas or the system of cities, but also those that take place in smaller and more rural functional regions.

For the case of Chile, most studies and policies regarding functional areas rely on labor-commuting patterns from the population census of 2002, and a question to identify commuting was not included in the most recent census in 2017. Even recent innovative methods rely on labor-commuting flows from 2002 (Casado-Díaz et al. 2017). Therefore, the present study provides updated functional areas that are not only based on commuting flows, but that are also relevant for scholars and policymakers interested in both rural and urban functional territories. This is an increasing concern in the literature on agricultural economics (Berdegué et al. 2014b, 2015; Soto et al. 2018).

In the case of Mexico, we believe that our method brings substantial benefits compared with the method currently used by the national government to delineate functional territories. The government's method consists of delineating isochrones around main urban areas, the length of the isochrones being artificially drawn (1 h around metro areas, 40 min around intermediate cities, and 20 min around small urban centers) to cover the whole Mexican territory (Amador and Vergara 2016). Because this is a mechanical exercise, it loses the information on interactions between municipalities that our method captures.

The results of this investigation provide input for a broader research agenda that aims to understand how the evolution of rural–urban linkages in terms of (i) labor market diversification, (ii) agri-food systems, and (iii) urbanization patterns leads to economic growth in the defined functional territories (Berdegué et al. 2014a; Berdegué and Soloaga 2018; Fergusson et al. 2018; Soto et al. 2018). We consider the definition of functional regions to be a first step that allows us to develop a proper understanding of the role of these three dynamics, being well aware that their influence seldom matches the geography of political–administrative units.

6 Concluding remarks

Attempts to incorporate the concept of functional areas into political and research agendas are still incipient in developing countries. Most of these studies, particularly in Latin America, rely on outdated censuses, and therefore, the lack of data is a limiting factor even for the identification of metropolitan areas or conurbations. Consequently, this study proposes a novel approach for the delimitation of functional spatial units, or functional territories, using satellite imagery in conjunction with commuting data. The purpose of our method is to use night light data to identify urban agglomerations, which in turn allows us to group municipalities before clustering them by using commuting data. Our approach is not intended to measure the extent of urban areas, but rather to construct “functional territories,” which we define as spatial units with more economic interaction inside than outside the area. We describe our method using the cases of three developing countries, namely Mexico, Colombia, and Chile.

The resulting functional territories identified with our method can eventually be a valuable tool for public policies or for further research. As they capture interdependencies between municipalities and approximate the location of the resident population’s daily activities, they can be used as spatial units to calculate the demand for infrastructure for public utilities (Munro et al. 2011), to design land-use and housing policies with a larger regional scope that considers those interdependencies (Cheshire and Hilber 2008), or to foster productive and innovation clusters (Partridge and Olfert 2010).

Functional territories as we define them are highly susceptible to MAUP and other aggregation problems (Duque et al. 2007), since they do not cluster spatial units based on their homogeneity but rather on the intensity of their interactions. To test the robustness of our results and to diminish the risk of any spatial aggregation bias, we use a wide range of threshold values to analyze marginal changes in the composition of the functional areas. Most importantly, we focus on the distribution of the dissimilarity coefficients under different scenarios. For a more robust solution, we would need more disaggregated information, which would be useful for reducing spatial aggregation bias and analyzing interactions over space more meticulously.

Overall, regarding Mexico, we believe our method is far superior to that currently used by the Mexican government to delineate functional territories, which basically consists of delineating isochrones around main urban areas.

Regarding Colombia, a broad set of policies, including productivity-enhancing, connectivity, and land-use policies, often consider the metropolitan areas defined by Duranton, as described in Sect. 5 (DNP 2015). Considering this, our results may help to widen the scope of the research and help these policies to capture not only the interactions that take place inside metropolitan areas, but also those in smaller and more rural functional territories.

Finally, for the case of Chile, the lack of updated official census information on commuting flows presents a challenge for the delimitation of functional spatial units. The functional territories presented in this study provide a potential solution to this problem, which may have important implications for scholars and policymakers.