## Abstract

We take advantage of a new data set on Belgian cities to test random growth, that is, Gibrat’s law. This unique data set provides annual population estimates for all Belgian municipalities (2680 cities) from 1880 to 1970. The use of panel data methodology and unit root tests can provide a precise test of Gibrat’s law (a unit root is equivalent to random growth). We run both time series and panel data unit root tests, thus obtaining strong support for random growth in the long term. Results hold when allowing for the presence of one and two structural breaks in the mean, with the timing of the breaks coinciding with some major historical events, such as the World Wars and the economic crisis of 1929–1933.

### Similar content being viewed by others

Avoid common mistakes on your manuscript.

## 1 Introduction

Urban growth literature has a long tradition. Why some cities grow while others decline is still an open question, although several theoretical explanations have been proposed. These theories can be summarised into three main drivers of growth (Davis and Weinstein 2002): the existence of increasing returns to scale, the importance of locational fundamentals, and random growth.

Each of these drivers of urban growth involves different theoretical mechanisms. The existence of increasing returns suggests the presence of endogenous mechanisms in city growth that can lead to multiple equilibria (Davis and Weinstein 2002; Bosker et al. 2007), depending on the initial conditions. Locational fundamental theory highlights the role played by geographical characteristics: the presence of a natural harbour, a specific climate, or access to the sea, among many other physical characteristics, can determine cities’ populations (for instance, Ellison and Glaeser (1999) stated that natural advantages can explain at least half of the observed geographic concentration in the US). Finally, random urban growth postulates that population growth in cities is a random variable.

Studies testing the influence of increasing returns to scale and locational fundamentals have usually relied on parametric (cross-sectional or panel data) growth regressions, applying an instrumental variable approach in most cases. The latest advances in this literature have come from the use of plant‐level data (Holmes and Stevens 2002; Barrios et al. 2006) and case studies using an identification strategy of instruments that reveals the influence of some historical events on cities’ growth path (e.g., Bleakley and Lin 2012; Garcia-López et al. 2015). However, most of these studies have adopted a short-term perspective, and even panel data analyses have considered a few decades at most.

The approach taken in the random growth literature is different. First, from the theoretical point of view, random growth can only hold as a long-run average,^{Footnote 1} while the influence of other factors, like locational fundamentals and increasing returns, may change (or even disappear) over time. With random urban growth, the growth process of cities tends to be multiplicative and independent of their initial size, a proposition that has become known in urban economics as Gibrat’s law.^{Footnote 2} Several theoretical models (Gabaix 1999; Duranton 2007; Córdoba 2008) were developed to explain the fulfilment of Gibrat’s law in the context of external urban local effects and productive shocks, associating it directly with an equilibrium situation. Therefore, city-level variables can explain temporal variation in growth rates across cities, but random growth theory provides an appropriate explanation for the long-term growth.

Second, on the empirical side, although seminal contributions (e.g., Eaton and Eckstein 1997) have also used parametric growth regressions to test Gibrat’s law, since the 2000s, several studies have proposed alternative methodologies to parametric growth models. González-Val et al. (2014) reviewed this literature, concluding that most studies today use nonparametric estimates of urban growth or unit root tests. Nonparametric estimates of growth have become popular in this literature, providing estimates of growth that vary with the initial population over the entire distribution of city sizes. However, these kernel regressions estimate the unconditional relationship between growth and size; city and time fixed effects and any other control variables are omitted. Thus, authors have carried out nonparametric analyses for cross-sectional data (Eeckhout 2004) as well as for a pool of growth rates from different time periods (Ioannides and Overman 2003; González-Val 2010).

The use of the panel data methodology and unit root tests in the analysis of urban growth, first suggested by Clark and Stabler (1991), can provide a more precise test of Gibrat’s law. This idea was emphasized by Gabaix and Ioannides (2004, p. 2358), who expected “that the next generation of city evolution empirics could draw from the sophisticated econometric literature on unit roots.” Recently, new methods have been proposed to test random growth using unit root tests. Lalanne and Zumpe (2019, 2020) apply an integrated model selection/unit root test protocol with three different model specifications (pure random growth, random growth with drift, and random growth with drift and trend) to a large sample of high-quality French decennial census data, on how the rejection of the random growth hypothesis accounts for less than one third of tested cities.

However, some empirical limitations have reduced the spread of these techniques. While several papers applied panel data unit root tests to analyse urban growth (e.g., Black and Henderson 2003; Resende 2004; Henderson and Wang 2007; González-Val and Lanaspa 2016), the list of studies looking for unit roots in individual time series of cities’ populations is quite short. Why? Unit root tests need large sample sizes (at least 40 observations) to have reasonable power (Clark and Stabler 1991). However, long time series of year-by-year city populations are usually not available, and studies on the temporal evolution of city sizes have considered decennial census data in most cases. Therefore, the lack of *annual* data for a sample of cities over a long time period on a consistent basis has limited the use of unit root testing in empirical work.

To our knowledge, only five studies consider annual city populations to test Gibrat’s law using unit root tests: Clark and Stabler (1991), Resende (2004), Sharma (2003), Bosker et al. (2008) and Chen et al. (2013). Clark and Stabler (1991) used data on the seven largest cities in Canada from 1975 to 1984 (10 temporal observations by city), while Resende (2004) used a panel data set with annual data for 497 municipalities in the state of São Paulo (Brazil) for the 1980–2000 period. Sharma (2003) considered a sample of 100 Indian cities for the period 1901–1991 (90 years), Bosker et al. (2008) used a dataset of 62 West German cities from 1925 to 1999 (except for five missing years during the Second World War), and Chen et al. (2013) considered 210 Chinese cities from 1984 to 2006 (23 temporal observations).

Although the efforts of these authors to obtain annual city population data and exploit the properties of the unit root tests fully are worthy, these studies still show an important limitation: they focused on the largest cities. Nevertheless, some studies have confirmed the different patterns of growth of small cities (Partridge et al. 2008; Devadoss and Luckstead 2015) and, thus, the behaviour of the largest cities cannot be extrapolated to the whole distribution of cities. Another reason to consider all cities is that some studies found differences in growth patterns between different regions within the same country. Lalanne (2014) studied the hierarchical structure of the Canadian urban system, splitting the territory into two parts (east and west), thus allowing for the identification of different dynamics.

In this paper, we take advantage of Ronsse and Standaert’s (2017) new data set of Belgian cities. This unique data set provides annual estimates of the population for all Belgian municipalities from 1880 to 1970. Thus, it allows us to carry out a robust long-term analysis of urban growth because the time dimension is long (90 temporal observations by city) and, at the same time, it contains information for all cities, covering the whole city size distribution. Therefore, as far as we know, this is the most comprehensive test of Gibrat’s law using unit root tests ever conducted.

Moreover, the Belgian case is interesting because of some specific historical characteristics of the country. As a relatively young country on the European continent, the Belgian state came into existence following a liberal revolution in 1830. Set up as a parliamentary democracy, headed by a monarch with limited powers, the Belgian state quickly became a haven for political liberalism in 19th-century Europe. At the same time, however, the young nation wanted to ensure that it had the most up-to-date information about the population living within its borders. Following the newest scientific methods, the state apparatus created a highly developed statistical department, which, whilst not unique, was well ahead of its time. The continuous efforts of this department have led to a richness of statistical data spanning the entire history of the country.^{Footnote 3}

When studying the trends of population growth in this small state, surrounded by giants, it is also important to note that Belgium was the first industrialized country on the European continent. A combination of technological knowhow, labour, capital and readily available natural resources led to a quick-paced industrialization process, initially concentrated in the city of Ghent and in the southern part of the country (Caestecker 2015, pp. 101–103). Only after the Second World War did the industrial centre of gravity move to the north, with the arrival of many multinational companies that wanted to make use of the vicinity of sea harbours and the excellent road infrastructure of the region (Witte and Meynen 2006; Ryckewaert 2011).

Akin to this industrialization, but equally motived by political interests and goals, Belgium also became the centre of an expansive railway network from the late nineteenth century onwards. The construction of the railway had already begun in the 1830s, but the connections were initially limited to those between major urban and industrial hubs. From the 1880s onwards, however, there was a massive effort to expand the railway network to even the smallest towns. In a country of only 30,528 km^{2}, the main rail network grew from 3000 km in 1870 to almost 5000 km in 1910. Concurrently, a 3000 km network of light rail was established that “meander within the landscape, from village to village, collecting as many intermediary passengers and goods as possible” (De Block and Polasky 2011, p.318).

The high degree of connectivity resulting from this extensive network – as well as a system of cheap train tickets for workers – ensured a steady flow of labour from the still densely populated countryside to the industrial centres of the country, without the large-scale urbanization – and accompanying politicization of the labour force – that characterized the industrialization process in Belgium’s neighbouring countries (De Meulder et al. 1999). Before the end of the twentieth century, more than one in three Belgians commuted to and from work every day (De Decker 2011).

These factors affected the distribution of economic activity and population in Belgium during the course of the twentieth century, and set the grounds for a new long-term equilibrium of the distribution of population. However, as both industrialization and completion of transport infrastructures took place at the beginning of our sample period (in the late nineteenth and early twentieth centuries) that covers almost one century, we expect to capture not the short-term transitional periods of higher or lower growth, but rather the long-term growth that may follow a random growth pattern.

The remainder of the paper is organized as follows. In Sect. 2, we describe the data set used. Section 3 explains the different unit root tests carried out, distinguishing between time series analysis of unit roots allowing for structural breaks and panel data unit root tests, and presents the main results. Section 4 concludes.

## 2 Data

As far as we know, only a few studies have provided long-term estimates of annually and spatially disaggregated populations at the city level: Sharma (2003), Bosker et al. (2008) and González-Val and Silvestre (2020), for Indian (100 cities, 1901–1991), West German (62 cities, 1925–1999 period with some gaps) and Spanish cities (49 capital cities, 1900–2011), respectively.

In this paper, we use the new data set by Ronsse and Standaert (2017). They constructed a data set of 2680 Belgian municipalities for the period 1880–1970. Their dataset ends in 1970, as this is the last decade in which administrative borders remained relatively constant. Specifically, in 1970 and 1971, the number of municipalities first dropped by 200. This decrease was followed by the main administrative redrawing of the municipalities in 1976, which reduced the number of municipalities by three quarters. In all, the total number of municipalities dropped from 2675 in 1970 to a mere 596 six years later (Vrienlinck, 2000). Further redrawing of municipal maps followed in 1983 and 2019 and resulted in the current total of 581 municipalities, although this value is set to change again in 2025. Given the impact of these changes on the continuity of their dataset, the authors chose 1970 as the final year.^{Footnote 4}

To compose this data set, they combined the population census data, which are collected every ten years, with the yearly data on births, deaths and migration from the city population registers (the *movement* data). The less centralized nature of the latter, in combination with a tendency to underreport outward migration, means that these series cannot easily be combined without creating breaks in the data with every new census. This is illustrated quite well in Fig. 1, in which the dotted red lines show what the population estimate would be if we added the cumulative changes to the population to the last set of census data. The population in Brussels is overestimated by as many as a 100,000 people in 1900. At other times, especially in the later years of the data set, the quality of the population registers can be much higher.

To reconcile the two data series, Ronse and Standaert (2017) used a state-space approach, which models the *mouvement* data as a noisy signal of the true change in the population growth. As the *mouvement* is collected by each city individually, the noisiness of its signal is allowed to differ for each city. Furthermore, the model incorporates information on the changes to the administrative borders of the cities, allowing the population data to change more drastically in those years (see e.g. Brussels in 1921 in Fig. 1). The results of the models are also shown in Fig. 1, with the thick black lines showing the estimated population and the blue shaded area its 95% confidence interval. The estimated population data clearly follow the change indicated by the *mouvement* data as much as possible while remaining consistent with the census data.

As we noted, the period of analysis in the dataset was chosen so that the administrative borders remain consistent. Included in the sample, however, is the first wave of administrative changes of 1964, that reduced the number of municipalities from 2675 to 2586. In addition to these large-scale changes, information from Vrienlinck (2000) was used to identify smaller, ad-hoc changes to the administrative borders of municipalities. As a robustness check, cities with more than a 30% change in land area in the period considered were excluded from the analysis.

For illustrative purposes, Table 1 shows the number of cities and the descriptive statistics in census years. The minimum values of each decade indicate that we are considering all municipalities, without size restrictions, even the smallest units. Although their urban character might be debatable, Eeckhout (2004) suggested considering the whole distribution. The mean population grows over time and, at the same time, the number of cities remains stable, which points to an urbanization process. Moreover, although the standard deviation increases from 1880 to 1910, indicating increasing inequality in city sizes, it remains quite stable in the last decades, which suggests stability in the city size distribution.

Municipalities comprise the country’s total land area and, therefore, the entire population. During the period considered (almost a century), there was an important increase (71.2%) in the Belgian total population in our sample, from 5,517,017 in 1880 to 9,443,985. One might expect that such a huge increase also generated important changes in the city size distribution. Figure 2 shows the empirical density functions for three representative periods (1880, 1930 and 1970) estimated using adaptive kernels for our sample of all Belgian municipalities. We define the relative size of a city as the quotient between the city’s population and the contemporary average population. The graph shows the distribution of these cities’ relative sizes in log scale, with the zero value in the x-axis representing medium-sized cities.

The shape of the city size distribution changed dramatically from 1880 to 1970. In 1880, we can observe a very leptokurtic distribution with a great deal of density concentrated in the central values. However, by 1930, the distribution had lost kurtosis and the concentration had decreased. In 1970, we find a more uneven distribution with heavier tails than in previous periods, indicating the presence of more small and large cities than in earlier years. The temporal evolution of the city size distribution suggests a divergence pattern in the growth of Belgian cities, thus apparently discarding the idea random growth.

## 3 Methodology and results

### 3.1 Unit roots in city sizes

Our basic hypothesis for the long-term growth of Belgian cities is random growth (Gabaix and Ioannides 2004; González-Val et al. 2014). As mentioned above, random growth can hold as a long-run average, while the effect of other factors may change or dissipate over time. Traditional theoretical models of random growth (Champernowne 1953; Simon 1955) are based on stochastic growth processes and probabilistic models. Recent urban economic theories (Gabaix 1999; Duranton 2007; Córdoba 2008) include economic factors driving random shocks (e.g., external urban local effects or productive shocks), and these models are able to reproduce two empirical regularities that are well known in urban economics: Zipf’s and Gibrat’s laws (or the rank-size rule and the law of proportionate growth, respectively).

We follow the methodology proposed by Clark and Stabler (1991), who suggested that testing for random growth is equivalent to testing for the presence of a unit root. Starting from a simple autoregressive (AR) growth model, they assumed that the relationship between the size of a city in time period \(t\) and that in time period \(t - 1\) is

where \(S_{it}\) is the city share of city \(i\) defined as the quotient derived from dividing the city’s population (\(Pop_{it}\)) by the contemporary total population of the country, \(S_{it} = {{Pop_{it} } \mathord{\left/ {\vphantom {{Pop_{it} } {Country{\text{ pop}}_{t} }}} \right. \kern-0pt} {Country{\text{ pop}}_{t} }}\) because, from a long-term temporal perspective, it is necessary to use a relative measure of size (Gabaix and Ioannides 2004). \(\rho_{it}\) is the growth rate of city \(i\) over the period \(t - 1\) to period \(t\). This growth rate can be decomposed into two (Clark and Stabler 1991) or three components (Bosker et al. 2008): a random component, a non-stochastic component relating the current growth rate to a (possibly time-varying) constant and past growth rates, and the initial city size. Then, after some algebra, Clark and Stabler (1991) obtained the following expression:

where \(s_{it} = \ln \left( {S_{it} } \right)\) is the log-city share of city *i* in year *t*, \(\Delta s_{it} = s_{it} - s_{it - 1}\), \(c_{i}\) is a constant, \(\beta_{ij}\) is a parameter measuring the influence of past growth rates on current city growth and \(k\) is the number of lags added to ensure that the residuals, \(\varepsilon_{it}\), are Gaussian white noises. \(\Theta_{i}\) is the key parameter that captures the effect of the initial city size on growth. Random growth would imply \(\Theta_{i} = 0\), meaning that the growth of a particular city does not depend on its initial city size. Then, the city share would be a non-stationary time series, and any sudden shock would have permanent effects on the long-run level of the population of the city (Davis and Weinstein 2002). This shows that testing for random growth (Gibrat’s law) is equivalent to testing for a unit root in city sizes. Evidence supporting a unit root (if \(\Theta_{i}\) is not significant) means that city \(i\)’s growth rate is independent of the initial size. By contrast, when \(\Theta_{i} < 0\), the path of city \(i\) will be a stationary process (mean reversion). By using Eq. (2), Clark and Stabler (1991) apply the standard Dickey and Fuller (1979) t-statistic, failing to reject random growth for the seven largest cities in Canada from 1975 to 1984.

To test for the presence of unit roots, we run the Augmented Dickey–Fuller (ADF) test (Dickey and Fuller 1979, 1981).^{Footnote 5} The ADF test for non-trending data is carried out by running Eq. (2). Following Ng and Perron (1995), we choose the optimal* k* using a “general-to-specific procedure” based on the t-statistic. The null and alternative hypotheses are, respectively, \(H_{0} :\Theta_{i} = 0\), \(H_{A} :\Theta_{i} < 0\). If \(\Theta_{i}\) is found to be equal to 0, then the city share series follows a random walk and, on the other hand, if \(\Theta_{i}\) is found to be significantly smaller than 0, the city share is stationary around \(c_{i}\).

Table 2 (second column) reports a summary of the results of the individual city unit root tests. We find that the null hypothesis of a unit root in the city share is not rejected for most of the cities in the sample. In particular, for 207 of the 2680 Belgian cities (8%), the unit root is rejected at the 10% level, thus strongly supporting the random growth hypothesis. However, a possible concern with these results is that the overall non-rejection of the unit root hypothesis may be because the standard ADF tests are biased (Perron 1989). It is possible that what we identified as a unit root process could be better modelled as a stationary process around highly permanent shocks, especially when such a long period is considered. In their study of German cities, Bosker et al. (2008) addressed this issue by allowing for the presence of a one-time structural break when testing for unit roots, using the unit root test suggested by Perron and Vogelsang (1992). Here, we follow the same approach.

Following Bosker et al. (2008), we estimate additive outlier (AO) models, allowing for a sudden change in mean (crash model). The AO model is appropriate when the change is assumed to take effect instantaneously (for instance, because of warfare destruction). This model is estimated by way of a two-step procedure. The first step removes the deterministic part of the series by estimating the regression

where \(DU_{t} = 0\) if \(t \le TB\)(the break date) and 1 otherwise. The resulting residuals are then tested for the presence of a unit root by estimating

where \(\eta_{it}\) is the estimated residual from Eq. (3), TB is the break date and \(DTB_{it} = 1\) if \(t = TB + 1\) and 0 otherwise. Both equations are estimated using OLS for each break year \(TB = k + 2,...,T - 1\), with *T* being the number of observations and *k* being the truncation lag parameter. The null hypothesis of a unit root is rejected if the t-statistic for \(\rho\) is significant. In this case, the city share will be a stationary time series around a structural break. All but one shock (the break) would cause temporary movements of the city’s share. By contrast, if the t-statistic for \(\rho\) is not significant, the city share will be a non-stationary time series and any sudden shock will have permanent effects on the long-run level of the city share.

The results of applying the AO-model to test for a unit root in city shares under the null of a unit root versus stationarity around a possibly shifting mean under the alternative are also summarized in Table 2 (third column). Although the percentages of rejection of a unit root increase slightly, the results do not vary substantially from those of the ADF test. At the 10% level, the unit root null hypothesis cannot be rejected in favour of a stationary city share with a one-time break for roughly 85% of the cities (2274 of 2680 cities). Moreover, the breaks are significant in almost all cases; only for 51 cities is the break not significant. Therefore, evidence supporting random growth persists.

The previous analysis only captures the single most significant break in each city share series. However, since the period considered is quite long (almost one century) and variables rarely show just one break, we also attempt to determine whether the city share series show a double change in the mean. We use the test developed by Clemente et al. (1998), who based their approach on Perron and Vogelsang (1992), but allowing for two breaks. Formally, (3) and (4) change to:

and

where \(DU_{jt} = 1\) if \(t > TB_{j}\) \(\left( {j = 1,2} \right)\) and 0 otherwise. \(DTB_{ijt}\) sets equal 1 if \(t = TB_{j} + 1\) and 0 otherwise \(\left( {j = 1,2} \right)\). \(TB_{1}\) and \(TB_{2}\) are the time periods when the mean is being modified. Like Clemente et al. (1998), we suppose that \(TB_{j} = \lambda_{j} T\) \(\left( {j = 1,2} \right)\), with \(0 < \lambda_{j} < 1\), which implies that the test is not defined at the limits of the sample, and that \(\lambda_{2} > \lambda_{1}\), which eliminates those cases in which breaks occur in consecutive periods. To test for the unit root null hypothesis, Eq. (5) is first estimated using OLS to remove the deterministic part of the variable, and then the test is carried out by searching for the minimal pseudo t-ratio for the \(\rho = 1\) hypothesis in Eq. (6) for all the break time combinations. The null hypothesis of a unit root is rejected if the t-statistic for \(\rho\) is significant. In this case, the city share will be a stationary time series around two structural breaks. Most shocks would cause temporary movements of the city’s share, but two shocks (the breaks) would cause permanent effects. Unlike the situation in which the t-statistic for \(\rho\) is not significantly different from zero, the city share will be a non-stationary time series and any sudden shock will have permanent effects on the long-run level of the city’s share of the population.

We would expect that allowing for the possibility of two endogenous break points could provide further evidence against the unit root hypothesis (Lumsdaine and Papell, 1997; Ben-David et al. 2003), especially because both breaks are significant in most of the cases: only in 45 and 49 cases (of 2680) are the first and the second break, respectively, not significant. However, the percentage of unit roots rejected at the 10% level is lower than that for the one-break test (see Table 2, fourth column). Furthermore, at the 5% and 10% significance levels, the percentage of rejection is even lower than that of the ADF test (no-break scenario). There is no clear pattern in the size of the cities for which the unit root is rejected; their mean population is slightly above the average size for all cities from 1880 to 1890, but it slowly decreases over time, and, over the whole period, these city sizes are on average only 12% below the mean population of all cities. Therefore, on average, these cities tend to be slightly below the mean city size but certainly are not the smallest units in our sample.

Regarding the dates of the breaks, Fig. 3 displays the distribution of the breaks’ timing. The y-axis shows the frequency (i.e., the total number of significant breaks detected) by year (the x-axis). It shows both the one- and two-break cases; only significant breaks are included in the graph. Across the one- and two-break cases, three distinct major events stand out: the First World War, the economic crisis of 1929–1933 and the Second World War. Most distinctly present in the one-break case, the Great Depression of the early 1930s, following the crash of the American stock exchange in 1929, undoubtedly carries with it a lot of explanatory weight (Caestecker 2015, pp. 133–134). As unemployment soared, so did the mobility of those people who were looking for alternative ways to make a living.

A decade earlier, the impact of the First World War is clearly visible in both the one-break and the two-break first-break cases. In Belgium, the onset of the war caused an enormous stream of refugees to flee their home towns and settle in other places, both within the Belgian territory and abroad. Some 1.5 million Belgian refugees left the country to settle temporarily in the Netherlands, France and Great Britain (Amara 2008). During and in the aftermath of the war, there was an enormous drop in the Belgian population, caused by a fall in the number of births and a peak in the number of deaths due to both the war and the Spanish flu that directly followed it (Caestecker and Vanhaute 2011, pp. 94–96). That the break in population growth is only visible in the aftermath of the war is probably related to the fact that, during the war, administrative services were severely disrupted.

Even though there was no internal displacement on such a large scale during the Second World War, the onset and aftermath of the war present a clear break in population growth, visible both in the one-break cases and in the two-break cases’ second break. The onset of the war, however, did cause some people to flee abroad, and, just before the outbreak, many thousands of refugees from Nazi Germany and the occupied territories arrived in Belgium, looking for shelter (Debruyne 2007, pp. 75–76). In the aftermath of the war, as Belgian refugees returned home, thousands of displaced persons found themselves on Belgian territory. Most of them were former prisoners of war, brought to Belgium by the Nazi occupants to perform forced labour (Luyckx 2010). The fast economic recovery of Belgium in the immediate aftermath of the war might also have led to internal migration towards the revived industrial centres, where plenty of jobs were to be found (Witte and Meynen 2006, pp. 35–36).

Two more important moments of change can only be deduced in a meaningful way from the two-break case. In the two-break case, for the first break, we can see a clear break around the turn of the nineteenth century. The last two decades of that century heralded a period of economic crisis, resulting in growing unemployment and poverty. In these turbulent times, thousands of people moved to another town, where they hoped to find a more secure income. Thousands of others wandered around the country, looking for work. Finally, Belgium also experienced a demographic boom at the end of the nineteenth century, due to a spectacular rise in life expectancy (Caestecker and Vanhaute 2011, pp. 89–94). This certainly forms part of the explanation for the sudden break in population growth that we can discern at the *fin de siècle*.

The final break in growth, which we can only meaningfully discern in the two-break case, as the 2nd break, is noticeable around the year 1955. For this change in the pattern, we can find no immediate explanation. The 1950s were a period of slow economic growth in Belgium, characterized by relatively high unemployment. Compared with the following years, the Golden Sixties, the mid-1950s can be seen as a last moment of high mobility before a long decade of relative stability, connected to a boom in economic growth and a historically low unemployment rate (Witte and Meynen 2006).

Finally, we run several robustness checks. First, we consider a sub-sample of cities. The geographic boundaries of cities could change over such a long time period and, thus, in some cases, the city growth or the structural break may only be reflecting the change in city boundaries. We have information about administrative changes to the boundaries of Belgian cities and the size of the municipalities from 1865 to 1970 (every 10 or 15 years). Using this historical information, we can calculate the change in land area by city and, following Glaeser and Shapiro (2003), we can set a threshold for the change in land area and exclude all cities above it. In particular, we exclude all cities with more than a 30% change (positive or negative) in land area, thus reducing the sample size to 2476 cities.^{Footnote 6} This correction eliminates extreme cases in which the city in 1880 is very different from the city in 1970. Then, we rerun all the unit root tests; the results are reported in Table 3. The first conclusion is that the results are quite similar to those shown in Table 2, which indicates that the issue of changing in boundaries was not driving our main results. The second implication of these results is that the support for a unit root in city shares is even stronger when we exclude changing boundary units; although the percentages of unit roots rejected for the no-break scenario test are the same as in Table 2, when one or two breaks are allowed, these percentages decrease slightly to very low figures. For instance, only in 1% of the cases is the unit root rejected at the 5% confidence level under the two-break specification.

Second, we re-run the time series analysis considering the generalised supremum ADF (GSADF) test statistic for explosive behaviour proposed by Phillips et al. (2011, 2015). This recursive test does not assume exogenous structural breaks. For all the cities, the percentage of unit roots rejected at the 5% level increases up to the 36% (976 cities out of 2680). Nevertheless, the main result holds, as we cannot reject a unit root for most cities (64%).

### 3.2 Panel unit root test

Most theories proposed for the underlying mechanisms that govern the city size distribution are dynamic, and they make predictions of how particular cities will behave in a panel. To take full advantage of the panel dimension of our data set, we also test for a unit root in a panel.

Some authors have also tested for the presence of a unit root using growth equations and panel data (Black and Henderson 2003; Resende 2004; Henderson and Wang 2007; Chen et al. 2013). Nevertheless, this approach has some problems and limitations (Gabaix and Ioannides 2004; Bosker et al. 2008). As Chen et al. (2013) point out, panel unit root tests with short temporal dimension can suffer from low power. Although our time dimension is higher than that in previous studies (91 temporal observations for most cities), it may be still a low number, in contrast to the number of cities in the panel.^{Footnote 7}

Our annual data overcome one of the common limitations in this literature, namely the use of census data and decade-by-decade city populations, but an econometric issue persists: the presence of cross-sectional dependence across the cities in the panel can give rise to estimations that are not very robust (González-Val and Lanaspa 2016). Cross-sectional dependence implies that the cities are interdependent. The causes of cross-sectional dependence in the errors can be the presence of common shocks and unobserved components that ultimately become part of the error term, spatial dependence and idiosyncratic pair-wise dependence in the disturbances with no particular pattern of common components or spatial dependence (Baltagi et al. 2007). The econometric literature has established that the panel unit root and stationarity tests that do not explicitly allow for this feature among individuals present size distortions that can lead to misleading inference (Banerjee et al. 2005). Moreover, Baltagi et al. (2007) examined the performance of several panel unit root tests under spatial dependence, and found that tests assuming cross-section independence perform better.

Therefore, following González-Val and Lanaspa (2016), among all tests available in the literature especially developed to deal with this issues (Breitung and Das 2005; Breitung and Pesaran 2008; Gengenbach et al. 2010), we use Pesaran’s (2007) test for unit roots in heterogeneous panels with cross-sectional dependence.^{Footnote 8} The test of the unit root hypothesis is based on the t-ratio of the OLS estimate of \(b_{i}\) in the following cross-sectional augmented Dickey–Fuller (denoted by CADF) regression:

where \(s_{it}\) is again the log-city share of city *i* in year *t*, \(a_{i}\) is the individual city-specific average growth rate and \(\overline{s}_{t}\) is the cross-section mean of \(s_{it}\), \(\overline{s}_{t} = N^{ - 1} \sum\nolimits_{j = 1}^{N} {s_{jt} }\). To eliminate cross-dependence, standard Dickey–Fuller (or augmented Dickey–Fuller) regressions are augmented with the cross-section averages of lagged levels and the first differences of the individual series, such that the influence of the unobservable common factor is asymptotically filtered. The null hypothesis assumes that all series are non-stationary, and Pesaran’s CADF is consistent under the alternative that only a fraction of the series is stationary. Nevertheless, the test does not allow for structural breaks in the series.

Unfortunately, due to computational limitations, we cannot run the test using the full sample of cities; we must restrict our analysis to a sub-sample of the largest cities considering groups from the 50 to the 500 largest cities. Nevertheless, this sample of largest cities represents around two-thirds of the total country population in all years. Black and Henderson (2003) and González-Val and Lanaspa (2016) highlighted one potential issue of sample selection in a dynamic framework: if the sample of cities is defined according to the largest units in the latest year, the analysis may be biased because these are the “winning” cities, namely those that have presented the highest growth rates over time. To deal with this potential problem, we define the groups of cities using two samples including the largest cities in the first (1880) and last (1970) years. Although the latter group may only include winning cities, the former group can also include information about cities that were important in 1880 but declined over time.

Table 4 shows the results of applying Pesaran's panel unit root test.^{Footnote 9} Panel A reports the results for the sample of largest cities in 1880. We find that the null hypothesis of a unit root is not rejected in most of the cases. The unit root is only rejected in two cases, for the groups including the 50 and 100 largest cities in 1880, when three lags and a trend are added. Panel B shows the results using a sample of the largest cities in 1970. In this case, the unit root hypothesis is not rejected in any case, even at the 10% significance level. Overall, both panels present strong evidence supporting a unit root in these samples of large cities, in line with the results obtained using time series analysis in the previous section. As a robustness check, we re-estimated the test using 500 random samples of cities. For each sample, we draw 500 cities without replacement from the full sample of 2680 Belgian cities regardless of their size. We could not reject the null hypothesis of a unit root in any case for any model specification. Again, these results (not shown) strongly support random growth.^{Footnote 10}

Finally, in Table 5 we report the results from the Pesaran’s (2004) test for cross-sectional dependence (CD) for the same subsamples considered in Table 4. Results using the Frees’ (1995) test (not shown) are similar. The null hypothesis of cross-sectional independence is rejected in all cases, confirming the need for using a panel unit root test controlling for cross-sectional dependence.

## 4 Conclusions

We examine urban growth in Belgium from 1880 to 1970 using unit root tests to check random growth in the long term. Our results add to the scarce literature on unit root testing in city sizes (Clark and Stabler 1991; Sharma 2003; Bosker et al. 2008), and, to our knowledge, this is the most comprehensive test of Gibrat’s law using unit root tests ever carried out as we consider annual data for all municipalities.

Using both time series and panel data unit root tests, we obtain strong validation of the random growth hypothesis, that is, Gibrat’s law, which implies that urban growth is independent of the initial city size. This evidence supports a multiplicative growth process of cities in Belgium, and this kind of growth is consistent with many theoretical urban economics models (Gabaix 1999; Eeckhout 2004; Duranton 2007; Córdoba 2008). Nevertheless, even if city shares follow a unit root, this growth process is compatible with a degree of convergence in the evolution of city growth rates; that is, with some kind of mean-reverting component (Gabaix and Ioannides 2004).

The long-term pattern of random growth does not imply that the city size distribution has remained static over the years. On the contrary, a unit root implies that all shocks have had permanent effects on the city share, and, in particular, when allowing for structural breaks, we find that exogenous historical shocks had a permanent effect on city shares: the timing of the structural breaks coincides with some major historical events, such as the World Wars and the economic crisis of 1929–1933.

The alternative explanations for random growth considered in the literature are, basically, locational fundamental theories and increasing returns to scale (Davis and Weinstein 2002). Although our results are not specifically a test of random growth versus locational fundamentals or random growth versus increasing returns to scale, the strong support obtained for random growth clearly cast some doubts on the relevance of the two alternative theories in the Belgian case in the long-term.

This evidence is consistent with previous research for other countries. To give some examples, González-Val et al. (2014) found that random growth (i.e., Gibrat’s law) holds for most cities in the US, Spain, and Italy, and Chauvin et al. (2017) concluded that Brazil and the US both appear to adhere broadly to Gibrat’s law, but China and India do not. As random growth is a long-term pattern, it involves that the city size distribution reaches a dynamic steady state. For this reason, Chauvin et al. (2017) explained that Gibrat’s law holds in Brazil and the US because both are moderately sized places, which have long been largely urban, while China and India are much larger, and many of their cities are newer. In terms of our study using Belgian cities, it is natural that random growth emerges as the pattern of growth in the long-term, considering similar arguments. First, Belgium is a small-sized country. Second, as explained in the Introduction, both the industrialization and the completion of transport infrastructures took place at the beginning of the sample period (in the late nineteenth and early twentieth centuries), implying that urbanization of the country began in the early twentieth century. Third, as in many European countries, most Belgian cities date from ancient times, so the urban system is consolidated.

However, our results raise new research questions. Our analysis captures the long-term pattern of growth, but omits the dynamics towards steady state. In the short-term, agglomeration economics and locational fundamentals may have been important in the transition to spatial equilibrium. In Chauvin et al. (2017) aspects such as mobility, rent-earnings relationships or skill-related factors can influence the steady state configuration. Further study on these economic factors could help to improve our knowledge about urbanization in Belgium and shed some light on the population dynamics in countries currently experiencing the urbanization process.

## Notes

Quoting Gabaix and Ioannides (2004, p. 2353), “the casual impression of the authors is that in some decades, large cities grow faster than small cities, but in other decades, small cities grow faster.”

Formally,“Gibrat’s Law states that the growth rate of an economic entity (firm, mutual fund, city) of size S has a distribution function with mean and variance that are independent of S” (Gabaix and Ioannides 2004, p. 2346).

LOKSTAT, accessed on 23 September 2020 via https://lokstat.ugent.be/lokstat_over_doelstelling.php.

The unit of analysis of this database are municipalities as defined by their administrative borders. Amalgamating these into metropolitan areas is difficult for many reasons, the main problem being that the entirety of Belgium is described as one big metropolitan area. As noted in Van Meeteren et al. (2016, p. 3) the Flemish region is often described as a ‘nebular city’ and Belgium as an “hourglass-shaped metropolitan area whose (Walloon) base is less developed than its (Flemish) roof.”

Lalanne and Zumpe (2019) propose a new test-protocol for testing three different random growth processes (pure random growth, random growth with drift, and random growth with drift and trend) also based on the ADF statistic.

Alternative values of the threshold yield similar results, as the change in land area in most cities is negligible: for 82% of our Belgian cities, the change in land area is lower than 1%, for 86% the change is lower than 5%, and for 89% the change in land area is lower than 10%.

Some authors (Lalanne and Zumpe 2019) argue that panel unit root testing is far from being a panacea for the low test power problem. Indeed, as shown by Karlsson and Löthgren (2000), what most contributes to the increase of test power is the extension of the panel’s temporal dimension. Consequently, the only extension of the cross-sectional dimension, as commonly practised in the urban growth literature, is not a viable solution.

Baltagi et al. (2007) concluded that Pesaran and Phillips–Sul tests seem to be the least affected by spatial dependence among units.

The test is calculated using the ‘pescadf’ Stata package.

These results are available from the authors upon request.

## References

Amara M (2008) Des Belges à l’épreuve de l’exil: les réfugiés de la Première Guerre mondiale: France, Grande-Bretagne, Pays-Bas, 1914–1918. Editions de l’Université de Bruxelles, Bruxelles

Baltagi BH, Bresson G, Pirotte A (2007) Panel unit root tests and spatial dependence. J Appl Economet 22:339–360

Banerjee A, Massimiliano M, Osbat C (2005) Testing for PPP: should we use panel methods? Empirical Economics 30:77–91

Barrios S, Bertinelli L, Strobl E (2006) Geographic concentration and establishment scale: an extension using panel data. J Reg Sci 46(4):733–746

Ben-David D, Lumsdaine R, Papell DH (2003) Unit root, post-war slowdowns and long-run growth: evidence from two structural breaks. Empir Econ 28(2):303–319

Black D, Henderson V (2003) Urban evolution in the USA. J Econ Geogr 3(4):343–372

Bleakley H, Lin J (2012) Portage and path dependence. Q J Econ 127(2):587–644

De Block G, Polasky J (2011) Light railways and the rural–urban continuum: technology, space and society in late nineteenth-century Belgium. J Hist Geogr 37(3):312–328

Bosker EM, Brakman S, Garretsen H, Schramm M (2007) Looking for multiple equilibria when geography matters: German city growth and the WWII shock. J Urban Econ 61:152–169

Bosker EM, Brakman S, Garretsen H, Schramm M (2008) A century of shocks: the evolution of the German city size distribution 1925–1999. Reg Sci Urban Econ 38:330–347

Breitung J, Das S (2005) Panel unit root tests under cross-sectional dependence. Stat Neerl 59:414–433

Breitung J, Pesaran MH (2008) Unit roots and cointegration in panels. In: Matyas L, Sevestre P (eds) The Econometrics of Panel Data: Fundamentals and Recent Developments in Theory and Practice, 3rd edn. Springer Publishers, Berlin, pp 279–322

Caestecker F (2015) Hoe de mens de wereld vorm gaf. Academia Press, Gent

Caestecker F, Vanhaute E (2011) Leven en werken in het industriële België. In: Grauwels A et al (eds) Hedendaagse economische geschiedenis van België: een inleiding. Academia Press, Gent

Champernowne D (1953) A model of income distribution. Econ J LXIII:318–351

Chauvin JP, Glaeser E, Ma Y, Tobio K (2017) What is different about urbanization in rich and poor countries? Cities in Brazil, China, India and the United States. J Urban Econ 98:17–49

Chen Z, Fu S, Zhang D (2013) Searching for the parallel growth of cities in China. Urban Stud 50(10):2118–2135

Clark JS, Stabler JC (1991) Gibrat’s law and the growth of Canadian Cities. Urban Stud 28(4):635–639

Clemente J, Montañés A, Reyes M (1998) Testing for a unit root in variables with a double change in the mean. Econ Lett 59:175–182

Córdoba JC (2008) A generalized Gibrat’s Law for Cities. Int Econ Rev 49(4):1463–1468

Davis DR, Weinstein DE (2002) Bones, bombs, and break points: the geography of economic activity. Am Econ Rev 92(5):1269–1289

Debruyne E (2007) Gedoogbeleid in al zijn gedaanten Joodse vluchtelingen en België (januari 1933-september 1939). In: Van Doorslaer, R. (eds.),

*Gewillig België Overheid en Jodenvervolging in België tijdens de Tweede Wereldoorlog*. Brussel: SOMADe Decker P (2011) Understanding housing sprawl: the case of Flanders. Belgium Environment and Planning A 43(7):1634–1654

De Meulder B et al (1999) Patching up the Belgian urban landscape. Oase 52:78–113

Devadoss S, Luckstead J (2015) Growth process of U.S. small cities. Econ Lett 135:12–14

Dickey DA, Fuller WA (1979) Distributions of the estimators for autoregressive time series with a unit root. J Am Stat Assoc 74(366):427–481

Dickey DA, Fuller WA (1981) Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 49(4):1057–1072

Duranton G (2007) Urban evolutions: the fast, the slow, and the still. American Economic Review 97(1):197–221

Eaton J, Eckstein Z (1997) Cities and growth: theory and evidence from France and Japan. Reg Sci Urban Econ 27(4–5):443–474

Eeckhout J (2004) Gibrat’s Law for (All) Cities. Am Econ Rev Am Econ Assoc 94(5):1429–1451

Ellison G, Glaeser EL (1999) The geographic concentration of industry: does natural advantage explain agglomeration? Am Econ Rev Papers Proc 89(2):311–316

Frees EW (1995) Assessing cross-sectional correlation in panel data. J Econom 69:393–414

Gabaix X (1999) Zipf’s law for cities: an explanation. Quart J Econ 114(3):739–767

Gabaix X, Ioannides YM (2004) The evolution of city size distributions. In: Thisse JF, Henderson JV (eds) Handbook of urban and regional economics. Elsevier Science, Amsterdam, pp 2341–2378

Garcia-López M-A, Holl A, Viladecans-Marsal E (2015) Suburbanization and highways in Spain when the Romans and the Bourbons still shape its cities. J Urban Econ 85:52–67

Gengenbach C, Palm FC, Urbain JP (2010) Panel unit root tests in the presence of cross-sectional dependencies: comparison and implications for modelling. Economet Rev 29:111–145

Glaeser EL, Shapiro J (2003) Urban growth in the 1990s: is city living back? J Reg Sci 43(1):139–165

González-Val R (2010) The evolution of US city size distribution from a long term perspective (1900–2000). J Reg Sci 50(5):952–972

González-Val R, Lanaspa L (2016) Patterns in U. S. urban growth (1790–2000). Reg Stud 50(2):289–309

González-Val R, Lanaspa L, Sanz-Gracia F (2014) New evidence on Gibrat’s law for cities. Urban Studies 51(1):93–115

González-Val R, Silvestre J (2020) An annual estimate of spatially disaggregated populations: Spain, 1900–2011. The Annals of Regional Science, forthcoming

Henderson V, Wang HG (2007) Urbanization and city growth: the role of institutions. Reg Sci Urban Econ 37(3):283–313

Holmes TJ, Stevens JJ (2002) Geographic concentration and establishment scale. Rev Econ Stat 84(4):682–690

Ioannides YM, Overman HG (2003) Zipf’s law for cities: an empirical examination. Reg Sci Urban Econ 33:127–137

Karlsson S, Löthgren M (2000) On the power and interpretation of panel unit root tests. Econ Lett 66:249–255

Lalanne A (2014) Zipf’s law and Canadian urban growth. Urban Studies 51(8):1725–1740

Lalanne A, Zumpe M (2019) La croissance des villes canadienne et australienne guidée par le hasard Enjeux et mesure de la croissance urbaine aléatorie. Can J Reg Sci 42(2):123–129

Lalanne A, Zumpe M (2020) Time-series based empirical assessment of random urban growth: new evidence for France. J Quant Econ 18(4):911–926

Luyckx L (2010) Russische krijgsgevangenen van de nazi’s: van Displaced Persons tot vluchtelingen (voor het Sovjetcommunisme). Belgisch Tijdschrift Voor Nieuwste Geschiedenis, XL 3:489–511

Ng S, Perron P (1995) Unit root tests in ARMA models with data dependent methods for the selection of the truncation lag. J Am Stat Assoc 90:268–281

Partridge MD, Rickman DS, Ali K, Olfert MR (2008) Lost in space: population growth in the American hinterlands and small cities. J Econ Geogr 8(6):727–757

Perron P (1989) The great crash, the oil price shock and the unit root hypothesis. Econometrica 57:1361–1401

Perron P, Vogelsang T (1992) Nonstationarity and level shifts with an application to purchasing power parity. J Bus Econ Stat 10:301–320

Pesaran MH (2007) A simple panel unit root test in the presence of cross-section dependence. J Appl Econom 22:265–312

Pesaran, M. H., (2004). General diagnostic tests for cross section dependence in panels. IZA Discussion Paper No. 1240.

Phillips PCB, Shi S, Yu J (2015) Testing for multiple bubbles: historical episodes of exuberance and collapse in the S&P 500. Int Econ Rev 56(4):1043–1077

Phillips PCB, Wu Y, Yu J (2011) Explosive behavior in the 1990s NASDAQ: when did exuberance escalate asset values? Int Econ Rev 52(1):201–226

Resende M (2004) Gibrat’s law and the growth of cities in Brazil: a panel data investigation. Urban Studies 41(8):1537–1549

Ronsse S, Standaert S (2017) Combining growth and level data: an estimation of the population of Belgian municipalities between 1880 and 1970. Hist Method J Quant Interdiscip Hist 50(4):218–226

Ryckewaert M (2011) Building the economic backbone of the Belgian welfare state: infrastructure, planning and architecture 1945–1973. Uitgeverij, Rotterdam

Sharma S (2003) Persistence and stability in city growth. J Urban Econ 53:300–320

Simon H (1955) On a class of skew distribution functions. Biometrika 42:425–440

Van Meeteren M, Boussauw K, Derudder B, Witlox F (2016) Flemish diamond or ABC-Axis? The spatial structure of the Belgian metropolitan area. Eur Plan Stud 24(5):974–995

Vrielinck, S., (2000). De territoriale indeling van Belgie (1795–1963): bestuursgeografisch en statistisch repertorium van de gemeenten en de supracommunale eenheden (administratief en gerechtelijk): met de officiele uitslagen van de algemene volkstellingen. Vol. 1. Leuven University Press, 2000

Witte E, Meynen A (2006) De geschiedenis van België na 1945. Standaard Uitgeverij, Antwerp

## Acknowledgements

The authors benefited from the helpful comments and suggestions from Wenzheng Li, Ilyes Boumahdi and two anonymous referees. An earlier version of this paper was presented at the 7th International conference on Time Series and Forecasting (Gran Canaria 2021), at the 68th North America Meetings of the Regional Science Association International (virtual conference 2021), at the 61st Congress of the European Regional Science Association (virtual conference 2022) and at the XLVII Reunión de Estudios Regionales (Granada 2022), and all the comments made by participants are much appreciated. All remaining errors are ours.

## Funding

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. Ministerio de Ciencia e Innovación and Agencia Estatal de Investigación, MCIN/AEI/10.13039/501100011033 (projects ECO2017-82246-P, PID2020-114354RA-I00, PID2020‑112773GB‑I00), Gobierno de Aragón (project S39_20R, ADETRE research group), and ERDF.

## Author information

### Authors and Affiliations

### Corresponding author

## Additional information

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

González-Val, R., Ramos, A. & Standaert, S. Urban growth in the long term: Belgium, 1880–1970.
*Ann Reg Sci* **72**, 881–902 (2024). https://doi.org/10.1007/s00168-023-01226-1

Received:

Accepted:

Published:

Issue Date:

DOI: https://doi.org/10.1007/s00168-023-01226-1