1 Introduction

Urban growth literature has a long tradition. Why some cities grow while others decline is still an open question, although several theoretical explanations have been proposed. These theories can be summarised into three main drivers of growth (Davis and Weinstein 2002): the existence of increasing returns to scale, the importance of locational fundamentals, and random growth.

Each of these drivers of urban growth involves different theoretical mechanisms. The existence of increasing returns suggests the presence of endogenous mechanisms in city growth that can lead to multiple equilibria (Davis and Weinstein 2002; Bosker et al. 2007), depending on the initial conditions. Locational fundamental theory highlights the role played by geographical characteristics: the presence of a natural harbour, a specific climate, or access to the sea, among many other physical characteristics, can determine cities’ populations (for instance, Ellison and Glaeser (1999) stated that natural advantages can explain at least half of the observed geographic concentration in the US). Finally, random urban growth postulates that population growth in cities is a random variable.

Studies testing the influence of increasing returns to scale and locational fundamentals have usually relied on parametric (cross-sectional or panel data) growth regressions, applying an instrumental variable approach in most cases. The latest advances in this literature have come from the use of plant‐level data (Holmes and Stevens 2002; Barrios et al. 2006) and case studies using an identification strategy of instruments that reveals the influence of some historical events on cities’ growth path (e.g., Bleakley and Lin 2012; Garcia-López et al. 2015). However, most of these studies have adopted a short-term perspective, and even panel data analyses have considered a few decades at most.

The approach taken in the random growth literature is different. First, from the theoretical point of view, random growth can only hold as a long-run average,Footnote 1 while the influence of other factors, like locational fundamentals and increasing returns, may change (or even disappear) over time. With random urban growth, the growth process of cities tends to be multiplicative and independent of their initial size, a proposition that has become known in urban economics as Gibrat’s law.Footnote 2 Several theoretical models (Gabaix 1999; Duranton 2007; Córdoba 2008) were developed to explain the fulfilment of Gibrat’s law in the context of external urban local effects and productive shocks, associating it directly with an equilibrium situation. Therefore, city-level variables can explain temporal variation in growth rates across cities, but random growth theory provides an appropriate explanation for the long-term growth.

Second, on the empirical side, although seminal contributions (e.g., Eaton and Eckstein 1997) have also used parametric growth regressions to test Gibrat’s law, since the 2000s, several studies have proposed alternative methodologies to parametric growth models. González-Val et al. (2014) reviewed this literature, concluding that most studies today use nonparametric estimates of urban growth or unit root tests. Nonparametric estimates of growth have become popular in this literature, providing estimates of growth that vary with the initial population over the entire distribution of city sizes. However, these kernel regressions estimate the unconditional relationship between growth and size; city and time fixed effects and any other control variables are omitted. Thus, authors have carried out nonparametric analyses for cross-sectional data (Eeckhout 2004) as well as for a pool of growth rates from different time periods (Ioannides and Overman 2003; González-Val 2010).

The use of the panel data methodology and unit root tests in the analysis of urban growth, first suggested by Clark and Stabler (1991), can provide a more precise test of Gibrat’s law. This idea was emphasized by Gabaix and Ioannides (2004, p. 2358), who expected “that the next generation of city evolution empirics could draw from the sophisticated econometric literature on unit roots.” Recently, new methods have been proposed to test random growth using unit root tests. Lalanne and Zumpe (2019, 2020) apply an integrated model selection/unit root test protocol with three different model specifications (pure random growth, random growth with drift, and random growth with drift and trend) to a large sample of high-quality French decennial census data, on how the rejection of the random growth hypothesis accounts for less than one third of tested cities.

However, some empirical limitations have reduced the spread of these techniques. While several papers applied panel data unit root tests to analyse urban growth (e.g., Black and Henderson 2003; Resende 2004; Henderson and Wang 2007; González-Val and Lanaspa 2016), the list of studies looking for unit roots in individual time series of cities’ populations is quite short. Why? Unit root tests need large sample sizes (at least 40 observations) to have reasonable power (Clark and Stabler 1991). However, long time series of year-by-year city populations are usually not available, and studies on the temporal evolution of city sizes have considered decennial census data in most cases. Therefore, the lack of annual data for a sample of cities over a long time period on a consistent basis has limited the use of unit root testing in empirical work.

To our knowledge, only five studies consider annual city populations to test Gibrat’s law using unit root tests: Clark and Stabler (1991), Resende (2004), Sharma (2003), Bosker et al. (2008) and Chen et al. (2013). Clark and Stabler (1991) used data on the seven largest cities in Canada from 1975 to 1984 (10 temporal observations by city), while Resende (2004) used a panel data set with annual data for 497 municipalities in the state of São Paulo (Brazil) for the 1980–2000 period. Sharma (2003) considered a sample of 100 Indian cities for the period 1901–1991 (90 years), Bosker et al. (2008) used a dataset of 62 West German cities from 1925 to 1999 (except for five missing years during the Second World War), and Chen et al. (2013) considered 210 Chinese cities from 1984 to 2006 (23 temporal observations).

Although the efforts of these authors to obtain annual city population data and exploit the properties of the unit root tests fully are worthy, these studies still show an important limitation: they focused on the largest cities. Nevertheless, some studies have confirmed the different patterns of growth of small cities (Partridge et al. 2008; Devadoss and Luckstead 2015) and, thus, the behaviour of the largest cities cannot be extrapolated to the whole distribution of cities. Another reason to consider all cities is that some studies found differences in growth patterns between different regions within the same country. Lalanne (2014) studied the hierarchical structure of the Canadian urban system, splitting the territory into two parts (east and west), thus allowing for the identification of different dynamics.

In this paper, we take advantage of Ronsse and Standaert’s (2017) new data set of Belgian cities. This unique data set provides annual estimates of the population for all Belgian municipalities from 1880 to 1970. Thus, it allows us to carry out a robust long-term analysis of urban growth because the time dimension is long (90 temporal observations by city) and, at the same time, it contains information for all cities, covering the whole city size distribution. Therefore, as far as we know, this is the most comprehensive test of Gibrat’s law using unit root tests ever conducted.

Moreover, the Belgian case is interesting because of some specific historical characteristics of the country. As a relatively young country on the European continent, the Belgian state came into existence following a liberal revolution in 1830. Set up as a parliamentary democracy, headed by a monarch with limited powers, the Belgian state quickly became a haven for political liberalism in 19th-century Europe. At the same time, however, the young nation wanted to ensure that it had the most up-to-date information about the population living within its borders. Following the newest scientific methods, the state apparatus created a highly developed statistical department, which, whilst not unique, was well ahead of its time. The continuous efforts of this department have led to a richness of statistical data spanning the entire history of the country.Footnote 3

When studying the trends of population growth in this small state, surrounded by giants, it is also important to note that Belgium was the first industrialized country on the European continent. A combination of technological knowhow, labour, capital and readily available natural resources led to a quick-paced industrialization process, initially concentrated in the city of Ghent and in the southern part of the country (Caestecker 2015, pp. 101–103). Only after the Second World War did the industrial centre of gravity move to the north, with the arrival of many multinational companies that wanted to make use of the vicinity of sea harbours and the excellent road infrastructure of the region (Witte and Meynen 2006; Ryckewaert 2011).

Akin to this industrialization, but equally motived by political interests and goals, Belgium also became the centre of an expansive railway network from the late nineteenth century onwards. The construction of the railway had already begun in the 1830s, but the connections were initially limited to those between major urban and industrial hubs. From the 1880s onwards, however, there was a massive effort to expand the railway network to even the smallest towns. In a country of only 30,528 km2, the main rail network grew from 3000 km in 1870 to almost 5000 km in 1910. Concurrently, a 3000 km network of light rail was established that “meander within the landscape, from village to village, collecting as many intermediary passengers and goods as possible” (De Block and Polasky 2011, p.318).

The high degree of connectivity resulting from this extensive network – as well as a system of cheap train tickets for workers – ensured a steady flow of labour from the still densely populated countryside to the industrial centres of the country, without the large-scale urbanization – and accompanying politicization of the labour force – that characterized the industrialization process in Belgium’s neighbouring countries (De Meulder et al. 1999). Before the end of the twentieth century, more than one in three Belgians commuted to and from work every day (De Decker 2011).

These factors affected the distribution of economic activity and population in Belgium during the course of the twentieth century, and set the grounds for a new long-term equilibrium of the distribution of population. However, as both industrialization and completion of transport infrastructures took place at the beginning of our sample period (in the late nineteenth and early twentieth centuries) that covers almost one century, we expect to capture not the short-term transitional periods of higher or lower growth, but rather the long-term growth that may follow a random growth pattern.

The remainder of the paper is organized as follows. In Sect. 2, we describe the data set used. Section 3 explains the different unit root tests carried out, distinguishing between time series analysis of unit roots allowing for structural breaks and panel data unit root tests, and presents the main results. Section 4 concludes.

2 Data

As far as we know, only a few studies have provided long-term estimates of annually and spatially disaggregated populations at the city level: Sharma (2003), Bosker et al. (2008) and González-Val and Silvestre (2020), for Indian (100 cities, 1901–1991), West German (62 cities, 1925–1999 period with some gaps) and Spanish cities (49 capital cities, 1900–2011), respectively.

In this paper, we use the new data set by Ronsse and Standaert (2017). They constructed a data set of 2680 Belgian municipalities for the period 1880–1970. Their dataset ends in 1970, as this is the last decade in which administrative borders remained relatively constant. Specifically, in 1970 and 1971, the number of municipalities first dropped by 200. This decrease was followed by the main administrative redrawing of the municipalities in 1976, which reduced the number of municipalities by three quarters. In all, the total number of municipalities dropped from 2675 in 1970 to a mere 596 six years later (Vrienlinck, 2000). Further redrawing of municipal maps followed in 1983 and 2019 and resulted in the current total of 581 municipalities, although this value is set to change again in 2025. Given the impact of these changes on the continuity of their dataset, the authors chose 1970 as the final year.Footnote 4

To compose this data set, they combined the population census data, which are collected every ten years, with the yearly data on births, deaths and migration from the city population registers (the movement data). The less centralized nature of the latter, in combination with a tendency to underreport outward migration, means that these series cannot easily be combined without creating breaks in the data with every new census. This is illustrated quite well in Fig. 1, in which the dotted red lines show what the population estimate would be if we added the cumulative changes to the population to the last set of census data. The population in Brussels is overestimated by as many as a 100,000 people in 1900. At other times, especially in the later years of the data set, the quality of the population registers can be much higher.

Fig. 1
figure 1

Source: Ronsse and Standaert (2017)

The population of the four biggest cities in Belgium from 1880 to 1970. The estimated level of the population is indicated by the thick black line and its 95% confidence interval by the blue shaded area. The source data are plotted using the red crosses (census) and dotted line (population registers). Administrative changes to the city borders are indicated by the black asterisks.

To reconcile the two data series, Ronse and Standaert (2017) used a state-space approach, which models the mouvement data as a noisy signal of the true change in the population growth. As the mouvement is collected by each city individually, the noisiness of its signal is allowed to differ for each city. Furthermore, the model incorporates information on the changes to the administrative borders of the cities, allowing the population data to change more drastically in those years (see e.g. Brussels in 1921 in Fig. 1). The results of the models are also shown in Fig. 1, with the thick black lines showing the estimated population and the blue shaded area its 95% confidence interval. The estimated population data clearly follow the change indicated by the mouvement data as much as possible while remaining consistent with the census data.

As we noted, the period of analysis in the dataset was chosen so that the administrative borders remain consistent. Included in the sample, however, is the first wave of administrative changes of 1964, that reduced the number of municipalities from 2675 to 2586. In addition to these large-scale changes, information from Vrienlinck (2000) was used to identify smaller, ad-hoc changes to the administrative borders of municipalities. As a robustness check, cities with more than a 30% change in land area in the period considered were excluded from the analysis.

For illustrative purposes, Table 1 shows the number of cities and the descriptive statistics in census years. The minimum values of each decade indicate that we are considering all municipalities, without size restrictions, even the smallest units. Although their urban character might be debatable, Eeckhout (2004) suggested considering the whole distribution. The mean population grows over time and, at the same time, the number of cities remains stable, which points to an urbanization process. Moreover, although the standard deviation increases from 1880 to 1910, indicating increasing inequality in city sizes, it remains quite stable in the last decades, which suggests stability in the city size distribution.

Table 1 Summary of the descriptive statistics

Municipalities comprise the country’s total land area and, therefore, the entire population. During the period considered (almost a century), there was an important increase (71.2%) in the Belgian total population in our sample, from 5,517,017 in 1880 to 9,443,985. One might expect that such a huge increase also generated important changes in the city size distribution. Figure 2 shows the empirical density functions for three representative periods (1880, 1930 and 1970) estimated using adaptive kernels for our sample of all Belgian municipalities. We define the relative size of a city as the quotient between the city’s population and the contemporary average population. The graph shows the distribution of these cities’ relative sizes in log scale, with the zero value in the x-axis representing medium-sized cities.

Fig. 2
figure 2

Empirical density functions. Adaptive kernels of the relative size of the Belgian cities (ln scale)

The shape of the city size distribution changed dramatically from 1880 to 1970. In 1880, we can observe a very leptokurtic distribution with a great deal of density concentrated in the central values. However, by 1930, the distribution had lost kurtosis and the concentration had decreased. In 1970, we find a more uneven distribution with heavier tails than in previous periods, indicating the presence of more small and large cities than in earlier years. The temporal evolution of the city size distribution suggests a divergence pattern in the growth of Belgian cities, thus apparently discarding the idea random growth.

3 Methodology and results

3.1 Unit roots in city sizes

Our basic hypothesis for the long-term growth of Belgian cities is random growth (Gabaix and Ioannides 2004; González-Val et al. 2014). As mentioned above, random growth can hold as a long-run average, while the effect of other factors may change or dissipate over time. Traditional theoretical models of random growth (Champernowne 1953; Simon 1955) are based on stochastic growth processes and probabilistic models. Recent urban economic theories (Gabaix 1999; Duranton 2007; Córdoba 2008) include economic factors driving random shocks (e.g., external urban local effects or productive shocks), and these models are able to reproduce two empirical regularities that are well known in urban economics: Zipf’s and Gibrat’s laws (or the rank-size rule and the law of proportionate growth, respectively).

We follow the methodology proposed by Clark and Stabler (1991), who suggested that testing for random growth is equivalent to testing for the presence of a unit root. Starting from a simple autoregressive (AR) growth model, they assumed that the relationship between the size of a city in time period \(t\) and that in time period \(t - 1\) is

$$S_{it} = \rho_{it} S_{it - 1} ,$$
(1)

where \(S_{it}\) is the city share of city \(i\) defined as the quotient derived from dividing the city’s population (\(Pop_{it}\)) by the contemporary total population of the country, \(S_{it} = {{Pop_{it} } \mathord{\left/ {\vphantom {{Pop_{it} } {Country{\text{ pop}}_{t} }}} \right. \kern-0pt} {Country{\text{ pop}}_{t} }}\) because, from a long-term temporal perspective, it is necessary to use a relative measure of size (Gabaix and Ioannides 2004). \(\rho_{it}\) is the growth rate of city \(i\) over the period \(t - 1\) to period \(t\). This growth rate can be decomposed into two (Clark and Stabler 1991) or three components (Bosker et al. 2008): a random component, a non-stochastic component relating the current growth rate to a (possibly time-varying) constant and past growth rates, and the initial city size. Then, after some algebra, Clark and Stabler (1991) obtained the following expression:

$$\Delta s_{it} = c_{i} + \Theta_{i} s_{it - 1} + \sum\limits_{j = 1}^{k} {\beta_{ij} \Delta s_{it - j} } + \varepsilon_{it} ,$$
(2)

where \(s_{it} = \ln \left( {S_{it} } \right)\) is the log-city share of city i in year t, \(\Delta s_{it} = s_{it} - s_{it - 1}\), \(c_{i}\) is a constant, \(\beta_{ij}\) is a parameter measuring the influence of past growth rates on current city growth and \(k\) is the number of lags added to ensure that the residuals, \(\varepsilon_{it}\), are Gaussian white noises. \(\Theta_{i}\) is the key parameter that captures the effect of the initial city size on growth. Random growth would imply \(\Theta_{i} = 0\), meaning that the growth of a particular city does not depend on its initial city size. Then, the city share would be a non-stationary time series, and any sudden shock would have permanent effects on the long-run level of the population of the city (Davis and Weinstein 2002). This shows that testing for random growth (Gibrat’s law) is equivalent to testing for a unit root in city sizes. Evidence supporting a unit root (if \(\Theta_{i}\) is not significant) means that city \(i\)’s growth rate is independent of the initial size. By contrast, when \(\Theta_{i} < 0\), the path of city \(i\) will be a stationary process (mean reversion). By using Eq. (2), Clark and Stabler (1991) apply the standard Dickey and Fuller (1979) t-statistic, failing to reject random growth for the seven largest cities in Canada from 1975 to 1984.

To test for the presence of unit roots, we run the Augmented Dickey–Fuller (ADF) test (Dickey and Fuller 1979, 1981).Footnote 5 The ADF test for non-trending data is carried out by running Eq. (2). Following Ng and Perron (1995), we choose the optimal k using a “general-to-specific procedure” based on the t-statistic. The null and alternative hypotheses are, respectively, \(H_{0} :\Theta_{i} = 0\), \(H_{A} :\Theta_{i} < 0\). If \(\Theta_{i}\) is found to be equal to 0, then the city share series follows a random walk and, on the other hand, if \(\Theta_{i}\) is found to be significantly smaller than 0, the city share is stationary around \(c_{i}\).

Table 2 (second column) reports a summary of the results of the individual city unit root tests. We find that the null hypothesis of a unit root in the city share is not rejected for most of the cities in the sample. In particular, for 207 of the 2680 Belgian cities (8%), the unit root is rejected at the 10% level, thus strongly supporting the random growth hypothesis. However, a possible concern with these results is that the overall non-rejection of the unit root hypothesis may be because the standard ADF tests are biased (Perron 1989). It is possible that what we identified as a unit root process could be better modelled as a stationary process around highly permanent shocks, especially when such a long period is considered. In their study of German cities, Bosker et al. (2008) addressed this issue by allowing for the presence of a one-time structural break when testing for unit roots, using the unit root test suggested by Perron and Vogelsang (1992). Here, we follow the same approach.

Table 2 Results of unit root tests on city shares: All Belgian cities

Following Bosker et al. (2008), we estimate additive outlier (AO) models, allowing for a sudden change in mean (crash model). The AO model is appropriate when the change is assumed to take effect instantaneously (for instance, because of warfare destruction). This model is estimated by way of a two-step procedure. The first step removes the deterministic part of the series by estimating the regression

$$s_{it} = \mu_{i} + \delta_{i} DU_{t} + \eta_{it} ,$$
(3)

where \(DU_{t} = 0\) if \(t \le TB\)(the break date) and 1 otherwise. The resulting residuals are then tested for the presence of a unit root by estimating

$$\eta_{it} = \sum\limits_{j = 0}^{k} {\omega_{j} } DTB_{it - j} + \rho \eta_{it - 1} + \sum\limits_{j = 0}^{k} {c_{j} \Delta \eta_{it - j} } + \varepsilon_{it} ,$$
(4)

where \(\eta_{it}\) is the estimated residual from Eq. (3), TB is the break date and \(DTB_{it} = 1\) if \(t = TB + 1\) and 0 otherwise. Both equations are estimated using OLS for each break year \(TB = k + 2,...,T - 1\), with T being the number of observations and k being the truncation lag parameter. The null hypothesis of a unit root is rejected if the t-statistic for \(\rho\) is significant. In this case, the city share will be a stationary time series around a structural break. All but one shock (the break) would cause temporary movements of the city’s share. By contrast, if the t-statistic for \(\rho\) is not significant, the city share will be a non-stationary time series and any sudden shock will have permanent effects on the long-run level of the city share.

The results of applying the AO-model to test for a unit root in city shares under the null of a unit root versus stationarity around a possibly shifting mean under the alternative are also summarized in Table 2 (third column). Although the percentages of rejection of a unit root increase slightly, the results do not vary substantially from those of the ADF test. At the 10% level, the unit root null hypothesis cannot be rejected in favour of a stationary city share with a one-time break for roughly 85% of the cities (2274 of 2680 cities). Moreover, the breaks are significant in almost all cases; only for 51 cities is the break not significant. Therefore, evidence supporting random growth persists.

The previous analysis only captures the single most significant break in each city share series. However, since the period considered is quite long (almost one century) and variables rarely show just one break, we also attempt to determine whether the city share series show a double change in the mean. We use the test developed by Clemente et al. (1998), who based their approach on Perron and Vogelsang (1992), but allowing for two breaks. Formally, (3) and (4) change to:

$$s_{it} = \mu_{i} + \delta_{i1} DU_{1t} + \delta_{i2} DU_{2t} + \eta_{it} ,$$
(5)

and

$$\eta_{it} = \sum\limits_{j = 0}^{k} {\omega_{1j} } DTB_{1t - j} + \sum\limits_{j = 0}^{k} {\omega_{2j} } DTB_{2t - j} + \rho \eta_{it - 1} + \sum\limits_{j = 0}^{k} {c_{j} \Delta \eta_{it - j} } + \varepsilon_{it} ,$$
(6)

where \(DU_{jt} = 1\) if \(t > TB_{j}\) \(\left( {j = 1,2} \right)\) and 0 otherwise. \(DTB_{ijt}\) sets equal 1 if \(t = TB_{j} + 1\) and 0 otherwise \(\left( {j = 1,2} \right)\). \(TB_{1}\) and \(TB_{2}\) are the time periods when the mean is being modified. Like Clemente et al. (1998), we suppose that \(TB_{j} = \lambda_{j} T\) \(\left( {j = 1,2} \right)\), with \(0 < \lambda_{j} < 1\), which implies that the test is not defined at the limits of the sample, and that \(\lambda_{2} > \lambda_{1}\), which eliminates those cases in which breaks occur in consecutive periods. To test for the unit root null hypothesis, Eq. (5) is first estimated using OLS to remove the deterministic part of the variable, and then the test is carried out by searching for the minimal pseudo t-ratio for the \(\rho = 1\) hypothesis in Eq. (6) for all the break time combinations. The null hypothesis of a unit root is rejected if the t-statistic for \(\rho\) is significant. In this case, the city share will be a stationary time series around two structural breaks. Most shocks would cause temporary movements of the city’s share, but two shocks (the breaks) would cause permanent effects. Unlike the situation in which the t-statistic for \(\rho\) is not significantly different from zero, the city share will be a non-stationary time series and any sudden shock will have permanent effects on the long-run level of the city’s share of the population.

We would expect that allowing for the possibility of two endogenous break points could provide further evidence against the unit root hypothesis (Lumsdaine and Papell, 1997; Ben-David et al. 2003), especially because both breaks are significant in most of the cases: only in 45 and 49 cases (of 2680) are the first and the second break, respectively, not significant. However, the percentage of unit roots rejected at the 10% level is lower than that for the one-break test (see Table 2, fourth column). Furthermore, at the 5% and 10% significance levels, the percentage of rejection is even lower than that of the ADF test (no-break scenario). There is no clear pattern in the size of the cities for which the unit root is rejected; their mean population is slightly above the average size for all cities from 1880 to 1890, but it slowly decreases over time, and, over the whole period, these city sizes are on average only 12% below the mean population of all cities. Therefore, on average, these cities tend to be slightly below the mean city size but certainly are not the smallest units in our sample.

Regarding the dates of the breaks, Fig. 3 displays the distribution of the breaks’ timing. The y-axis shows the frequency (i.e., the total number of significant breaks detected) by year (the x-axis). It shows both the one- and two-break cases; only significant breaks are included in the graph. Across the one- and two-break cases, three distinct major events stand out: the First World War, the economic crisis of 1929–1933 and the Second World War. Most distinctly present in the one-break case, the Great Depression of the early 1930s, following the crash of the American stock exchange in 1929, undoubtedly carries with it a lot of explanatory weight (Caestecker 2015, pp. 133–134). As unemployment soared, so did the mobility of those people who were looking for alternative ways to make a living.

Fig. 3
figure 3

Distribution of the timing of the breaks

A decade earlier, the impact of the First World War is clearly visible in both the one-break and the two-break first-break cases. In Belgium, the onset of the war caused an enormous stream of refugees to flee their home towns and settle in other places, both within the Belgian territory and abroad. Some 1.5 million Belgian refugees left the country to settle temporarily in the Netherlands, France and Great Britain (Amara 2008). During and in the aftermath of the war, there was an enormous drop in the Belgian population, caused by a fall in the number of births and a peak in the number of deaths due to both the war and the Spanish flu that directly followed it (Caestecker and Vanhaute 2011, pp. 94–96). That the break in population growth is only visible in the aftermath of the war is probably related to the fact that, during the war, administrative services were severely disrupted.

Even though there was no internal displacement on such a large scale during the Second World War, the onset and aftermath of the war present a clear break in population growth, visible both in the one-break cases and in the two-break cases’ second break. The onset of the war, however, did cause some people to flee abroad, and, just before the outbreak, many thousands of refugees from Nazi Germany and the occupied territories arrived in Belgium, looking for shelter (Debruyne 2007, pp. 75–76). In the aftermath of the war, as Belgian refugees returned home, thousands of displaced persons found themselves on Belgian territory. Most of them were former prisoners of war, brought to Belgium by the Nazi occupants to perform forced labour (Luyckx 2010). The fast economic recovery of Belgium in the immediate aftermath of the war might also have led to internal migration towards the revived industrial centres, where plenty of jobs were to be found (Witte and Meynen 2006, pp. 35–36).

Two more important moments of change can only be deduced in a meaningful way from the two-break case. In the two-break case, for the first break, we can see a clear break around the turn of the nineteenth century. The last two decades of that century heralded a period of economic crisis, resulting in growing unemployment and poverty. In these turbulent times, thousands of people moved to another town, where they hoped to find a more secure income. Thousands of others wandered around the country, looking for work. Finally, Belgium also experienced a demographic boom at the end of the nineteenth century, due to a spectacular rise in life expectancy (Caestecker and Vanhaute 2011, pp. 89–94). This certainly forms part of the explanation for the sudden break in population growth that we can discern at the fin de siècle.

The final break in growth, which we can only meaningfully discern in the two-break case, as the 2nd break, is noticeable around the year 1955. For this change in the pattern, we can find no immediate explanation. The 1950s were a period of slow economic growth in Belgium, characterized by relatively high unemployment. Compared with the following years, the Golden Sixties, the mid-1950s can be seen as a last moment of high mobility before a long decade of relative stability, connected to a boom in economic growth and a historically low unemployment rate (Witte and Meynen 2006).

Finally, we run several robustness checks. First, we consider a sub-sample of cities. The geographic boundaries of cities could change over such a long time period and, thus, in some cases, the city growth or the structural break may only be reflecting the change in city boundaries. We have information about administrative changes to the boundaries of Belgian cities and the size of the municipalities from 1865 to 1970 (every 10 or 15 years). Using this historical information, we can calculate the change in land area by city and, following Glaeser and Shapiro (2003), we can set a threshold for the change in land area and exclude all cities above it. In particular, we exclude all cities with more than a 30% change (positive or negative) in land area, thus reducing the sample size to 2476 cities.Footnote 6 This correction eliminates extreme cases in which the city in 1880 is very different from the city in 1970. Then, we rerun all the unit root tests; the results are reported in Table 3. The first conclusion is that the results are quite similar to those shown in Table 2, which indicates that the issue of changing in boundaries was not driving our main results. The second implication of these results is that the support for a unit root in city shares is even stronger when we exclude changing boundary units; although the percentages of unit roots rejected for the no-break scenario test are the same as in Table 2, when one or two breaks are allowed, these percentages decrease slightly to very low figures. For instance, only in 1% of the cases is the unit root rejected at the 5% confidence level under the two-break specification.

Table 3 Robustness check: Results of unit root tests on city shares excluding cities with more than 30% of change in land area

Second, we re-run the time series analysis considering the generalised supremum ADF (GSADF) test statistic for explosive behaviour proposed by Phillips et al. (2011, 2015). This recursive test does not assume exogenous structural breaks. For all the cities, the percentage of unit roots rejected at the 5% level increases up to the 36% (976 cities out of 2680). Nevertheless, the main result holds, as we cannot reject a unit root for most cities (64%).

3.2 Panel unit root test

Most theories proposed for the underlying mechanisms that govern the city size distribution are dynamic, and they make predictions of how particular cities will behave in a panel. To take full advantage of the panel dimension of our data set, we also test for a unit root in a panel.

Some authors have also tested for the presence of a unit root using growth equations and panel data (Black and Henderson 2003; Resende 2004; Henderson and Wang 2007; Chen et al. 2013). Nevertheless, this approach has some problems and limitations (Gabaix and Ioannides 2004; Bosker et al. 2008). As Chen et al. (2013) point out, panel unit root tests with short temporal dimension can suffer from low power. Although our time dimension is higher than that in previous studies (91 temporal observations for most cities), it may be still a low number, in contrast to the number of cities in the panel.Footnote 7

Our annual data overcome one of the common limitations in this literature, namely the use of census data and decade-by-decade city populations, but an econometric issue persists: the presence of cross-sectional dependence across the cities in the panel can give rise to estimations that are not very robust (González-Val and Lanaspa 2016). Cross-sectional dependence implies that the cities are interdependent. The causes of cross-sectional dependence in the errors can be the presence of common shocks and unobserved components that ultimately become part of the error term, spatial dependence and idiosyncratic pair-wise dependence in the disturbances with no particular pattern of common components or spatial dependence (Baltagi et al. 2007). The econometric literature has established that the panel unit root and stationarity tests that do not explicitly allow for this feature among individuals present size distortions that can lead to misleading inference (Banerjee et al. 2005). Moreover, Baltagi et al. (2007) examined the performance of several panel unit root tests under spatial dependence, and found that tests assuming cross-section independence perform better.

Therefore, following González-Val and Lanaspa (2016), among all tests available in the literature especially developed to deal with this issues (Breitung and Das 2005; Breitung and Pesaran 2008; Gengenbach et al. 2010), we use Pesaran’s (2007) test for unit roots in heterogeneous panels with cross-sectional dependence.Footnote 8 The test of the unit root hypothesis is based on the t-ratio of the OLS estimate of \(b_{i}\) in the following cross-sectional augmented Dickey–Fuller (denoted by CADF) regression:

$$\Delta s_{it} = a_{i} + b_{i} s_{i,t - 1} + c_{i} \overline{s}_{t - 1} + d_{i} \Delta \overline{s}_{t} + e_{it} ,$$
(7)

where \(s_{it}\) is again the log-city share of city i in year t, \(a_{i}\) is the individual city-specific average growth rate and \(\overline{s}_{t}\) is the cross-section mean of \(s_{it}\), \(\overline{s}_{t} = N^{ - 1} \sum\nolimits_{j = 1}^{N} {s_{jt} }\). To eliminate cross-dependence, standard Dickey–Fuller (or augmented Dickey–Fuller) regressions are augmented with the cross-section averages of lagged levels and the first differences of the individual series, such that the influence of the unobservable common factor is asymptotically filtered. The null hypothesis assumes that all series are non-stationary, and Pesaran’s CADF is consistent under the alternative that only a fraction of the series is stationary. Nevertheless, the test does not allow for structural breaks in the series.

Unfortunately, due to computational limitations, we cannot run the test using the full sample of cities; we must restrict our analysis to a sub-sample of the largest cities considering groups from the 50 to the 500 largest cities. Nevertheless, this sample of largest cities represents around two-thirds of the total country population in all years. Black and Henderson (2003) and González-Val and Lanaspa (2016) highlighted one potential issue of sample selection in a dynamic framework: if the sample of cities is defined according to the largest units in the latest year, the analysis may be biased because these are the “winning” cities, namely those that have presented the highest growth rates over time. To deal with this potential problem, we define the groups of cities using two samples including the largest cities in the first (1880) and last (1970) years. Although the latter group may only include winning cities, the former group can also include information about cities that were important in 1880 but declined over time.

Table 4 shows the results of applying Pesaran's panel unit root test.Footnote 9 Panel A reports the results for the sample of largest cities in 1880. We find that the null hypothesis of a unit root is not rejected in most of the cases. The unit root is only rejected in two cases, for the groups including the 50 and 100 largest cities in 1880, when three lags and a trend are added. Panel B shows the results using a sample of the largest cities in 1970. In this case, the unit root hypothesis is not rejected in any case, even at the 10% significance level. Overall, both panels present strong evidence supporting a unit root in these samples of large cities, in line with the results obtained using time series analysis in the previous section. As a robustness check, we re-estimated the test using 500 random samples of cities. For each sample, we draw 500 cities without replacement from the full sample of 2680 Belgian cities regardless of their size. We could not reject the null hypothesis of a unit root in any case for any model specification. Again, these results (not shown) strongly support random growth.Footnote 10

Table 4 Panel unit root tests, Pesaran’s CADF statistic

Finally, in Table 5 we report the results from the Pesaran’s (2004) test for cross-sectional dependence (CD) for the same subsamples considered in Table 4. Results using the Frees’ (1995) test (not shown) are similar. The null hypothesis of cross-sectional independence is rejected in all cases, confirming the need for using a panel unit root test controlling for cross-sectional dependence.

Table 5 Pesaran’s CD test for cross-section dependence in panel data

4 Conclusions

We examine urban growth in Belgium from 1880 to 1970 using unit root tests to check random growth in the long term. Our results add to the scarce literature on unit root testing in city sizes (Clark and Stabler 1991; Sharma 2003; Bosker et al. 2008), and, to our knowledge, this is the most comprehensive test of Gibrat’s law using unit root tests ever carried out as we consider annual data for all municipalities.

Using both time series and panel data unit root tests, we obtain strong validation of the random growth hypothesis, that is, Gibrat’s law, which implies that urban growth is independent of the initial city size. This evidence supports a multiplicative growth process of cities in Belgium, and this kind of growth is consistent with many theoretical urban economics models (Gabaix 1999; Eeckhout 2004; Duranton 2007; Córdoba 2008). Nevertheless, even if city shares follow a unit root, this growth process is compatible with a degree of convergence in the evolution of city growth rates; that is, with some kind of mean-reverting component (Gabaix and Ioannides 2004).

The long-term pattern of random growth does not imply that the city size distribution has remained static over the years. On the contrary, a unit root implies that all shocks have had permanent effects on the city share, and, in particular, when allowing for structural breaks, we find that exogenous historical shocks had a permanent effect on city shares: the timing of the structural breaks coincides with some major historical events, such as the World Wars and the economic crisis of 1929–1933.

The alternative explanations for random growth considered in the literature are, basically, locational fundamental theories and increasing returns to scale (Davis and Weinstein 2002). Although our results are not specifically a test of random growth versus locational fundamentals or random growth versus increasing returns to scale, the strong support obtained for random growth clearly cast some doubts on the relevance of the two alternative theories in the Belgian case in the long-term.

This evidence is consistent with previous research for other countries. To give some examples, González-Val et al. (2014) found that random growth (i.e., Gibrat’s law) holds for most cities in the US, Spain, and Italy, and Chauvin et al. (2017) concluded that Brazil and the US both appear to adhere broadly to Gibrat’s law, but China and India do not. As random growth is a long-term pattern, it involves that the city size distribution reaches a dynamic steady state. For this reason, Chauvin et al. (2017) explained that Gibrat’s law holds in Brazil and the US because both are moderately sized places, which have long been largely urban, while China and India are much larger, and many of their cities are newer. In terms of our study using Belgian cities, it is natural that random growth emerges as the pattern of growth in the long-term, considering similar arguments. First, Belgium is a small-sized country. Second, as explained in the Introduction, both the industrialization and the completion of transport infrastructures took place at the beginning of the sample period (in the late nineteenth and early twentieth centuries), implying that urbanization of the country began in the early twentieth century. Third, as in many European countries, most Belgian cities date from ancient times, so the urban system is consolidated.

However, our results raise new research questions. Our analysis captures the long-term pattern of growth, but omits the dynamics towards steady state. In the short-term, agglomeration economics and locational fundamentals may have been important in the transition to spatial equilibrium. In Chauvin et al. (2017) aspects such as mobility, rent-earnings relationships or skill-related factors can influence the steady state configuration. Further study on these economic factors could help to improve our knowledge about urbanization in Belgium and shed some light on the population dynamics in countries currently experiencing the urbanization process.