Towns (and villages): definitions and implications in a historical setting


The measurement of urbanization rates and other uses of statistical information, for example the use of historical town growth to measure long-term economic growth, are usually based on an ad hoc population threshold to define and practically classify settlements as towns. The method, however, trades off accuracy and precision for convenience and simplicity. This paper proposes a new threshold that uses the town size distribution together with agricultural data to derive an appropriate cutoff value. The relevance of agricultural income is integrated into the classification scheme through the differential effect of local agricultural endowments on settlement size. The threshold is chosen such that the size of towns above the cutoff is statistically not influenced by local agricultural endowments, while the size of villages, which is below the threshold, is indeed shaped by them. This new approach is practically demonstrated with an application to the urban system of the nineteenth century in the German region of Saxony. This setting is used to investigate the relevance of a different classification for the development of urbanization over time and Gibrat’s law. The results demonstrate that the underlying classification scheme matters strongly for the conclusions drawn from historical urban data. They also indicate that the use of a common population threshold for a comparative analysis or temporal comparisons in a historical context increases the misclassifications of settlements.


What is a town?Footnote 1 Although towns are often used in economic history and the wider economics literature as the quintessential example of agglomeration economies, the question of what exactly constitutes a town is rarely addressed. The frequent use of this concept obviously implies applied classification schemes, but the underlying definitions on which these are based are rarely made explicit. Although a distinction between rural and urban settlements is rarely disputed, the exact nature of this split remains unsettled. Here I extend the standard ad hoc population threshold approach, demonstrate the use of data to establish the separating threshold, and show that the exact nature of the split matters for the lessons drawn from population data.

A formal distinction matters in a number of contexts. One of them is the use of towns and town-based indicators, for example the urbanization rate, in the economics literature. There they are utilized extensively as observations and proxies for other economic variables, most notably general economic growth. Urban population growth has been used to determine the growth effects (or non-effects) of a whole range of different factors; examples include governance structures (De Long and Shleifer 1993), Atlantic trade (Acemoglu et al. 2005), the invention of the printing press (Dittmar 2011), and the introduction of the potato in Europe (Nunn and Qian 2011).

Many studies, including those mentioned above, use a simple definition of towns as settlements with a population larger than a particular size threshold. The choice of this approach is mostly justified by the availability of particular town size data sets. The actual threshold is rarely addressed; usually, it is simply the size used in the data sources. Studies focusing on European history are normally based on the data sets by Bairoch et al. (1988) and DeVries (1984), who use thresholds of 5000 and 10,000, respectively.Footnote 2

Growth is, however, just one example where the literature utilizes towns. In other cases, the town size itself is the focus of attention. One example is the literature on the town size distribution looking at Zipf’s law, which describes the seeming regularity of the distribution (Zipf 1949; Gabaix 1999). Explanatory approaches for this size-rank relationship also include Gibrat’s law, which postulates the independence between size and growth of a town (Gibrat 1931; Eeckhout 2004). Other examples where town definitions are important are the patterns and explanatory mechanisms underlying the location of towns and villages (Ploeckl 2011) and the relationship between the two (Duranton 1998, 1999).

There are a number of approaches to define towns; the most relevant in this context are economic ones, focusing usually on production, and institutional ones, for example based on legal rights. A second step is to turn such a definition into a practical classification. The use of an ad hoc population size threshold is an example of how an economic definition is turned into a practical one (DeVries 1984). Instead of developing a new conceptual approach, I utilize the existing definition underlying the usual size threshold approach and improve the application of the practical classification. The central improvement is that the size threshold is no longer determined in an ad hoc fashion but derived from actual data. More specifically, I use the relative importance of (non-)agricultural production, one of the criteria underlying the definition, to achieve this. The new town population threshold is such that above it local agricultural endowments statistically do not influence location size, while below they do.

After a more extended discussion of this idea, I demonstrate its use in an actual example. In particular, the reliance of my approach on agriculture as an important local income source implies that this method is predominantly applicable to historical settings, and I therefore conduct a case study of Saxony for the period between 1834 and 1871. This independent German state organized detailed population counts, which are the basis of the data set covering all settlements, including small villages, for the time period. Additionally, detailed information about agricultural endowments is available for each location. The empirical tests result in a town threshold of 2970 inhabitants for 1834. This indicates that the usual size thresholds fail to categorize a considerable number of places as urban at the onset of the Industrial Revolution. The results for further years show a substantial increase in the threshold up to 4940 in 1871, demonstrating the importance of the industrialization process for urban structures.

The second part details the implications of this approach by investigating the impact of different classifications in a number of ways, using three sets of Saxon towns defined, respectively, by my new classification, a legal approach, and the usual size threshold of 5000 inhabitants. The differing development of urbanization rates over time as measured with the three classifications implies that the lessons about the relative economic growth of regions drawn from such rates need to take into account the underlying urban definition. Similarly, the validity of Gibrat’s law for towns, the independence of a town’s growth rate and size, is confirmed to be contingent on the underlying set of towns.

Urban definition

The existence and use of practical definitions of towns, mostly based on size cutoffs, does not imply that much attention is spent on an underlying conceptual definition. For example, there is only a limited discussion about such a definition in urban economics textbooks or the Handbook of Urban economics (Henderson et al. 2004).Footnote 3 Economists focus much more on the size distribution, in particular on questions relating to Zipf’s law, the noted empirical regularity about size and rank. Some of these papers look at the effect of threshold (Malecki 1980; Guerin-Pace 1995); Rosen and Resnick (1980) even frame it in terms of a city definition, but the link is only tenuous and the focus remains on the implications for size-rank rules.

The new economic geography has a conceptual definition, taking rural settlement to be dispersed and fully agricultural, while towns are concentrated and usually only produce manufacturing goods, but does not really use this for a practical classification scheme (Fujita et al. 1999).

A separation into towns and villages, with villages defined as those locations which are not urban, is also a strong element in the economic history literature on urban history. For example, Bairoch in his work on the global history of cities and DeVries in his work on European urbanization during the early modern period discuss characteristics of cities, which leaves all other settlements to be villages (Bairoch 1988; DeVries 1984). Their definitions follow in large parts earlier work, in particular a characterization laid out by Weber in his work on the concept of the city (Weber 1920).

Broadly, there are two relevant approaches to define towns and consequently villages.Footnote 4 These are economic ones, focusing predominantly on population and production structures of settlements, and institutional ones, focusing on legalFootnote 5 and other formal institutions located within settlements.

One of the influential economic definitions is based on DeVries (1984), who takes up ideas developed by Weber (1920) and distills them into four practical, quantifiable criteria:

  • Population size.

  • Share of non-agricultural population.

  • Diversity of non-agricultural occupational structure.

  • Density of settlement.

All four of these are continuous indicators, so an empirical application needs to pick a threshold for each of these. Yet there are no hard rules about what values are sufficient such that a location is called a town (DeVries 1984). All four are concerned with the nature of the production process at the location, focusing on population to measure its extent and spatial density as well as the occupational structure to determine its diversity.

Another approach utilizes institutional characteristics of settlements, most prominently specific town rights. Such a legal town status can be a purely formal characteristic without any particular real-world consequences, or it can actually cause an institutional difference between towns and villages. Examples are governance structures, tax and fiscal issues, rights to trade fairs, or military and security issues. Such legal characteristics feature prominently in Europe’s urban history. One of the major problems in utilizing such a definition, however, is the structural differences between regions; what passed for a town in England did not necessarily do so in German states.Footnote 6

The discussion so far has focused on a conceptual definition of towns. The second component is to turn that into an actual classification, in particular to identify settlement characteristics suitable to empirically classify locations as towns or villages. The economic definition by DeVries is usually reduced to its first criterion, population size. DeVries uses 10,000 and Bairoch uses 5000 as thresholds for their databases of historical town sizes in Europe.Footnote 7 Both are ad hoc thresholds without a definite justification for the value. The main argument is that they are high enough to find sufficient historical data, but small enough to minimize the error of misidentifying villages as towns. The legal definition appears to be more straightforward: settlements either have town rights or they do not. However, locations with town rights might not have had all of them, while some villages might have had certain town rights, for example the right to hold trade fairs. A practical implementation therefore needs to select one particular legal or institutional characteristic.

My classification is based on the above-mentioned definition by DeVries (1984) with its four central criteria. The usual ad hoc size-based classification implies that the latter three are not taken into account. My new classification expands on this by incorporating information on the importance of agriculture, therefore explicitly covering the second criterion as well. Two issues remain: both criteria suffer from the same undefined threshold problem, and the historical share of non-agricultural population is often not as readily available as population size.

These issues are overcome in two steps. First, instead of using two separate criteria, I combine the two and use the relevance of agriculture for town income to inform the selection of a threshold for population size. And second, instead of using a sectoral occupation share, which requires essentially an occupational census, I utilize local agricultural endowments to derive a threshold. These endowments are quite consistent over time, so modern data are suitable, and usually cover substantial territories, which implies better data availability.

Agricultural endowments obviously influence the extent, structure, and characteristics of local agriculture. Consequently, they influence local income and the size of the local population working in agriculture. The difference between villages and towns is then the following: for villages, this influence on the agriculture-dependent population shapes the size of the whole settlement, while for towns it has no significant impact on total population.

The main assumption is the relative importance of local endowments. Local implies the direct vicinity rather than the larger hinterland region. Endowment quality characterizes the land which is used by a settlement’s farmers for their production. This focus is necessary to link endowments directly with the income of the settlement population. These endowments essentially determine the extent of local agricultural production and therefore the population sustained by agriculture in this particular location. This implies that the size of settlements for which agricultural production is a major source of income, i.e., villages, should be affected by local endowments, as these influence agricultural population, while for all other settlements, i.e., towns, they should not really matter for the population. Technological change affects the precise nature of the link over time, but as long as the constraint relationship is consistent over space the approach underlying the definition holds.

If towns specialize in the production of non-agricultural goods, they obviously rely on trade with the countryside for the necessary food supply.Footnote 8 If this relationship between a town and its hinterland is assumed to be closed, like an island, then the regional agricultural conditions present a production constraint for total population. But these regional conditions are not necessarily correlated with the local conditions around the town, so the number of urban residents is only restricted by the production capacity of the hinterland, not by that of the town’s immediate vicinity. This idea, scaled up to whole urban systems, is used to measure countries’ agricultural productivity by the number of people one agricultural worker supplies food for (Wrigley 1985; Allen 2000). While this works for whole states, for individual towns the ability to trade with multiple villages and towns implies that the local agricultural production is not a binding constraint on the availability of food and therefore on population size. Additionally, there are other, potentially dominating, non-agricultural constraints on town population, ranging from public health and mortality to institutional restrictions like the influence of guilds (Duranton 1999). This implies that for towns agricultural production does not matter significantly, either as a constraint on total population or indirectly as an income source for a significant share of it.

Practical implementation

Following this conceptual discussion, this section presents the practical application to a historical case, namely Saxony in the early nineteenth century. The example demonstrates how such a threshold can be derived.Footnote 9

Saxony 1834

The main data required to determine the threshold value is the size distribution of all settlements. I utilize a recently collected data set,Footnote 10 which covers Saxony in the middle of the nineteenth century. In particular, it contains the years 1834, 1843, 1852, 1861, and 1871, covering the time period between Saxony’s entry into the Zollverein, the 1834 customs union between German states, and the loss of its independence with the founding of the German empire in 1871. Saxony was a midsized German state with a long history. The central territories, which are part of this data set, were under its control for about four centuries, in parts even longer. It was a center of international trade, controlled a number of trade routes, and had a major trade fair at the city of Leipzig. Its economic structure was fairly diverse: in 1849, agriculture employed 32 % of the population, crafts and manufacturing 51 %, trade 3 %, transport 2 %, art and science-related occupations 4 %, and household servants 3 % (Bureau 1854).

Saxony was one of the early industrializing German states, entering the industrial revolution in the first decades of the nineteenth century (Kiesewetter 2007; Forberger 1982). As a center of the industrialization process within Germany, its economy changed substantially during the four decades covered here. Since industrialization was predominantly an urban phenomenon, this transformation also changed the particular nature of the production process, and consequently occupations and incomes, in the towns and cities within Saxony.

Saxony was relatively densely settled in comparison with other German and European states (Kiesewetter 2007). The main settlement process had started by the fourteenth century, and by the sixteenth century the set of settled locations was quite stable (Blaschke 1967). Furthermore, its local governance structures included the legal institution of township, granting a number of settlements specific institutional structures and rights. The government reformed municipal governance in 1832 and classified locations directly into towns and villages. This initially affected town governance and economic regulations, for example taxation, but with further economic and administrative changes its importance faded during the nineteenth century and was fairly meaningless by the time of the empire in 1871. The existence of this legal classification, however, allows the comparison of size-based approaches with an institutional one. Furthermore, the nature of its settlement process was such that individual homesteads clustered strongly into villages and towns; settlements had therefore clear boundaries with people’s houses not being located on their respective farming land. The conclusion of the process some two centuries before the time in question implies that there were no significant parts of the state which were not settled or not under some form of cultivation (Blaschke 1967).

Although the tax system allowed for some earlier population calculations, the first systematic population count was conducted in 1834 as a consequence of Saxony’s entryFootnote 11 into the Zollverein customs union.Footnote 12 Counts were then conducted in a 3-year rhythm until the empire in 1871. At the turn of the twentieth century, officials of the Saxon statistical office published short histories of Saxon towns and villages in the nineteenth century including all available population data (Waechter 1901; Lommatzsch 1905), which are used for this data set. In total, it contains 3579 locations, as depicted in Fig. 1, with a total population of 1.60 million in 1834, which increased to 2.56 million in 1871. The location population ranged in 1834 from 6 to 73,610 with an average of 447 and a median of 201. By 1871, this had changed to a range from 7 to 177,100 with an average of 716 and a median of 261.

Fig. 1

Settlements in Saxony 1834. The map depicts all settlement locations within Saxony in 1834

Each location is referenced with geographic coordinates,Footnote 13 usually a central spot, which allows it to be linked with a number of geographic data, including land quality. The main component is information about the aptitude of local land for agricultural and pastoral purposes, which is based on extensive field surveys undertaken by Saxon governments in the middle of the twentieth century (Blaschke and Klasse 1998). The survey combined various soil, water and climate indicators to derive maps of land quality, in particular containing separate values for the suitability for farming as well as pasture purposes. The results were then used by the Saxon authorities to create an average value for each current municipality, as depicted in Figs. 2 and 3. There are about 1600 of those, which implies that each observation combines the average for just over two villages. The survey classifies all locations with values for farming as well as pasture purposes on a scale from zero to 100. Additionally, data about elevation, ruggedness,Footnote 14 the proximity of a river, and the proximity of the Elbe, the main navigable river, are included in the data set as geographic controls.Footnote 15

Fig. 2

Farm land quality. The map shows the farm land quality with values ranging from 0 (black) to 100 (white)

Fig. 3

Pasture land quality. The map shows the pasture land quality with values ranging from 0 (black) to 100 (white)

Distributional characteristics

A classification into towns and villages implicitly assumes that the complete distribution can be separated into two sections. A size-based definition obviously simply cuts the size distribution at the threshold value, while one based on legal town status does not exclude the possibility of the smallest towns being smaller than some villages. Nevertheless, in both cases the full set of settlements is separated into two distinct ones.Footnote 16

Although such a break does not necessarily go against approaches which look at the distribution as a whole (Eeckhout 2004), it fits better with those that focus on subsets (Levy 2009). Two of the main arguments Eeckhout (2009) presents for the former are the statistical conformity of the empirical distribution and the indeterminacy of the breakpoint. The selection of the size threshold based on agricultural endowments creates a data-based breakpoint, thereby addressing the second point. Additionally, I also investigate the first issue. Eeckhout uses modern US data when he demonstrates that the size distribution of all settlements fits a log-normal distribution. I test the same for Saxony for a number of years between 1834 and 1871, and in all cases, the hypothesis can be rejected with a 99 % confidence level. This rejection implies that the size distribution of Saxon settlements did not follow a log-normal distribution, which negates the applicability of the underlying theoretical approach of Eeckhout (2004). This points toward the suitability of an approach that segments the total distribution into at least two classes.
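The text does not state which test of log-normality was used. As an illustration only, the following minimal sketch applies a Jarque-Bera test to log sizes; this is a swapped-in technique and not necessarily the author's. It relies on the fact that log-normal sizes have normally distributed logs. The function names are hypothetical.

```python
import math


def jarque_bera(values):
    """Jarque-Bera statistic and asymptotic p value for normality."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (population versions).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    return jb, math.exp(-jb / 2.0)


def lognormality_test(populations):
    """Assumed helper: test settlement sizes by testing their logs."""
    return jarque_bera([math.log(p) for p in populations])
```

A p value below 0.01 from `lognormality_test` would correspond to a rejection at the 99 % confidence level in the sense used above. The asymptotic approximation is only reasonable for large samples, which a data set of several thousand settlements would provide.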

Determining the threshold

Taking up the idea developed above, I apply this classification approach by using the following empirical specification

$$\begin{aligned} \log \left( {Pop} \right) & = \alpha + \beta_{f} Farm + \beta_{p} Pasture + \beta_{fp} Farm \times Pasture \\ & \quad + \delta Large_{t} + \gamma_{f} Farm \times Large_{t} + \gamma_{p} Pasture \times Large_{t} \\ & \quad + \gamma_{fp} Farm \times Pasture \times Large_{t} + \lambda_{E} Elevation + \lambda_{RU} Ruggedness \\ & \quad + \lambda_{RI} River + \lambda_{S} Elbe + \varepsilon \\ \end{aligned}$$

where Pop is the size of each settlement in the particular year in question and Farm and Pasture represent the location’s soil quality with regard to farming and pasture purposes. \(Large_{t}\) is a dummy variable indicating whether the location in question is larger than the size threshold t. The remaining four terms are geographic controls for elevation, ruggedness, rivers and the Elbe, respectively. The estimation is done separately for a range of possible size threshold values.
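To make the structure of the regressors concrete, the following minimal sketch builds the row of explanatory variables for one settlement and one candidate threshold. It reads the \(\gamma_{p}\) term as Pasture interacted with the dummy, which is what the joint test over i = f, p, fp implies. The function name, argument names and the coding of the controls are assumptions for illustration.

```python
def design_row(farm, pasture, pop, threshold,
               elevation, ruggedness, river, elbe):
    """Regressors for one settlement, in the order of the specification."""
    # Large_t dummy: 1 if the settlement is larger than the threshold t.
    large = 1.0 if pop > threshold else 0.0
    return [
        1.0,                     # constant (alpha)
        farm,                    # beta_f
        pasture,                 # beta_p
        farm * pasture,          # beta_fp
        large,                   # delta
        farm * large,            # gamma_f
        pasture * large,         # gamma_p
        farm * pasture * large,  # gamma_fp
        elevation,               # lambda_E
        ruggedness,              # lambda_RU
        river,                   # lambda_RI
        elbe,                    # lambda_S
    ]
```

For a settlement below the threshold all four dummy-related entries are zero, so only the \(\beta\) coefficients and the controls enter its predicted log size.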

This specification determines the size threshold \(\hat{t}\) such that the influence of local agricultural endowments on the size of the settlement is different above and below it.Footnote 17 This is formally done by using an F-test for the joint hypothesis that \(\beta_{i} + \gamma_{i} = 0\) for \(i = f, p, fp\). If the hypothesis is not rejected, the test indicates that the combined effect of the agricultural variables on the size of settlements below the threshold is offset for those settlements above it. Since the test in itself only indicates whether the combined effects offset each other, another test is necessary to see whether there is an effect below the threshold, above the threshold, or in neither of the two. If the hypothesis of the offset of the joint effect cannot be rejected, an F-test rejecting the joint hypothesis that \(\beta_{i} = 0\) for \(i = f, p, fp\) is sufficient to show that there was an effect of agricultural variables below the threshold but not above.
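The logic of the offset test can be seen by differentiating the specification with respect to one endowment variable, here Farm, separately for settlements below and above the threshold. This is a short worked step under the specification as stated, not an additional result:

$$\begin{aligned} \left. \frac{\partial \log \left( {Pop} \right)}{\partial Farm} \right|_{Large_{t} = 0} & = \beta_{f} + \beta_{fp} Pasture \\ \left. \frac{\partial \log \left( {Pop} \right)}{\partial Farm} \right|_{Large_{t} = 1} & = \left( \beta_{f} + \gamma_{f} \right) + \left( \beta_{fp} + \gamma_{fp} \right) Pasture \\ \end{aligned}$$

If the sums of the \(\beta\) and \(\gamma\) coefficients are zero, the second expression vanishes for every value of Pasture, so farm land quality has no effect on the size of settlements above the threshold, while the first expression can still be nonzero for those below it.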

The estimation of the specification and the hypothesis tests are repeated for a range of threshold size values. I test the full range of values from 1000 to 5000 in steps of 10.Footnote 18 This incremental change reveals quite precisely the largest settlement size satisfying these two tests, namely the existence of an effect below the threshold and the offset above.
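The selection step of this scan can be sketched as follows, assuming the two p values have already been computed for every candidate threshold by whatever regression routine is used. The function name and the significance levels are assumptions; the text only fixes the candidate grid and the rule of taking the largest size that satisfies both tests.

```python
def select_threshold(scan, offset_alpha=0.05, below_alpha=0.01):
    """Return the largest threshold passing both tests, or None.

    scan: iterable of (threshold, p_offset, p_below) tuples, where
      p_offset is the p value of the joint test beta_i + gamma_i = 0 and
      p_below is the p value of the joint test beta_i = 0.
    The default alpha levels are illustrative assumptions.
    """
    passing = [
        t for t, p_offset, p_below in scan
        if p_offset >= offset_alpha  # offset above the threshold not rejected
        and p_below < below_alpha    # effect below the threshold detected
    ]
    return max(passing) if passing else None


# Candidate thresholds as described in the text: 1000 to 5000 in steps of 10.
candidates = list(range(1000, 5001, 10))

# Made-up p values for three candidates, purely to show the mechanics.
toy_scan = [(1000, 0.20, 0.001), (1010, 0.06, 0.001), (1020, 0.04, 0.001)]
chosen = select_threshold(toy_scan)
```

In the toy scan the candidate 1020 fails the offset test, so the largest passing candidate, 1010, is selected.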

Figure 4 shows the results for Saxony’s population data of 1834 by plotting the p value of the offset test against the respective threshold value. The horizontal line indicates a p value of 0.05, which intersects the p value graph at a value of 2970. The test about the effect of agricultural variables on the size of settlements below the threshold rejects the hypothesis of no effect at the 99 % confidence level. These results imply that in 1834 locations below 2970 inhabitants show a statistically significant influence of agricultural endowment variables on location size, while those above do not. This difference is also given in Table 1, which contains the results of estimations of size on agricultural endowments and geographic controls for all locations, villages and the towns implied by this threshold. The pattern clearly shows an impact for villages but not for towns. Repeating the analysis with either the Farm or the Pasture value as the only agricultural endowment variable results in thresholds of 2980 and 2990, respectively, as demonstrated in the online appendix,Footnote 19 indicating the robustnessFootnote 20 of the analysis.

Fig. 4

Urban threshold 1834. The figure plots for the year 1834 threshold values used in the estimation versus the p values for the joint test that the combined agricultural endowments matter for location size below the respective threshold but not above

Table 1 Influence of agricultural endowments on location size

Consequently, an agricultural endowment-based definition of towns and villages implies a comparatively low size threshold for this setting. For Saxony in the early nineteenth century, this classification scheme indicates 57 settlements to be towns, which represent 1.6 % of all locations. The average size of these towns is 7094, with a median of 4297. With a combined population of 404,357, this implies a degree of urbanization of 25.8 %.
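The summary figures in this paragraph follow mechanically from the classification; for instance, 57 of 3579 locations is about 1.6 %. A minimal sketch of the arithmetic, with a hypothetical function name and made-up settlement sizes, could look like this:

```python
def urban_summary(populations, threshold):
    """Classify settlements by a size threshold and summarize the towns."""
    # A settlement counts as a town if it is larger than the threshold.
    towns = [p for p in populations if p > threshold]
    urban_population = sum(towns)
    return {
        "n_towns": len(towns),
        "town_share": len(towns) / len(populations),
        "urban_population": urban_population,
        "urbanization_rate": urban_population / sum(populations),
    }


# Made-up sizes for five settlements, not taken from the Saxon data.
toy_sizes = [200, 400, 900, 3000, 7500]
summary = urban_summary(toy_sizes, threshold=2970)
```

With these toy sizes two settlements exceed the threshold, giving a town share of 0.4 and an urbanization rate of 10,500 / 12,000 = 0.875. The same function can be evaluated with a threshold of 5000 to see how strongly the measured rate depends on the cutoff.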

The resulting threshold is a credible value, on the lower side but still clearly above any potential lower bound, especially in comparison with legally defined towns, which have a median size just above 2000 inhabitants. This also implies that the relative size of the type of error committed changes. One of the reasons behind the selection of the ad hoc threshold is to avoid type II errors, namely falsely identifying a settlement as a town when in fact it is a village (DeVries 1984). This obviously neglects type I errors, namely failing to identify settlements as towns when they actually are. The resulting new threshold clearly shifts the focus toward a more balanced view on the two errors for the time period until the Industrial Revolution.

The same empirical method is used to determine the set of towns implied by the population distribution and agricultural endowments for other years. Figure 5 shows the respective graphs of p values for the years 1843, 1852, 1861 and 1871. The implied threshold values increased substantially over time, rising from 2970 to 3620 by 1843, then to 3950 in 1852. Nine years later in 1861, the value is 4150, and at the end of the period in 1871, the threshold is 4940. Over the course of the Industrial Revolution, the threshold increased consistently from about 3000 to about 5000, the value commonly used as an ad hoc threshold.

Fig. 5

Urbanization threshold over time. The figure depicts the results shown in Fig. 4 for additional years, namely 1843, 1852, 1861 and 1871

This increase in the size threshold over the main periods of industrial development in the region has the implication that some locations were considered to be towns at the onset but were reclassified as villages by the end of the period. More precisely, the number of towns drops from 57 in 1834 to 47 in 1843 before it rises again to 54 in 1852, 56 in 1861 and 57 in 1871.

A possible explanation for this phenomenon is an increasing specialization between the agriculturally dominated countryside and industrializing core urban areas. Smaller locations with a diversified occupational structureFootnote 21 before the Industrial Revolution might not have been able to compete with products produced in fully industrializing large cities and consequently might have shifted their production toward a direct agricultural focus as well as related production and processing steps.


The new definition results in a considerably different town population threshold than the one commonly used. But does it actually matter? The following section explores the implications of different definitions in two respects. The first is the development of urbanization over time. The second looks at the growth of towns. The literature on the town size distribution shows that Gibrat’s law, the independence of relative growth and size of a town, can explain the existence of Zipf’s law. I contrast whether a basic test of Gibrat’s law holds for the different sets of towns. Neither statistic is designed to prove that one way to determine towns is necessarily better, but both can demonstrate what difference the underlying classification scheme actually makes.


The commonly accepted starting point for Saxony’s industrialization is the early nineteenth century (Kiesewetter 2007; Forberger 1982). The onset of such a transformation usually had strong effects on the development of urban centers; a prime example is Lancashire in England. Although no Saxon town reached the size of these large urban agglomerations in Britain, Saxon towns were recognized as major centers of industrialization within German territories. Chemnitz’s nickname as the “Saxon Manchester” is a prime example of this. One indicator for this industrial development is the rate of urbanization.

The measured degree of urbanization is obviously affected by the definition of towns. I compare the rates based on different classifications to illustrate the effect of changing the definition. This comparison not only looks at the 1834 level but also uses the threshold values to calculate the urbanization rate for the other time periods mentioned above. The resulting degree of urbanization is compared to those based on the size threshold of 5000 inhabitants and the legal definition in Fig. 6, which plots the calculated rates using the different concepts.

Fig. 6

Urbanization rates. The figure shows the implied urbanization rates for three classifications: the new one, the legal definition and the size threshold of 5000

Since the new thresholds are between the low values of the legal definition and the high one of the 5000 ad hoc line, the respective urbanization rate is ordered inversely. As demonstrated in Fig. 6 especially in 1834, the new measure is fairly close to the middle between the other two measures, indicating the compromise between the overidentification of towns through town rights and the underidentification of the 5000 threshold.

The paths of the graphs over the three decades show two particular characteristics: first, the new measure and the 5000 threshold converge, and second, both grow faster than the rate implied by the legal definition. The convergence between the two rates is due to the threshold values of the new measure increasing over the time period. This indicates that the size threshold of 5000 might be a good value for urban systems after a substantial part of the transformations of the Industrial Revolution has taken place. The substantial difference in growth pattern between these two size-based approaches and the legal definition consequently indicates that the latter might be a better fit for the period leading up to the Industrial Revolution, as its static nature does not fit well with the changes caused by industrialization.

This pattern shows that any long-term analysis of economic development using urbanization rates faces the issue that the changes in the relationship between the economy and the urban system might obscure the actual underlying economic development.

The differing thresholds over time furthermore have an important implication for the comparison of urbanization between regions: using the same threshold may not be appropriate, since it introduces systematically different classification errors in the regions in question. If two regions have substantially different relationships between the economy and urbanization, for example if only one region has started to industrialize, then the resulting urbanization rates are misleading when used to compare the underlying economic conditions.
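A small sketch can make the differential error concrete. Both regions, their populations and their "appropriate" thresholds are hypothetical; the only point is that a common cutoff misclassifies settlements in the region whose appropriate threshold differs from it and none in the other.

```python
def misclassified(populations, applied_threshold, appropriate_threshold):
    """Count settlements labelled differently by the two thresholds."""
    return sum(
        (p >= applied_threshold) != (p >= appropriate_threshold)
        for p in populations
    )


# Invented settlement sizes for two regions at different stages of development.
region_a = [1200, 2800, 3400, 4100, 4800, 6500, 12000]  # assumed appropriate cutoff: 3000
region_b = [1500, 2600, 3900, 5200, 7400, 9000]         # assumed appropriate cutoff: 5000

COMMON_THRESHOLD = 5000

errors_a = misclassified(region_a, COMMON_THRESHOLD, 3000)
errors_b = misclassified(region_b, COMMON_THRESHOLD, 5000)

print(f"region A: {errors_a} settlements misclassified")
print(f"region B: {errors_b} settlements misclassified")
```

Under these assumptions the common threshold treats three towns in region A as villages and makes no error in region B, so the measured gap in urbanization between the two regions partly reflects the classification rather than the economies.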


Shifting from total urban population to the size of individual towns, I investigate the implications for the growth of towns. A major strand of the literature on town growth focuses on the statistical properties of the size distribution, in particular Zipf’s law. A central explanation for the emergence of this regularity is Gibrat’s law, which postulates that the size and growth of a town are independent. I test whether this proportionate growth hypothesis holds for the different sets of settlements classified as towns (see footnote 22). The time frame of this analysis is from 1834 to 1871, taking the growth over this period as the dependent variable. This period is long enough to smooth out the impact of short-term fluctuations and ends early enough to avoid the issue of incorporations of villages and small towns into larger cities. The following cross-sectional specification is estimated separately for each set of towns as defined by the three definitions:

$$\begin{aligned} \frac{{Pop_{1871} }}{{Pop_{1834} }} & = \alpha + \beta Pop_{1834} + \lambda_{E} Elevation + \lambda_{RU} Ruggedness \\ & \quad + \lambda_{RI} River + \lambda_{S} Elbe + \varepsilon \\ \end{aligned}$$

Table 2 shows the results for the three estimations, illustrating the differences between the newly developed classification, the legal definition and the 5000 threshold. The hypothesis of proportional growth, which implies β = 0, can be rejected for the towns classified by the new threshold as well as by legal town rights, while it cannot be rejected for towns based on the size threshold of 5000 inhabitants (see footnote 23). This difference confirms that the acceptance or rejection of the hypothesis is contingent on the data sample and therefore on the underlying principles informing its creation. Furthermore, a simple separation of the whole distribution into two categories might not capture the full heterogeneity of the growth patterns, and a more granular look at different parts of the distribution might be informative and advisable.
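The test can be illustrated with a self-contained sketch that estimates the specification above on simulated data for two cutoffs. Everything in it is an assumption made for illustration: the data are invented (growth is built to depend on size only below 5000 inhabitants), population is scaled to thousands and elevation to hundreds of metres for numerical stability, and the small least-squares routine with classical standard errors is a stand-in rather than the estimator used for Table 2.

```python
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(a)
    m = [list(row) + [float(i == j) for j in range(k)] for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(k):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[k:] for row in m]


def ols(y, X):
    """Ordinary least squares; returns (coefficients, classical standard errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se


def simulate(n, rng):
    """Invented settlements whose growth depends on size only below 5 (thousand)."""
    rows = []
    for _ in range(n):
        pop = rng.uniform(0.5, 20.0)        # 1834 population, in thousands
        elevation = rng.uniform(1.0, 8.0)   # hundreds of metres
        ruggedness = rng.uniform(0.0, 1.0)
        river = float(rng.random() < 0.3)   # dummy: located on a river
        elbe = float(rng.random() < 0.1)    # dummy: located on the Elbe
        growth = 1.2 + 0.1 * min(pop, 5.0) + rng.gauss(0.0, 0.05)  # Pop1871 / Pop1834
        rows.append((growth, [1.0, pop, elevation, ruggedness, river, elbe]))
    return rows


rng = random.Random(0)
data = simulate(400, rng)

for cutoff in (3.0, 5.0):  # hypothetical town thresholds, in thousands
    sample = [(g, x) for g, x in data if x[1] >= cutoff]
    beta, se = ols([g for g, _ in sample], [x for _, x in sample])
    t_beta = beta[1] / se[1]  # proportional growth implies beta = 0
    print(f"cutoff {cutoff:.0f}k: n={len(sample)}, beta={beta[1]:.4f}, t={t_beta:.2f}")
```

In large samples an absolute t-statistic above roughly 1.65 corresponds to rejection at the 10% level. Because the simulated size effect is confined to settlements below 5000, the lower cutoff mixes settlements with and without a size effect, which is the kind of sample dependence the comparison in Table 2 points to.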

Table 2 Town growth on town size


Towns, either as a direct object of study or as an indicator of other economic characteristics, have long been of interest to economists and economic historians. Empirical analysis, however, requires a classification scheme that defines what exactly constitutes a town. Numerous studies, especially on long-term growth in European history, use ad hoc size thresholds, most commonly 5000 inhabitants. However, authors who compiled historical population data for Europe indicated that this likely misses a considerable number of urban areas with populations below this size. This paper uses the case of one European state, Saxony, to empirically find an appropriate size boundary, utilizing the advantages of a historical setting. The underlying idea for determining the threshold is the (missing) impact of agricultural endowments on settlement size. The results suggest that at the onset of the Industrial Revolution a lower boundary, in the vicinity of 3000 inhabitants, reflects the conceptual definition underlying the separation into towns and villages more precisely.

The method requires two major assumptions to be applicable: villages and towns need to be meaningful individual economic units, and agriculture needs to be an important income source for the rural population. These conditions apply to European states before the shift of labor out of agriculture concluded at some point during the twentieth century, but they will also be satisfied by many regions around the world, in some cases even today.

This extension admittedly has higher data requirements (see footnote 24), but it demonstrates that the use of a simple, high and common threshold for multiple areas not only neglects a considerable part of the total urban population, but also misses adjustments for different urbanization thresholds in different regions. The results, especially the notion of changing thresholds over time, suggest that the boundary for urbanization is not static but varies over time, and therefore also between regions. In particular, the changes brought by the industrialization process seem to have a strong impact, indicating a transformation of the relationship between the economy and the urban system.

This also implies that this modification of the size threshold approach has two main advantages. First, it opens up the possibility of comparing urbanization developments based on a more precise determination of the rate, avoiding the differential errors introduced by assuming an identical threshold for all regions. Second, observing the pattern of size thresholds over time allows inference about the development of the urban system and especially the structural transformations of the economy through industrialization or the rise of services.

Urbanization rates and Gibrat’s law are only two examples where economic implications are deduced from urban data. The results here demonstrate a difference not only between different size-based approaches but also between conceptually different methods, in particular size-based and institutional ones.

Other possible areas of application are spatial patterns of population and consequently of economic activity. Preliminary calculations using spatial information about the villages indicate that the relevance of the endowment factors underlying the spatial structure of the urban system changes depending on the classification scheme used. This has implications for the debate about the relative role of endowments and market access for urban structures. Similarly, the relative patterns of villages and towns, for example whether the former cluster around the latter, may shift, affecting the conclusions drawn about local agricultural trade and industrialization patterns.

These instances consequently imply that studies using urban data as indicators need to address the ramifications of their choice of an urban definition. One possible consequence could be to include tests for heterogeneous effects, either based on different urban definitions or derived from the data. The empirical implications indicate that such tests for heterogeneous effects might also offer opportunities to develop urban definitions and classification schemes based on the ideas and concepts underlying the theories investigated with urban data.


  1. Although the terms town and city have different connotations in certain contexts, I do not make this distinction here and use the terms interchangeably.

  2. These data sets, especially Bairoch’s, contain some information about the population of included towns when these were below the 5000 threshold, which is used in some studies. Besides selection issues, there is also the problem of precision since Bairoch rounds the numbers to full thousands.

  3. Historically this seems to be similar in other fields, for example sociology (Martindale 1958).

  4. There exist more than these; for example, Tilly (1976) develops another one, which focuses on concepts of market structure and relationships.

  5. See Cantoni and Yuchtman (2014) for some historical context on the development of these rights in Germany and Ploeckl (2012) for an application using such a concept.

  6. Such a classification has been used in studies focusing on one country only, for example Ploeckl (2010a).

  7. They do attempt to provide the size of towns for the time before they reach the thresholds, which, for some authors using these data, gives the appearance of a lower threshold.

  8. The production of non-agricultural goods can include food processing, just not the initial production of food inputs.

  9. The specific relevance of the two criteria that are not used in the classification process is discussed in the online appendix, Sect. 2.

  10. The set is based on data used in Ploeckl (2010a).

  11. See Ploeckl (2015) for a discussion of this entry.

  12. The revenue distribution scheme of the Zollverein was based on a state’s population; the states therefore agreed to consistent methods for population counts (Hahn 1984; Henderson 1984; Ploeckl 2010b).

  13. These are either official coordinates from the Saxon Landesvermessungsamt, from a historical place register or selected by the author by inspecting various maps (Blaschke and Baudisch 2006).

  14. Ruggedness is measured as the standard deviation of elevation within a 2-km radius.

  15. Data sources are described in the online appendix, Sect. 1.

  16. Studies have highlighted the special role of the largest town, often the capital (Ades and Glaeser 1995). Since this concerns only one specific town, the impact is limited and therefore negligible in this context.

  17. Table 1 establishes that endowments have an effect on the size of all villages.

  18. As a robustness test I also tested values up to 10,000; the results did not change.

  19. Conducting the analysis without the additional geographic variables results in a somewhat lower threshold below 2000 inhabitants; the details are shown in the online appendix as specification III.3.

  20. Clustering the observations according to modern parish boundaries leads to a range of threshold values between 3360 and 4580 for the different specifications; see figures A4 to A7 in the online appendix. Due to the very large number of observations that are a single location in a modern parish, the regular estimation is the preferred one.

  21. Waechter (1901, p. 194) suggests that a number of smaller towns had specialized in some non-agricultural sectors but lost that specialization over the course of the nineteenth century. This observation fits well with the proposed explanation.

  22. A number of studies, for example Rosen and Resnick (1980), do find an impact of a different size threshold on these statistical regularities, which implies that this section predominantly confirms results known to the literature.

  23. At a 10 % significance level, the hypothesis of proportional growth can be rejected for a town size threshold below 3430.

  24. See the online appendix for an alternative specification that has somewhat lower requirements but fairly similar results.


  1. Acemoglu D, Johnson S, Robinson J (2005) The rise of Europe: Atlantic trade, institutional change and growth. Am Econ Rev 95(3):546–579

  2. Ades AF, Glaeser EL (1995) Trade and circuses: explaining urban giants. Q J Econ 110(1):195–227

  3. Allen RC (2000) Economic structure and agricultural productivity in Europe, 1300–1800. Eur Rev Econ Hist 4(01):1–25

  4. Bairoch P (1988) Cities and economic development: from the dawn of history to the present. University of Chicago Press, Chicago

  5. Bairoch P, Batou J, Chevre P (1988) Population des villes europeennes de 800 a 1850. Droz

  6. Blaschke K (1967) Bevoelkerungsgeschichte von Sachsen bis zur industriellen revolution. Weimar, Boehlau

  7. Blaschke K, Baudisch S (2006) Historisches Ortsverzeichnis von Sachsen. Leipziger Universitaetsverlag

  8. Blaschke K, Klasse PH (1998) Atlas zur Geschichte und Landeskunde von Sachsen. Verlag der Saechsischen Akademie der Wissenschaften, Leipzig

  9. Bureau Statistisches (1854) Die Bevoelkerung des Koenigreichs nach Berufs- und Erwerbsclassen und Resultate der Gewerbs-Geographie und Gewerbs-Statistik von Sachsen. Statistische Mittheilungen aus dem Koenigreich Sachsen, 3

  10. Cantoni D, Yuchtman N (2014) Medieval universities, legal institutions and the commercial revolution. Q J Econ 129(2):823–887

  11. De Long JB, Shleifer A (1993) Princes and merchants: city growth before the industrial revolution. J Law Econ 36(2):671–702

  12. DeVries J (1984) European urbanization, 1500–1800. Harvard University Press, Cambridge

  13. Dittmar JE (2011) Information technology and economic change: the impact of the printing press. Q J Econ 126(3):1133–1172

  14. Duranton G (1998) Labor specialization, transport costs, and city size. J Reg Sci 38(4):553–573

  15. Duranton G (1999) Distance, land, and proximity: economic analysis and the evolution of cities. Environ Plan A 31(12):2169–2188

  16. Eeckhout J (2004) Gibrat’s law for (all) cities. Am Econ Rev 94(5):1429–1451

  17. Eeckhout J (2009) Gibrat’s law for (all) cities: reply. Am Econ Rev 99(4):1676–1683

  18. Forberger R (1982) Die industrielle revolution in Sachsen 1800–1861. Steiner Verlag, Stuttgart

  19. Fujita M, Krugman PR, Venables A (1999) The spatial economy: cities, regions and international trade. MIT Press, Cambridge

  20. Gabaix X (1999) Zipf’s law and the growth of cities. Am Econ Rev 89(2):129–132

  21. Gibrat R (1931) Les inegalites economiques. Librairie du Recueil Sirey, Paris

  22. Guerin-Pace F (1995) Rank-size distribution and the process of urban growth. Urban Stud 32(3):551–562

  23. Hahn H-W (1984) Geschichte des Deutschen Zollvereins. Vandenhoeck und Ruprecht, Goettingen

  24. Henderson WO (1984) The Zollverein, vol 3. F. Cass., London

  25. Henderson JV, Nijkamp P, Thisse JF (2004) Handbook of regional and urban economics: cities and geography. Elsevier, Amsterdam

  26. Kiesewetter H (2007) Die Industrialisierung Sachsens: ein regionalvergleichendes Erklaerungsmodell. Franz Steiner Verlag, Stuttgart

  27. Levy M (2009) Gibrat’s law for (all) cities: comment. Am Econ Rev 99(4):1672–1675

  28. Lommatzsch G (1905) Die Einwohnerzahlen der Landgemeinden von 1834 bis 1900 und die Veraenderungen in der Verwaltungseinteilung des Koenigreiches seit 1815. Z Koeniglichen Saec Stat Landesamtes 51(1):12–91

  29. Malecki EJ (1980) Growth and change in the analysis of rank-size distributions: empirical findings. Environ Plan A 12(1):41–52

  30. Martindale D (1958) Prefatory remarks: the theory of the city. Max Weber, The City, pp 9–67

  31. Nunn N, Qian N (2011) The potato’s contribution to population and urbanization: evidence from an historical experiment. Q J Econ 126(2):593–650

  32. Ploeckl F (2010a) Borders, market size and urban growth, the case of Saxon towns and the Zollverein in the 19th century. Institut d’Economia de Barcelona working paper, 966

  33. Ploeckl F (2010b) The Zollverein and the formation of a customs union. Oxford university discussion papers in economic and social history, 84

  34. Ploeckl F (2011) Space, settlements, towns: the influence of geography and market access on settlement distribution and urbanization. Mimeo, Oxford University

  35. Ploeckl F (2012) Endowments and market access; the size of towns in historical perspective: Saxony 1550–1834. Reg Sci Urban Econ 42(4):607–618

  36. Ploeckl F (2015) The Zollverein and the sequence of a customs union. Aust Econ Hist Rev 55(3):277–300

  37. Rosen KT, Resnick M (1980) The size distribution of cities: an examination of the Pareto law and primacy. J Urban Econ 8(2):165–186

  38. Tilly C (1976) Vendee: a sociological analysis of the counter-revolution of 1793. Harvard University Press, Cambridge

  39. Waechter G (1901) Die saechsischen Staedte im 19. Jahrhundert. Z Koeniglichen Saec Stat Landesamtes 47(1):179–232

  40. Weber M (1920) Die Stadt. Mohr, Tübingen

  41. Wrigley EA (1985) Urban growth and agricultural change: England and the continent in the early modern period. J Interdiscip Hist 15(4):683–728

  42. Zipf GK (1949) Human behavior and the principle of least effort. Addison-Wesley Press, Cambridge


I want to thank Bob Allen, Rui Esteves and James Fenske for helpful discussions as well as seminar audiences at Oxford and conference audiences at the Sound Economic History Meeting, as well as the APHES, EEA and EHES meetings.

Author information



Corresponding author

Correspondence to Florian Ploeckl.

Cite this article

Ploeckl, F. Towns (and villages): definitions and implications in a historical setting. Cliometrica 11, 269–287 (2017).

Keywords

  • Towns
  • Villages
  • Geography
  • Definition
  • Classification
  • Town size

JEL Classification

  • N93
  • B49
  • O13
  • R12