1 Introduction

In the latter part of the fourth millennium BCE, the first two states in history arose in Mesopotamia and Egypt. During the subsequent millennia, pristine states would also emerge independently in the Indus Valley, China, Mesoamerica and Peru (Spencer, 2010; Borcan et al., 2018). The transition from small-scale societies to states brought about fundamental transformations such as urbanization, taxation, public architecture, writing, and bureaucratic structures that would drastically alter the conditions of human existence.

A very large literature has dealt with the question of why these original states appeared where and when they did. A less often studied question concerns the economic factors influencing the subsequent early development and consolidation of these states, a development that arguably served as a model for numerous later states in history. All of the early states were characterized by highly intensified agricultural production systems, a (relatively) dense population, and large-scale public monuments. But why did centralized political power prosper for centuries in some places but then stagnate, fragment and even collapse?

In this paper, we focus on the dynamics of early state development in response to exogenous climate shocks. We contribute to the literature on early state development by analysing systematic time series data from historical and archaeological records on political stability and on climate shocks from one of the longest-lasting early states in history: Ancient Egypt. In the spirit of Carneiro (1970), we propose that the more environmentally circumscribed a territory is, the greater the likelihood of the presence and stability of a state in that area. A territory is environmentally circumscribed whenever there is a large productivity gap between its core and the surrounding periphery, so that it is difficult to sustain an alternative livelihood in the latter area. This gap plays a crucial role in state formation and stability: it leads to a significant concentration of population in the core, under the control of the state. The high productivity in the core compared to the periphery makes the option of abandoning the core less attractive despite the tax burden, as living in the less productive surrounding areas offers limited economic opportunities. Our work complements the existing literature, which has often focused on easily taxable crops (typically cereals) as key determinants in state formation (Scott, 2017; Mayshar et al., 2022). We shift the focus towards the population, arguing that access to an “easily taxable population” was equally critical. Additionally, and in contrast to extant papers, we do not study early state formation but early state political stability or consolidation.

We propose a framework of early state consolidation where the individuals of a population can choose between two activities, one that is easily taxable and another one that is difficult to tax. To match our empirical investigation, we interpret this second activity as the possibility of evasion through exiting from the core territory controlled by the state to another territory free from state control.Footnote 1 A key novelty of our approach is that we consider the degree of effective circumscription (or, more broadly, the productivity gap between the taxable and non-taxable activity) to be time-varying, rather than a fixed characteristic. In periods when the area under state control (the core) is substantially more productive than the surrounding territories (the hinterland), circumscription is strong, implying that populations in the area are pulled towards the state in the core. In those periods, the extracting elite can impose high levels of taxation that the population will be forced to accept as exit is not an attractive option. As a result, the tax collection will increase and state capacity will be reinforced. In times when the hinterland instead is relatively productive, an outside option is introduced to the peasant population who will either exit from the state or force a reduction in taxation. Either option will lead to a decrease in the tax collection, which has the potential of destabilizing the state.

Our empirical investigation analyses how asymmetric historical weather shocks in both the core and the hinterland led to changing levels in effective circumscription that, in turn, affected political instability and state capacity in Ancient Egypt during a period of approximately one and a half millennia.Footnote 2 It is widely recognized that the success of the Egyptian state is due to the peculiar geography of the Nile valley (Allen, 1997), characterized by the sharp contrast between the productivity of the Nile banks and that of the surrounding areas. Our choice to focus on ancient Egypt is motivated by three reasons. First, the ancient Egyptian state is characterized by an exceptional level of continuity, which allows us to study its evolution over a long period of time.Footnote 3 Second, the Egyptian civilization has attracted a huge amount of attention from archaeologists and historians and, as a result, historical records (often validated by radiocarbon studies, see below) are much more precise than those pertaining to other civilizations (Shaw, 2000). This allows us to undertake quantitative analysis with a reasonable degree of confidence in our data.

A third reason for choosing to study Ancient Egypt is that changes in the degree of circumscription in Egypt are driven by two largely independent weather systems: the Mediterranean “winter” rainfall and the African monsoon “summer” rainfall in the Ethiopian highlands. The Mediterranean precipitation system provides winter rains to the Nile delta, to the northern parts of the Sinai desert, and to the neighboring lands in the Southern Levant. The amount of rainfall in the hinterland surrounding the Nile valley determines to a large extent the type of activities that can be undertaken and the size of the population that can be sustained there. On the other hand, precipitation in the Ethiopian highlands was the main source of the summer floods of the Nile, which in turn determined the productivity of the Nile banks. The extent of these monsoon rains is determined by the latitudinal positioning of the Intertropical Convergence Zone (ITCZ), a weather system in the Indian Ocean. The further north that the ITCZ reaches, the greater the amount of summer rainfall in the Ethiopian highlands and the greater the seasonal floods of the Nile.

It is widely documented that the extent of Nile inundations and the amount of rainfall in the areas surrounding the Nile valley have experienced large variations over time (Said, 1993). This, together with the fact that the above-mentioned two weather systems are basically uncorrelated (see Fig. D.7), makes the Egyptian territory an ideal case for testing a dynamic circumscription hypothesis. As proxy variables for historical Nile inundations and hinterland rainfall, we introduce two paleoclimatological data sources (intertemporal variations in isotopic composition of cave stalagmites) that have not previously featured in econometric analyses of ancient Egypt and which we validate extensively using other standard sources. Thus, our identification strategy exploits the impact of two exogenous and time-varying orthogonal shocks on the degree of effective circumscription in Egypt.

In order to measure the degree of political stability over time, we use historical records on the continuity of the Egyptian state and tenure durations of individual rulers and dynasties. The notion that a longer ruler tenure indicates a higher level of political stability follows recent work comparing historical developments in the Christian and Muslim worlds (Blaydes & Chaney, 2013). Our state capacity variables capture the fluctuating extent of Egyptian territorial expansion in its hinterland and the intensity of pyramid construction. Our results show that political stability increased in times of high circumscription, that is, in periods of non-extreme Nile floods and dry conditions in the hinterland. The size of both effects is large and significant, which supports the hypothesis that the ability to expropriate output depends not only on producing a large output but also on the ability to retain the population and force it to accept a high tax.

We argue that our paper contributes to the existing literature by focusing on the relatively unexplored question of state consolidation and survival. More specifically, we claim to make three contributions. First, in order to capture the evolution of early state development in a controlled environment, we introduce new data on paleoclimatic shocks, political instability and state capacity for the Egyptian state covering almost two millennia. Second, we test the predictions from our model of dynamic environmental circumscription by exploiting two exogenous time-varying and independent proxies for productivity in the core and in the hinterland. We then show that political stability and strong state capacity are associated with well-behaved Nile floods (i.e., neither too excessive nor too scarce) and high agricultural productivity in the core, and with dry conditions and relatively low land productivity in the hinterland. Third, unlike the literature at large, we explore two specific mechanisms related to state capacity: the area under control of the state and pyramid construction intensity.

Our work is related to several important strands of research in the literature. Our paper is most strongly related to Schönholzer (2020), who studies the importance of environmental circumscription for the emergence of early states in a global cross-section of grid cells. The key source of variation is the difference in land quality for agriculture in a particular cell compared to its neighboring cells.Footnote 4 Schönholzer finds cross-sectional evidence that environmental circumscription is an important determinant of state emergence. Our approach instead focuses on the time variation within a single well-documented environment.

Another important recent contribution is Allen et al. (2023), who exploit random shifts to the course of the Euphrates and the Tigris in order to understand the dynamics of state formation in ancient Mesopotamia. Based on archaeological evidence, the authors find that (city) state formation is more likely to occur following the divergence of a river away from a particular site. On such an occasion, the necessity for coordinated irrigation creates a demand for a state to arise. In contrast, our paper and Schönholzer (2020) might be described as focusing on the rise or consolidation of states as a result of an opportunity to tax a concentrated population.

Scott (2009, 2017) discusses many examples of historical state evasion by marginal populations who prefer to exit from the core rather than being domesticated and dominated by a ruling elite. Further, Mayshar et al. (2017, 2022) emphasize the crucial importance of the transparency and appropriability of the main taxable resource for a strong state to arise. Our conceptual framework integrates these important insights. In addition, our theoretical model shares the assumption of Lagerlöf (2020) that a ruler in an early state can choose between spending tax revenue on own consumption and public goods like defense.Footnote 5 Our work is broadly related to the literature on state and fiscal capacity, pioneered by Besley and Persson (2009) and reviewed by Johnson and Koyama (2017). The key reference on environmental circumscription is Carneiro (1970). Our dynamic extension of the model has some similarities to Olson’s (1993) famous account of “roving vs stationary bandits”, but our interpretation of circumscription is closer to that suggested by Allen (1997), who emphasises the role of circumscription as a social cage in an underpopulated world (Mann, 1986).

A second important tradition is research on the historical impact of climate shocks on human societies.Footnote 6 In recent years, there has further been an explosion of work in the science literature on paleoclimatology that has increased our understanding of how historical weather conditions correlate with serious political instability and state collapse.Footnote 7 However, researchers in this tradition rarely employ standard time series econometrics, as in our study, and neither do they study the interplay between different weather systems in a core and a periphery.

A third central strand of research is the large literature on historical Egypt.Footnote 8 Allen (1997) studies the nature of agricultural production for the rise and consolidation of the Egyptian civilization, as discussed above. Two other highly related contributions on historical Egypt are Chaney (2013) and Manning et al. (2017).Footnote 9 Although our approach shares important similarities with Chaney (2013) in the sense that one of our main explanatory variables is a proxy for Nile floods and his outcome variables also measure political stability, our study differs from the existing literature in important ways. First, Chaney’s (2013) work concerns the Mamluk period in the 12–14th centuries CE whereas Manning et al. (2017) deal with the Ptolemaic Era (332–30 BCE); we contribute by analyzing state development in a much longer and earlier period in Egypt’s history when Egypt was relatively unaffected by neighboring powers. Second, unlike any previous analysis of historical Egypt that we are aware of, in our econometric analysis we rely on paleoclimatic data from natural archives that were unaffected by humans, whereas both Chaney (2013) and Manning et al. (2017) rely on historical, man-made records of Nile floods. Third, as far as we know, our paper offers the first test of a dynamic environmental circumscription model that considers the interplay between the intertemporal variation in hydrological conditions prevailing in a core and a hinterland. A recent paper by Sheisha et al. (2022) presents evidence that the construction of the pyramids at Giza during the 4th dynasty in Ancient Egypt was assisted by water transport on a fluvial channel from the Nile, which benefitted from relatively high water levels during the period.

The paper is structured as follows. In Sect. 2, we describe our conceptual framework and lay out the key testable hypotheses. A formal model capturing these ideas is presented in Appendix A.1. Section 3 gives a historical background, and Sect. 4 describes the main variables employed in the empirical analysis. Section 5 describes the empirical strategy and the empirical specifications while Sect. 6 presents our empirical results. Section 7 concludes the paper. Additional materials, including a literature overview, a full mathematical model, and data validation exercises, are presented in Appendices A to C.

2 Early state consolidation: a conceptual framework

In this section, we outline an intuitive conceptual framework that is intended to give structure to the empirical research design below. For a complete formal model along the same lines, please see Appendix A.1. Our framework is inspired by the work of Carneiro (1970) and Allen (1997), and shares some features with Schönholzer (2020). These papers analyze how the degree of environmental circumscription in a core and hinterland affects the probability of state formation. However, unlike these papers, we focus on the impact of the intertemporal variation in levels of circumscription on the stability and the capacity of the state. In this sense, our emphasis is placed on state consolidation rather than on original state formation.

We start by assuming a given territory with a highly productive core (c) and a less productive hinterland (h). In the case of Egypt, the core might be thought of as the fertile lands directly adjacent to the Nile and the hinterland as the less fertile areas to the east and west. A state has been formed in the core and the distinctive feature of this early state is that its ruling elite (the pharaoh and his enablers) has the capacity to levy taxes on the sedentary farming population residing in the core. The ruling elite also has a capacity to use these tax resources to produce large-scale monuments and defensive fortifications (see below). Taxes cannot be levied on the population in the less accessible hinterland and the elite only has a limited capacity to constrain people from evading the state by exiting to the hinterland.

There is a total population of fixed size that at time \(t\) lives either in the core or in the hinterland. For simplicity, we consider that there are just two time periods, \(t\in\{1,2\}\), and the world ends after period 2. Although artificial, this two-period approach allows us to highlight the main points of interest while avoiding unnecessary complications. In the core, agricultural output is produced, whereas the hinterland is primarily suitable for activities like pastoralism and hunting game. We denote the time-varying (average) land productivity in the core as \(A_{t}^{c}\) and (average) productivity in the hinterland as \(A_{t}^{h}\), where \(A_{t}^{c}>A_{t}^{h}\) always holds for \(t\in\{1,2\}\). Average productivity in both regions depends on annual weather variability so that \(A_{t}^{j}\) will be high (low) whenever weather in \(t\) is good (bad) in \(j\in\{c,h\}\). In Egypt, good weather in the core might be thought of as abundant but not extreme Nile floods that inundated the irrigated fields. Good weather in the hinterland might be thought of as abundant local rainfall that increased biodiversity and the vegetational cover, making it an attractive location for activities such as hunting and nomadism. Weather variability within an area is typically autocorrelated but we assume that variations in the core and in the hinterland are mostly uncorrelated with each other, since they are the outcome of different weather systems. We will discuss these assumptions of our framework in some detail in the empirical section below.

The key feature of our dynamic environmental circumscription model is the interaction between the levels of \(A_{t}^{c}\) and \(A_{t}^{h}\). The degree of environmental circumscription at time \(t\) is simply defined by the ratio \(A_{t}^{c}/A_{t}^{h}>1\). The larger the value of this ratio, the higher the degree of environmental circumscription and vice versa. Thus, all else equal, we propose that if \(A_{1}^{c}\) is high, a greater share of the population will be attracted to settle in the core in the second period, as weather shocks are autocorrelated.Footnote 10 Conversely, when productivity is relatively high in the hinterland, a lower share of the total population will choose to reside in the core in period 2 and will instead relocate to the hinterland.
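As a purely illustrative aid, the ratio defined in this paragraph can be computed from two productivity values. The sketch below is not the formal model of Appendix A.1; the function name `circumscription_ratio` and all numbers are hypothetical and chosen only to show how the measure behaves.

```python
def circumscription_ratio(a_core, a_hinterland):
    """Return the degree of circumscription A^c_t / A^h_t for one period.

    The arguments are hypothetical average land productivities.
    The framework assumes A^c_t > A^h_t > 0, so the ratio exceeds 1.
    """
    if a_hinterland <= 0:
        raise ValueError("hinterland productivity must be positive")
    if a_core <= a_hinterland:
        raise ValueError("the framework assumes A^c_t > A^h_t")
    return a_core / a_hinterland


# Illustrative numbers only (not estimates from the paper).
good_floods_dry_hinterland = circumscription_ratio(a_core=10.0, a_hinterland=2.0)
poor_floods_wet_hinterland = circumscription_ratio(a_core=6.0, a_hinterland=4.0)

# Circumscription is stronger in the first scenario than in the second.
print(good_floods_dry_hinterland > poor_floods_wet_hinterland)
```

Because the measure is a ratio, a shock to either region moves it: a better Nile flood raises the numerator, while a wetter hinterland raises the denominator.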

But why would anyone choose to live in the hinterland when conditions for producing food are always better in the core? The main reason, which is emphasized for instance by Carneiro (1970) and Scott (2009), is the fact that the core is dominated by an elite that extracts taxes from the population there. Such taxes might take the form of corvée labor for public projects or direct taxes on agricultural production. The nature and intensity of taxation typically depended critically on the transparency and appropriability of agricultural output, as emphasized by Mayshar et al. (2017, 2022). In our model of early state development, we simply assume that the elite has the capacity to tax the population in the core at a fixed proportional rate but cannot tax the population in the hinterland. Hence, due to taxation, some parts of the population with a low marginal product might be better off in the hinterland than in the core. It follows that the larger the productivity gap between the core and the hinterland at time 1, the greater the share of the population that will be attracted to the core, and vice versa. At each point in time, there is a part of the population that actively considers whether it is best to remain in the core or to exit to the hinterland.

The assumption above of an exit option in the hinterland is true to Carneiro’s (1970) original model. We recognize, however, that another scenario might also have existed in early state development: that in times of deficient floods in the core and relatively good conditions in the hinterland, some marginal populations might remain based in the core but actively attempt to evade the state by subsisting on non-transparent or non-taxable foodstuffs and by sometimes taking temporary refuge in the hinterland from the government’s tax officers (Scott, 2009). When Nile levels are good and the hinterland very dry, such state evasion is less likely.

We assume that the total amount of taxes collected by the elite depends on two factors: the size of the population producing crops and paying taxes in the core and the weather shocks that determine agricultural productivity in the core. Positive weather shocks at time 1 in the core increase tax revenues in two ways: directly, by immediately increasing the level of taxable agricultural yields in time 1, and indirectly, by attracting a greater population to the core that accepts paying taxes. The latter, indirect mechanism is likely to act more slowly than the immediate productivity improvement. Thus, to capture this delayed effect, we assume that population displacement or revolts happen in the second period. Beneficial weather shocks in the hinterland at \(t=1\) will, ceteris paribus, decrease the tax-paying population in the core at \(t=2\) and, thus, decrease total taxes collected in the second period.
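The direct and delayed channels described in this paragraph can be made concrete with a small numerical sketch. It is a stylized illustration under strong assumptions, not the formal model of Appendix A.1: the tax rate `tau`, the rule in `core_share` (the period-2 core share equals last period's core productivity divided by the sum of both productivities) and all numbers are hypothetical.

```python
def core_share(a_core_prev, a_hinter_prev):
    """Hypothetical rule for the share of the population living in the core.

    The share rises with last period's ratio A^c / A^h. This functional
    form is an assumption made purely for illustration.
    """
    return a_core_prev / (a_core_prev + a_hinter_prev)


def tax_revenue(tau, a_core, core_population):
    """Taxes collected in the core: fixed rate x productivity x taxpayers."""
    return tau * a_core * core_population


def two_period_revenue(tau, population, share_1, a_core_1, a_core_2, a_hinter_1):
    """Return (T_1, T_2), the tax revenue in periods 1 and 2."""
    # Direct effect: period-1 revenue depends on period-1 core productivity.
    t1 = tax_revenue(tau, a_core_1, share_1 * population)
    # Delayed effect: migration responds to period-1 conditions in period 2.
    share_2 = core_share(a_core_1, a_hinter_1)
    t2 = tax_revenue(tau, a_core_2, share_2 * population)
    return t1, t2


# Same core conditions, but a dry versus a wet hinterland in period 1.
dry = two_period_revenue(0.2, 100.0, 0.8, 10.0, 10.0, a_hinter_1=2.0)
wet = two_period_revenue(0.2, 100.0, 0.8, 10.0, 10.0, a_hinter_1=5.0)
print("dry hinterland:", dry)
print("wet hinterland:", wet)
```

In this sketch a beneficial hinterland shock at \(t=1\) leaves period-1 revenue unchanged and lowers period-2 revenue, which mirrors the timing assumed in the text.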

Total tax revenues collected from farmers are used by the ruler for constructing public monuments such as pyramids and for consumption of other goods such as festivals and luxury trade items. In an extended model in Appendix A, we differentiate between elite consumption and investments in coercive/defensive capacity. Here, we simply assume that both types of goods increase state capacity and hence make it more likely for the ruler and his dynasty to remain in power. Conversely, if total tax revenues fall, state capacity will weaken, increasing the risk of internal insurrection or an external attack that might dethrone the ruler and his dynasty.

There is ample support in the literature that economic downturns (that in our framework translate into a reduction in the tax collection) increase the probability of conflict and political instability (e.g., Acemoglu & Robinson, 2006; Burke & Leigh, 2010; Chaney, 2013). Periods of economic stress facilitate the organization of internal revolts, as factions of the elite can take advantage of the weakness of the state (and thus, the smaller size of mobilized defensive resources) and the discontent in the population (due to scarcity and famine) to organize an insurrection and replace the leader. This is the so-called “opportunity cost” effect. An alternative scenario would be one in which the aggressor is primarily interested in robbing the state, in which case the probability of attack and the amount of collected taxes will be positively related (the so-called “rapacity” effect, see Dube & Vargas, 2013).

In the case of Egypt, we argue that a negative correlation between total tax collection and the probability of attack is the most relevant situation.Footnote 11 The pharaoh was believed to be responsible for the Nile floods. Hence, when extreme Nile floods led to famines, he was often directly blamed for them (Bell, 1971). This could have led to internal revolts, whereby factions of the elite could have taken advantage of the discontent in the population and/or of the weakness of the army to organize an insurrection to depose the ruler. In addition, the classical period in ancient Egypt offers an object of study that is as close to an isolated and undisturbed “historical lab” as one can find, where the lack of neighboring powers that could pose a serious threat to the stability of the Egyptian state was the general rule.Footnote 12

The proposed causal mechanism from the framework above is summarized in Fig. 1.

Fig. 1 Hypothesized causal relationships. Note: The figure is a summary of the hypothesized causal relationships, from left to right, in the theoretical framework above. Factors in blue are unobserved empirically.

3 Background on ancient Egypt

This section provides a brief overview of some of the key features of the history, economy, and political organization in the classical period of Ancient Egypt, which is the period that we study in the empirical section. These features also play a central role in our conceptual framework. See Appendix B for a more extensive account.

3.1 Historical overview

The practice of agriculture was widespread in Egypt around 4000 BCE. Shortly afterwards, around 3600 BCE, the Upper Egypt proto-state emerged. Only in Egypt did the formation of the state follow the adoption of farming so quickly. It seems that the process of state formation was facilitated by climate change: the Neolithic wet phase, an unusually wet period in northern Africa, was coming to an end, which implied migrations from the areas surrounding the Nile valley to the valley itself. Figure B.1 in Appendix B.1 shows the aridification process of the areas around the Nile valley from 6000 to 3600 BCE approximately, as reflected by our proxy for rainfall described in Sect. 4. It shows that the creation of the Egyptian state coincides with the driest spell in the last 8000 years.

It is generally believed that Egypt was unified into a full state by King Menes around the year 3000 BCE. Figure B.2 in Appendix B shows a basic chronology of the main historical periods, relying primarily on dates provided by Shaw (2000). The Early Dynastic Period (3000–2686 BCE), with an undetermined chronology of rulers, was superseded by the Old Kingdom-era (2686–2160 BCE), which hosted dynasties 3–8 and a more precise dating of ruler tenure periods. Due to this improved data availability, our study period begins with the onset of the Old Kingdom in 2686 BCE. Construction activity reached its peak during the 4th dynasty (2613–2494 BCE) when enormous pyramids were erected at Giza to celebrate the reigns of different pharaohs.

After the reign of the 6th dynasty, a period of decline set in, culminating in the First Intermediate Period (henceforth IMP I) in 2160–2055 BCE. During intermediate periods, centralized political power broke down and several competing and short-lasting regional or external rulers fought for control of portions of the Nile Valley. Egypt was, however, once again unified during the Middle Kingdom (2055–1650 BCE), during which there was a resurgence in monumental building activity, military expeditions abroad and mining in the Sinai. The Middle Kingdom, too, ended in a period of decline, the Second Intermediate Period (IMP II) in 1650–1550 BCE, characterized by a divided Egypt with the people known as the Hyksos holding power in the north, Egyptian rule at Thebes in the center of the country, and Nubians ruling in the south (Bourriau, 2003).

A final high point in the classical period of Ancient Egypt was the New Kingdom (1550–1069 BCE) when Egypt was a local superpower and extended its direct influence far into the Near East. This era also ended with decline and foreign invasions at the onset of the last millennium BCE in the Third Intermediate Period (IMP III, 1069–664 BCE). Our extended study period ends in 750 BCE.

3.2 Weather shocks and agricultural production

The cornerstone of the ancient Egyptian civilization was its highly efficient system of artificial irrigation agriculture. The system of irrigation required a coordinated labor force and available evidence suggests that this coordination was primarily organized on a local rather than on a centralized level (Butzer, 1976; Said, 1993). The rhythm of this kind of agriculture followed a distinct and highly predictable annual pattern. The Nile floods typically started to appear in July, reached a peak in September, and then receded in October. The inundated basins became fertile fields where farmers first plowed the soil and then sowed crops like barley, wheat and flax. The crops were harvested in February and were often stored in communal granaries. Although the artificial irrigation agriculture in ancient Egypt was highly successful during most years, it was also highly sensitive to disruption. Abnormally high floods might seriously damage the infrastructure of dikes and canals, whereas very low inundations might make fields too dry for cultivation.

The territory of Ancient Egypt was affected by two different precipitation regimes: Mediterranean “winter” rainfall and African monsoon “summer” rainfall in the Ethiopian highlands. The Mediterranean precipitation system provided winter rains to the Nile delta, to the northern parts of the Sinai desert, and to the neighboring lands in the Southern Levant. These rains on their own were not sufficient to allow for sedentary agriculture in northern Egypt but implied that both the Levant and the Sinai were more fertile with a higher population density. In times of relatively abundant rains, an ecological land bridge between Egypt and the Levant emerged that allowed migration between the two areas and economic activity in the Sinai hinterland.

Precipitation in the Ethiopian highlands is the main source of the summer floods of the Nile. During most of the year, the Nile in Egypt (Main Nile) gets most of its water from the White Nile, which drains large areas of Equatorial Africa. Right before the summer floods start in June, the White Nile contributes about 75% of the Main Nile waters. Then, during the summer floods in July–October, almost 100% of the Main Nile’s water originates from the Blue Nile (Hassan, 1981). These floods are, in turn, derived from the monsoon precipitation that falls on the Blue Nile’s drainage area in the Ethiopian highlands, as shown in Fig. 2. The extent of these monsoon rains is determined by the latitudinal positioning of the Intertropical Convergence Zone (ITCZ), a weather system in the Indian Ocean. The further north that the ITCZ reaches, the greater the amount of summer rainfall in the Ethiopian highlands and the greater the seasonal floods of the Nile (see also the presentation of our climate proxies below).

Fig. 2 Intertropical convergence zone (ITCZ). Note: This figure shows the location of Qunf cave, the Nile catchment area in the Ethiopian Highlands, the Mediterranean belt of winter precipitation, and the typical reach of the Intertropical Convergence Zone (ITCZ) during summer months.

Agricultural output relied heavily on the extent of Nile inundations. Deviations from optimal inundation levels had a severe impact on land productivity. Aberrant river levels could occur for two reasons. On the one hand, when inundations were insufficient, the land lacked the much-needed water and nutrients. On the other hand, excessive floods could damage the complex irrigation networks that were essential to ensure efficient distribution of the Nile waters.
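The non-monotonic relationship described in this paragraph, where both insufficient and excessive floods reduce productivity, can be sketched as an inverted-U function of the flood level. The quadratic form, its symmetry around the optimum, and the parameter names `optimal`, `peak` and `sensitivity` are assumptions made for illustration only; the paper does not specify this functional form here.

```python
def core_productivity(flood, optimal=1.0, peak=10.0, sensitivity=20.0):
    """Hypothetical inverted-U mapping from flood level to land productivity.

    Productivity is highest at the optimal flood level and falls with the
    squared deviation from it, in either direction. It is floored at zero.
    """
    deviation = flood - optimal
    return max(0.0, peak - sensitivity * deviation ** 2)


# Flood levels are expressed relative to the optimum (1.0 = optimal).
for flood in (0.6, 0.8, 1.0, 1.2, 1.4):
    print(flood, core_productivity(flood))
```

A specification of this kind is one simple way to capture why periods of "non-extreme" floods, rather than simply high floods, are the favorable state for the core.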

There are many historical accounts of how periods of aberrant Nile inundations led to famines, social unrest, and political instability (Chaney, 2013; Manning et al., 2017). It is generally believed that deficient Nile floods contributed significantly to the social disorder during IMP I–II (Shaw, 2000). See Appendix B for a more detailed account of existing evidence.

An important and well-known characteristic of Nile river inundations (which is also shared by other climate-related variables) is that they are very persistent. More specifically, Nile inundations are characterized by long cycles where prolonged periods of abundant floods are followed by lengthy periods of deficient floods (Hurst, 1951). This highly persistent correlation pattern is important in our argument. A single year of aberrant floods would possibly not be enough to severely weaken state capacity and/or trigger migration pressures. However, the existence of extended periods where copious or scarce floods tend to cluster together could surely have a big impact on the viability and consolidation of the Egyptian state.
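To illustrate why persistence matters for the argument, the sketch below simulates a persistent and a non-persistent series and compares how strongly bad years cluster. A first-order autoregressive process is used here only as a simple stand-in for persistence; it is an assumption, not the long-memory behaviour documented by Hurst (1951) nor the data used in the paper, and the parameter values are hypothetical.

```python
import random


def simulate_ar1(rho, n, seed=0):
    """Simulate x_t = rho * x_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def lag1_autocorrelation(series):
    """Sample autocorrelation of the series at lag 1."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i - 1] - mean) for i in range(1, n))
    return cov / var


def longest_run_below(series, threshold):
    """Length of the longest consecutive spell of values below threshold."""
    longest = current = 0
    for v in series:
        current = current + 1 if v < threshold else 0
        longest = max(longest, current)
    return longest


persistent = simulate_ar1(rho=0.9, n=5000, seed=1)   # flood-like, persistent
independent = simulate_ar1(rho=0.0, n=5000, seed=1)  # no persistence

for name, s in (("persistent", persistent), ("independent", independent)):
    print(name, lag1_autocorrelation(s), longest_run_below(s, 0.0))
```

With persistence, below-average years tend to arrive in long spells rather than in isolation, which is the feature the text singles out as potentially destabilizing.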

3.3 State organization

As discussed above, the surpluses from irrigation agriculture were sufficient to feed a non-producing elite and their servants in Ancient Egypt. The harvested cereals could in turn be stored, saved or redistributed by powerful agents. The emergence of distinct social hierarchies in pre-dynastic times and the subsequent rise of a state were most likely the indirect consequence of a forced redistribution of surplus food resources from the farming population to a ruling elite. As argued by Scott (2017) and Mayshar et al. (2022), the Egyptian environment with fairly predictable and easily observable Nile floods, combined with the superior storability, transferability, and appropriability of cereals, explains to a large extent why Egypt and Mesopotamia could develop highly stratified societies. There are many indications that a system of taxation was present in Egypt soon after unification.

The ruling elite had two main instruments at their disposal in their pursuit of taxes: the measurement of Nile floods and a writing system to keep records. Nile floods were carefully monitored and recorded from early times with the help of “nilometers” around the country that measured the level of the floods in July–October. With this tool, tax collectors could easily approximate the level of available harvests and mete out an appropriate level of taxation from the farming villages along the Nile (Said, 1993; Mayshar et al., 2017). In addition, biannual inventories (ipet) of government resources such as cattle, fields and prisoners of war were conducted and recorded. Taxes or tributes to the pharaoh were then delivered by regional governors to officials of the royal court (Haring, 2010), possibly during royal inspection voyages along the Nile. It appears to have been a standard principle that no rent or tax should be paid on land that received no flood (Eyre, 2010).

From early dynastic times, the ruler had access to corvée laborers who worked at centrally organized production centers where crops, textiles, wine and other necessities were produced by a specialized work force (Moreno Garcia, 2020). The accumulation of tax resources in granaries and temples controlled by the ruler also served as a natural source of attraction to bandits and nomadic peoples in the hinterland who were not integrated in Egyptian society. Such considerations eventually necessitated the permanent presence of a military organization taking orders from the pharaoh. Corvée labor was also used for the construction of the many massive public monuments that were erected to honor the gods and glorify the reigns of pharaohs. Most likely, this labor was conscripted during the flooding season in July–October when no work was possible on the fields. It is not clear exactly on what basis the rulers managed to extract these labor services from the population (Eyre, 2010), but most of the evidence seems to suggest that the laborers at the pyramids, temples, quarries, forts, etc. were conscripted regionally in labor gangs on a temporary or rotating basis (Lehner, 1997; Mumford, 2010).

A key assumption in the theoretical section is that a certain share of the population was mobile and repeatedly faced a choice between staying in the core, which also implied paying taxes to the ruler, and exiting to the hinterland, where there was no taxation and where people subsisted on less productive non-farming activities such as hunting and nomadism. It is well known that there were frequent contacts between Egypt and the Levant during dynastic times. Redford (1992), for instance, refers to how pottery in the southern Levant was clearly influenced by Egyptian culture, possibly through migration. When environmental conditions allowed, population movements along the coastal route from the western Delta to Gaza appear to have been common, and Egyptians carried out maritime trade from early times with Byblos on the Lebanese coast. Priglinger (2019) discusses different kinds of evidence of mobility and migration between Egypt and the Near East, particularly during the Middle Kingdom. At times, large contingents of “Asiatics”, i.e., people descending from the Levant or other parts of the Near East, settled within Egypt. The biblical narrative about a Jewish immigration into Egypt during a famine and subsequent exodus from Egypt offers a well-known myth of an out-migration from the core to the hinterland, although several scholars do not believe this myth has any basis in historical facts (cf. Redford, 1992).

Moreno Garcia (2020) discusses the behavior of non-sedentary groups in ancient Egypt. He notes that seasonal migrations were common and that nomadic groups interacted with sedentary farmers through production and trade networks where temples and other royal centers of power were key nodes. In this system, cities played a modest role, compared to their central role in southern Mesopotamia. Furthermore, archaeological evidence of such highly mobile activities is always limited. Genetic evidence from mummified pharaohs suggests that the population of ancient Egypt was genetically proximate to populations in the Near East (Schuenemann, 2017). In our view, this evidence is well in line with archaeological evidence suggesting a recurring flow of people in the corridor between Egypt and the Levant in periods when the eastern hinterland was relatively fertile.

4 Data and variables

In this section we describe the main dependent and independent variables employed in our empirical exercise: the political stability and state capacity outcome variables on the one hand, and the Nile flood data and the local rainfall variables, which provide the source of exogenous variation in our empirical analysis, on the other. A description of other variables employed in the analysis is provided in Sect. 5. More detailed definitions of all variables, some preliminary analysis, as well as a table of summary statistics are provided in Appendix C.

4.1 Political instability

Our main dependent variable aims to capture political instability over the period 2686–760 BCE, which covers the Old, Middle, and New Kingdoms, as well as the first and second intermediate periods (see Fig. B.2 in Appendix B for a summary of the Egyptian chronology). Consistent with the theory presented in Sect. 2 we consider that a period is politically stable if there exists a single king ruling over a centralised state. In line with this interpretation, a period is considered to be politically unstable whenever there are two or more rulers in a period, either due to ruler replacement or because there is no central rule due to political fragmentation and several kings are ruling simultaneously. Periods of absence of centralised rule were relatively infrequent in Ancient Egypt and, as documented by Shaw (2000), they are characterized by many short-lasting local kings co-existing simultaneously over the Nile (see Sect. 3.1 for a more detailed description).Footnote 13

We code our main dependent variable, pol instability, as follows. First, we divide the sample period (2685–760 BCE) into 5-year intervals.Footnote 14 Second, we assign a value equal to 1 if there is at least one ruler replacement over the 5-year interval (which happens in 20.6% of the periods), and a value equal to 2 if it is a period of no central rule (18% of the observations). The remaining periods (61.4%) are considered politically “stable” and are coded as 0. The choice of these values aims to reflect the fact that the breakdown of centralised power is a more severe instance of political instability than the replacement of a ruler. For the sake of robustness, variations of this basic definition, including a binary version of it, have also been considered (see Table 2).
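The coding rule just described can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function name, its arguments and the example years are hypothetical. It also assumes that a period takes the most severe annual code it contains (consistent with the "maximum value" rule mentioned in Sect. 5.1) and that an incomplete final interval is dropped.

```python
from typing import Iterable, List, Tuple


def code_pol_instability(
    start_bce: int,
    end_bce: int,
    replacement_years_bce: Iterable[int],
    no_central_rule_spans_bce: Iterable[Tuple[int, int]],
    width: int = 5,
) -> List[Tuple[int, int, int]]:
    """Return (period_start_bce, period_end_bce, code) for consecutive intervals.

    Years are BCE, so they count down: an interval starting in 2685 with
    width 5 covers 2685, 2684, 2683, 2682 and 2681.
    Spans are (earliest_year_bce, latest_year_bce), e.g. (2160, 2055).
    """
    replacements = set(replacement_years_bce)
    spans = list(no_central_rule_spans_bce)
    coded = []
    period_start = start_bce
    # Assumption: an incomplete final interval is dropped.
    while period_start - width + 1 >= end_bce:
        annual = []
        for year in range(period_start, period_start - width, -1):
            if any(first >= year >= last for first, last in spans):
                annual.append(2)  # no central rule
            elif year in replacements:
                annual.append(1)  # at least one ruler replacement
            else:
                annual.append(0)  # stable
        # Assumption: the period takes the most severe annual code.
        coded.append((period_start, period_start - width + 1, max(annual)))
        period_start -= width
    return coded
```

With hypothetical inputs, `code_pol_instability(2000, 1981, [1998, 1990], [(1987, 1984)])` yields four periods. The third period contains both a replacement and years without central rule, and it receives the more severe code.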

To code ruler replacements, we use the ruler tenure lists from Shaw (2000), which is generally considered as providing the most accurate chronology for the tenure regimes of Egyptian kings and pharaohs. By carbon-dating museum samples of artefacts associated with particular rulers, Bronk Ramsey et al. (2010) constructed a chronology of ruler periods during 2650–1100 BCE, i.e., during the classical period we study in the paper. When alternative ruler tenure lists were compared, Bronk Ramsey et al. (2010) found that their radiocarbon dates were in strong agreement with Shaw (2000) and referred to it as the ‘consensus chronology’.Footnote 15

The variable pol instability will most likely capture instances of high political instability leading to pharaoh replacement and/or to the breakdown of centralized rule. In spite of this, it also presents important limitations regarding both its measurement and its interpretation. Firstly, although Shaw’s tenure list is widely recognized as the most accurate chronology, there are still a number of dating issues (see Appendix B.4 for a summary) that imply that the exact annual dates provided by Shaw need to be taken with caution. For this reason, and as a way of mitigating measurement error, we have decided to work with 5-year periods rather than with annual observations. For the sake of robustness, alternative period lengths have also been considered, with similar results (see Table 2).

Secondly, it is clear that ruler change is a crude proxy for political instability. A ruler appointed during a period of relative stability might well pass away after a short tenure for natural, random health reasons or because the person took office at an old age. Such a short duration might then spuriously be coded as reflecting political instability in the country.

To address this limitation we consider an alternative indicator of political instability which only codes the most severe cases of ruler replacement: those that also involved the change of the whole dynasty to which the ruler belonged. The replacement of a dynasty in Egypt was frequently the outcome of political turmoil and was typically associated with major political change, often reflected in new agendas regarding foreign policy or in the construction of domestic public monuments (Wilkinson, 2010). Appendices B.4 and C provide additional details about the definition of dynasties and dating issues. We define a new variable, pol instability (dynasties), in a similar way as before: we divide our sample period into 5-year intervals and assign a value equal to 1 if there is a dynastic change in any of the years of the period (around 3% of the observations in the sample), a value of 2 in periods with no centralized power where two or more dynasties coexisted along the Nile river (12% of the observations), and a value of 0 otherwise (84% of the observations).

Given the discrete nature of these variables, in the empirical section we will employ non-linear specifications (ordered logit and logit) estimated by maximum likelihood. For robustness, we will also consider linear ones (see Sect. 5.2 for details).

4.2 State capacity

As our first proxy for state capacity, we use the geographical extent of the Egyptian state, which we take to reflect its military capability. The data come from Geacron (2017) (see Fig. B.3).Footnote 16

As our second proxy for state capacity we consider public monument construction. The largest and best-known type of monumental architecture in Ancient Egypt was the pyramid. Lehner (1997) offers an exhaustive account of all known ancient pyramids in Egypt. The first major pyramid was constructed at Saqqara around 2650 BCE by Djoser during the 3rd dynasty. Pyramid construction reached its historical peak during a few decades in the 2500s BCE with Khufu’s Great Pyramid at Giza, erected around 2570 BCE. Khufu’s limestone pyramid had a height of 146 m and a volume of 2.6 million cubic meters, and its construction demanded a work force of roughly 20,000–30,000 men at a time during at least 20 years. After the 4th dynasty, pyramids were not nearly as large. The last really large pyramid was built around 1810 BCE (Lehner, 1997). After the IMP II, pyramid construction ceased. Appendix B.6 provides additional historical details on pyramid construction, as well as a plot of Lehner’s data.

4.3 Land productivity and weather shocks

To proxy our main independent variables, i.e., the productivities in the core and in the hinterland, \(A_{t}^{c}\) and \(A_{t}^{h}\), we use two sources of data on intertemporal variation in rainfall.

4.3.1 Productivity in the core

Productivity in the core, \(A_{t}^{c}\), is a non-linear function of the extent of Nile floods, as discussed in Sect. 3.2. What would be an ideal direct measure of Nile inundations? Egyptians have recorded Nile floods over hundreds of years with the help of Nilometers (such as the Roda Island Nilometer, used by Chaney, 2013; Manning et al., 2017). Unfortunately, reliable Nilometer records for ancient Egypt do not exist.

Instead, we use an indirect proxy of Nile inundation levels. The source of the Blue Nile (and of the smaller Atbara River) - the “pacemaker” of the late summer Nile floods in Upper and Lower Egypt - lies close to Lake Tana in the Ethiopian Highlands. Figure 2 provides an overview of the geography of the area. Precipitation in this area is a result of the African Monsoon weather system and, more specifically, the location of the Intertropical Convergence Zone (ITCZ). When this zone moves far north, there is more monsoon rain in the Ethiopian Highlands, and vice versa (Said, 1993). The ideal “paleoclimate archive” should thus be located in this region and have observations with a high time resolution far back in time.

The closest natural archive to the Ethiopian Highlands that is subject to similar patterns of monsoon precipitation and that has good resolution is found in Qunf cave in Southern Oman (Fleitmann et al., 2003). Southern Oman is exposed to the same patterns of monsoon precipitation as the Ethiopian Highlands, as reflected in Fig. 2. The time series at Qunf is derived from speleothem data, measuring the oxygen isotope content (\(\delta^{18}\)O) in a stalagmite from the cave. In the scientific literature, \(\delta^{18}\)O levels have been frequently used as proxies for historical rainfall. As explained by Fleitmann et al. (2003, 2007), the data from Qunf cave should provide a reliable indicator for monsoonal precipitation in the northwestern Indian Ocean during the period.

The Qunf series contains 1412 observations from the time period 8608 BCE–1650 CE, with a long break in 760 BCE–638 CE. The time resolution of observations (i.e., years between observations) during the most relevant period 3000–1000 BCE is 1–14 years with an average of 3.85 years. Since the \(\delta^{18}\)O numbers do not have a natural interpretation in our setting, we create a Nile flood index, normalized to the range [0, 10], where higher numbers indicate more monsoon rains, which should, in turn, imply higher levels of Nile inundations. We interpolate for missing years, calculate 5-year averages for the study period in the analysis below, and apply a Butterworth filter in order to smooth out the influence of extreme individual years (see Sect. 5.2 below for a discussion of the filters employed). Appendix C contains further details about the construction of this variable. The smoothed series is shown as the thick blue line in Fig. 5.
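The interpolation, averaging and rescaling steps can be illustrated with a short sketch. It is not the procedure documented in Appendix C. It assumes linear interpolation (the method named in Sect. 4.3.2 for the Soreq data), a simple min-max rescaling, and rescaling applied after averaging. The function names and the `invert` flag, which sets the direction of the index, are hypothetical, and the Butterworth smoothing step is deliberately left out.

```python
from typing import Dict, List, Tuple


def interpolate_annual(obs: Dict[int, float]) -> Dict[int, float]:
    """Linearly interpolate between dated observations, one value per year.

    Keys are years on an increasing axis (e.g. -2685 for 2685 BCE).
    """
    years = sorted(obs)
    annual = {}
    for y0, y1 in zip(years, years[1:]):
        for y in range(y0, y1):
            weight = (y - y0) / (y1 - y0)
            annual[y] = obs[y0] + weight * (obs[y1] - obs[y0])
    annual[years[-1]] = obs[years[-1]]
    return annual


def period_means(
    annual: Dict[int, float], first_year: int, width: int = 5
) -> List[Tuple[int, float]]:
    """Average the annual series over consecutive non-overlapping windows.

    An incomplete final window is dropped.
    """
    last_year = max(annual)
    out = []
    start = first_year
    while start + width - 1 <= last_year:
        window = [annual[y] for y in range(start, start + width)]
        out.append((start, sum(window) / width))
        start += width
    return out


def rescale_0_10(values: List[float], invert: bool = False) -> List[float]:
    """Min-max rescale to [0, 10]; invert=True flips the direction.

    Assumes the values are not all identical.
    """
    lo, hi = min(values), max(values)
    scaled = [10 * (v - lo) / (hi - lo) for v in values]
    return [10 - s for s in scaled] if invert else scaled
```

Whether the raw isotope values need to be inverted so that higher index values mean more rain depends on the sign convention of the proxy, which is why the direction is left as an explicit argument here.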

We have carried out an extensive validation of these data as a proxy for Nile inundations (see Appendix D for the full coverage). Figure 3 below summarizes the information from four of those tests. Panel (a) shows the binscatter correlation between a proxy for lake levels in Lake Tana in the Ethiopian highlands (titanium concentration (Ti) from a sediment core) and our preferred Nile floods proxy (Qunf) over more than 10,000 years. As the graph indicates, the correlation is positive and quite strong except at very high Qunf levels. Panel (b) shows the correlation with accumulated sediment levels in the Nile delta. This correlation is also positive and very strong, as is the case with sediment levels further out in the Mediterranean in (c). Panel (d) shows the association between about a thousand available Nilometer measures from 622 to 1650 CE and our Qunf data. The correlation is clearly positive but the fit is not as tight as in the other graphs.

Fig. 3

Binscatter plots between Nile floods and alternative proxies. Note: The figure shows the fitted binned scatter plot relationship between our Nile floods (Qunf) index and various other climate indicators in the same order as specified in Table D.3

4.3.2 Productivity in the hinterland

To proxy productivity in the hinterland, we use rainfall in the area surrounding the Nile Valley. During wet periods, Mediterranean winter rainfall levels implied a relatively green and habitable hinterland between Egypt and the Southern Levant whereas dry periods gave rise to desertification. Rainy years should also have been associated with greater human activity in the hinterland drylands in general, including mining and pastoralism. To our knowledge, there is no detailed climate archive available from the Sinai, which would be the ideal location for such data. As a proxy for Mediterranean winter rainfall in northern Sinai, we use speleothem data on the \(\delta^{18}\)O isotope in stalagmites from the Soreq cave, about 18 km west of Jerusalem and 60 km northeast of Gaza in contemporary Israel (Bar-Matthews et al., 2003; Bar-Matthews & Ayalon, 2011). Throughout history, Gaza has been the Levantine outpost towards Sinai and Egypt.

The data collected from this cave span a very long period of 200,000 years and have been frequently used in the scientific literature as a reliable benchmark series. The time resolution for the period 2685–760 BCE is not as fine as in the Qunf cave: observations are on average 28.1 years apart, with a maximum of 100 years. For the period after 1140 BCE, the resolution is particularly coarse and we therefore focus on the period 2686–1140 BCE in the time series regressions below. For missing years, we apply the same linear interpolation procedure as with the Qunf data, then calculate 5-year average levels and smooth the resulting series with a Butterworth filter. We also create a similar rainfall index as above, normalized to the range [0, 10].

As opposed to the previous case, hinterland productivity and rainfall are likely to be linked in a monotonic way since this area is particularly dry and rainfall is rarely excessive. Figure 4 provides a validation exercise and shows that there is indeed a strong correspondence between our proxy of rainfall in the hinterland and other available measures for this area. Panel (a) validates our measure by comparing actual annual temperature outside the Soreq cave with isotope data from the stalagmite inside the cave. Panel (b) shows that rainfall during the last 100 years in Israel has a positive association with the much lower levels of precipitation in Egypt. Panel (c) compares two different isotopic measurements from Soreq cave which display a clear negative association, as expected, since lower values on the vertical axis are associated with more vegetation near the cave, which is another correlate of historical precipitation. Lastly, panel (d) compares our Soreq rainfall indicator with a measure for dust intensity in sediment layers in the Mediterranean, serving as a proxy for sandstorms during very dry years. Further details about this validation exercise are presented in Appendix D.

Fig. 4

Rain hinterland versus alternative proxies. Notes: Panel a shows the scatterplot between our Rain hinterland (Soreq) index and observed rainfall outside Soreq cave for available annual observations 2000–2010, whereas panels b–d show the binned scatterplots between rain hinterland and variables in the same order as specified in Table D.4

An assumption in our theoretical model is that the weather systems affecting the core and the hinterland are uncorrelated. The binned scatter plot in Fig. D.7 in Appendix D.2 shows that the correlation between our Nile floods and Rainfall hinterland variables over the time period of our study is indeed close to zero.

Figure 5 summarizes the essence of our empirical analysis. The thick blue line shows Nile floods as proxied by the smoothed Qunf time series, normalized to range between 0 and 10 over the longer term, whereas the green thin line shows the Rainfall hinterland time series using data from Soreq (likewise normalized to 0–10). Also indicated are years of ruler changes, marked as red triangles, and years of dynastic turnover, marked by black circles and a label of the dynasty in question.

Fig. 5

Climate proxies: Nile floods and rain hinterland. Note: The figure shows our normalized filtered proxies of Nile floods (Qunf data, thick blue line) and Rainfall hinterland (Soreq data, green line) on the vertical axis and years BCE on the horizontal axis. Ruler changes are indicated with red triangles and dynastic changes with black circles, accompanied by the number of the dynasty that assumed power. Also shown as grey fields are two intermediate periods (IMP I and IMP II). These periods are displayed as extended disorderly periods when no chronology of rulers is available in Shaw (2000)

Our model suggests that political instability should be most severe during extreme Nile levels, i.e., either during peaks or troughs in the blue line, and when rain in the hinterland is relatively high. Visual inspection suggests that this is indeed the case. For instance, a number of ruler changes, as well as the appearance of the 5th dynasty, happened during an extreme drought episode around 2500 BCE when rainfall in the hinterland was relatively high. In the high-flood episode a few decades later, there were no fewer than five ruler changes, whereas the 5th dynasty was eventually replaced by the 6th during the extreme drought of the 2340s BCE. After the peak of floods in 2180 BCE, the united Old Kingdom collapsed and competing dynasties 7–10 fought over state control. In the Middle Kingdom, the 12th dynasty assumed power in an extreme high-flood year in 1980 BCE. There was also a flurry of ruler changes in the drought years of the 1800s BCE in the run-up to the big disorder of the IMP II from 1775. It is further noteworthy that the two intermediate periods IMP I–II happened shortly after peaks in hinterland rainfall when state destabilizing “push conditions” should have been relatively strong. In addition, both the Nile and the Rain hinterland time series reach a local minimum during the Hyksos invasion in the IMP II.

5 Political instability in ancient Egypt: empirics

This section describes the sample period, the key features of our empirical strategy, the empirical specifications and estimation techniques employed and a brief preliminary analysis of the data.

5.1 Sample

Our empirical analysis examines the evolution of political instability and government capacity in ancient Egypt over the period 2685–1140 BCE, which covers the Old, Middle, and New Kingdoms, as well as the IMP I and II intermediate periods (see Sect. 3.1 for details about the chronology of Ancient Egypt). The unit of analysis in our baseline regressions is the 5-year period. The reason why we aggregate the data in this way is twofold. Firstly, the type of mechanisms we have in mind (i.e., state weakening and increasing migration pressures) are unlikely to follow after a single year of low agricultural productivity. Instead, they are more likely to arise in periods with repeated low harvests. We capture this by considering average Nile behaviour in a given time period (5 years in our baseline specification, but other frequencies have also been considered, see Table 2). Secondly, given the antiquity of our data, the precision in the dating of our variables is an important concern. By considering five- (or, in some regressions, ten-) year periods we can reduce, at least partly, measurement error due to inexact dating. For instance, despite the fact that Shaw’s tenure list is widely recognized as the most accurate chronology, a number of dating issues remain (see Appendix B.4 for a summary), which makes the yearly coding of the dependent variables prone to measurement error. Thus, by considering 5- (or 10-) year periods and defining the dependent variable as the maximum value of any annual ruler change in that period, we increase the probability of capturing actual ruler/dynasty changes within the period. As a result, measurement error is reduced with respect to a yearly specification, as the 5- or 10-year specifications are less demanding in terms of exact dating. This, of course, does not entirely eliminate measurement error, but we hope that it at least mitigates it to some extent.Footnote 17

5.2 Empirical strategy, estimation and model selection

Due to the difficulties in obtaining historical time-varying data on early state formation and consolidation, a good part of the literature investigating early state formation is forced to adopt a cross-sectional approach.Footnote 18 This approach assumes that after conditioning on observables, cross-sectional units are identical and, therefore, all differences in observed conditional outcomes can be attributed to differences in the treatment. A well-known weakness of this approach is its vulnerability to omitted variable bias. Instead, our identification strategy exploits exogenous time-varying shocks to the productivity of the Nile valley as well as its neighboring areas and investigates their effect on political instability and state capacity. By considering just one unit (Ancient Egypt), our approach eliminates the effect of unobservable differences across units, and thus also eliminates a potential source of omitted variable bias. In spite of this, claiming causality in a time series framework has its own difficulties (see Bojinov & Shepard, 2019; Rambachan & Shepard, 2021 for recent expositions) and, in order to do that, one often needs to impose stringent and unrealistic assumptions. Thus, our results are better interpreted as associations rather than as reflecting causal relationships.

As argued by Hsiang (2016), climate events can have both direct and indirect effects on outcomes. Direct effects refer to the contemporaneous impact of weather conditions, whereas indirect effects allude to the impact of climate on individuals’ beliefs, which may affect their decisions and, therefore, the resulting outcomes. This is particularly relevant in our case, as our theory involves both direct and indirect effects of climate. For instance, a bad harvest (or a sequence of them) in the core has a direct effect on the tax collection of that period but also an indirect one on future collection, which operates through the decision of farmers to migrate to or from the core in the following period. This decision, in turn, is made based on beliefs about the evolution of climate. These beliefs might need some time to form and have an impact on outcomes. It is widely assumed that agents facing some weather event for long periods of time will update their beliefs about the weather, whereas individuals facing those events during a short period will not alter their beliefs.

In order to capture both direct and indirect effects, papers using climatic time series often filter the data so that the latter captures the low-frequency component (see Zang et al., 2007; Tol & Wagner, 2010). We also follow that approach here, and apply a low-pass filter to our climate proxies. More specifically, we apply the Butterworth low-pass filter (see Pollock, 2000 for details). Focusing on the low-frequency component of the climate data is important for our purposes as the type of effects we hypothesize (i.e., severe weakening of the state and credible migration threats) are unlikely to follow a single year of aberrant floods; rather, they would be the outcome of a sequence of negative weather shocks. This approach also has a downside. As noted by Hsiang (2016), identification can become more problematic since the use of filtered data typically requires the comparison of populations over long periods of time, which can undermine our main identifying assumption. To cope with this threat, some of our specifications contain time trends as well as period dummies.
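For reference, the squared gain of a digital Butterworth low-pass filter has a standard textbook form. The filter order \(n\) and the cutoff frequency \(\omega_c\) used in our application are not stated in this section, so both symbols are placeholders here:

\[
|H(\omega)|^{2} = \frac{1}{1 + \left( \dfrac{\tan(\omega/2)}{\tan(\omega_c/2)} \right)^{2n}}, \qquad 0 \le \omega \le \pi .
\]

The gain is close to one for frequencies well below \(\omega_c\), equals \(1/2\) in squared terms at \(\omega = \omega_c\), and falls towards zero as \(\omega\) approaches \(\pi\). A larger \(n\) makes the transition around the cutoff sharper, so that slow cycles in the climate proxies pass through while year-to-year noise is attenuated.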

Our main dependent variables measuring political instability are discrete. For this reason, we employ non-linear specifications (ordered logit and logit) estimated by maximum likelihood and, for robustness, also linear ones. Both types of specifications have pros and cons, so by using both we hope to overcome some of their respective limitations. A key characteristic of our data is a high degree of autocorrelation. Linear models provide a more robust and flexible framework to test and correct for autocorrelation in the residual term than non-linear ones. This is an important advantage, as residual autocorrelation can bias the estimators when dynamic time series models are considered. Also, since our models contain squared terms and sometimes interactions as well, interpretation of results is simpler in linear models (Ai & Norton, 2003). Nevertheless, since our dependent variables are often ordinal indices, OLS is also problematic, as it interprets literally the assigned (arbitrary) values defining the different categories. For this reason, in our baseline specifications we use non-linear models but we also report estimates obtained in linear specifications, which will be helpful in interpreting the results and in testing for residual autocorrelation.
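The point about arbitrary category values can be made concrete with a small sketch of how an ordered logit maps a latent index into probabilities for the three categories 0, 1 and 2. It uses the standard parameterisation \(P(y \le j) = \Lambda(\kappa_j - x'\beta)\). The function names and the numerical cutpoints in the usage note are illustrative assumptions, not estimates from our tables.

```python
import math
from typing import List, Sequence


def logistic_cdf(z: float) -> float:
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(index: float, cutpoints: Sequence[float]) -> List[float]:
    """Return P(y = j) for j = 0..len(cutpoints), given the latent index x'beta.

    Uses P(y <= j) = logistic_cdf(cutpoints[j] - index); cutpoints must be
    strictly increasing.
    """
    cumulative = [logistic_cdf(c - index) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for lower, upper in zip(cumulative, cumulative[1:]):
        probs.append(upper - lower)
    return probs
```

For example, `ordered_logit_probs(0.0, [0.0, 1.0])` returns three probabilities that sum to one. Raising the index shifts probability mass towards the highest category. Only the ordering of the categories matters here, not the numerical labels 0, 1 and 2.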

To capture dynamic effects, we incorporate lags of both the dependent and independent variables in the model. Models containing lagged values of the dependent variable should be “dynamically complete” (i.e., the errors should be serially uncorrelated). Otherwise, autocorrelated errors and the lagged dependent variable may become correlated, leading to endogeneity. Testing for residual autocorrelation in non-linear specifications is challenging. To achieve dynamically complete models, we undertake two steps. First, we sequentially introduce lags into the model and test the significance of additional lags, following the approach suggested by Wooldridge (2006, Chapter 15). Second, we re-estimate the resulting model using a linear specification and check if the residuals are serially uncorrelated. We report the Cumby-Huizinga test for first-order autocorrelation for each of these linear models.

We also allow for a number of controls. In the first table we use the Bayesian information criterion (BIC) to choose a model specification for them and, to facilitate comparability across equations, we maintain that specification in all subsequent tables.Footnote 19 More specifically, for each control we consider the variable in levels, its square and 1 lag of these variables, see Table 1 for details.
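As a reminder of how such a comparison works, the sketch below computes the BIC in its usual form, \(k \ln n - 2 \ln L\), and picks the specification with the lowest value. The helper names and the log-likelihood values in the usage note are hypothetical. A comparison of this kind is only meaningful when all candidate models are fitted on the same sample.

```python
import math
from typing import Dict, Tuple


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Schwarz criterion: k * ln(n) - 2 * ln(L); lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def preferred_model(fits: Dict[str, Tuple[float, int]], n_obs: int) -> str:
    """fits maps a label to (log_likelihood, n_params); return the lowest-BIC label."""
    return min(fits, key=lambda name: bic(fits[name][0], fits[name][1], n_obs))
```

With made-up inputs such as `{"lag1": (-250.0, 8), "lag2": (-247.0, 8), "both": (-246.8, 9)}` and 300 observations, the function returns `"lag2"`: adding a ninth parameter raises the likelihood too little to offset the penalty.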

Table C.2 in Appendix C presents the results of unit root tests applied on our dependent variables. We are able to reject the null hypothesis of a unit root in all of them, except for the case of area, the log of the area under state control. For this reason, in the empirical analysis we model this variable in first differences, see Sect. 6.2.

6 Results

This section describes our results. Section 6.1 presents the main analysis, which focuses on the relationship between weather shocks and political instability. Section 6.2 provides additional evidence by analyzing the relationship between proxies of state capacity (in particular, the area under state control) and weather shocks. Variable definitions and summary statistics are provided in Appendix C.

6.1 Political instability

Tables 1 and 2 present our main results relating political instability to the proxies for weather conditions in the Nile and surrounding areas. Robust standard errors are reported in brackets except when linear specifications are considered (Column 8), for which HAC standard errors are reported.

Table 1 Political stability and climate shocks

Table 1 focuses on pol instability. Ordered logit models are estimated in all columns except in Column 8, where a linear specification is employed. In addition to estimated coefficients and their standard errors, Table 1 reports the model BIC and the p-value of the Cumby–Huizinga test of residual autocorrelation (computed on a linear specification of the model). The null hypothesis of this test is the lack of first-order autocorrelation in the residuals, so large p-values are associated with uncorrelated residuals.

Column 1 regresses pol instability on nile floods and lags of the dependent variable.Footnote 20 The coefficient of nile floods is negative but not statistically different from zero. Column 2 allows for a non-linear relationship between ruler instability and nile floods by introducing the square of the latter variable.Footnote 21 Allowing for a non-linear relationship delivers more precise estimates, and now both nile floods and its square are significant (at the 10% level). While the former keeps the negative sign, the square term is positive, suggesting that ruler instability is lowest when Nile floods are not extreme. Extreme floods can be harmful either because they are too low, in which case periods of drought would follow, or because they are too high, as excessive floods could destroy the irrigation infrastructure.Footnote 22

In Column 3, we introduce tenure\(_{t}\), measuring the number of years from the ruler’s first year in office to the first year of the current period. We also add its square, to account for the possibility that ruler replacement becomes more likely as the ruler gets old, and 1 lag of these variables, as dictated by the BIC. These control variables are significant, while the coefficients of the Nile variables are similar to those in the previous column, although the estimation is somewhat noisier, resulting in slightly larger standard errors.

Column 4 introduces the contemporaneous value of rain hinterland\(_t\), our proxy for rainfall in the areas surrounding the Nile valley. Its associated coefficient is positive, indicating that more rain in the surrounding areas tends to increase political instability, although it is not significant (the p-value is 0.12). Our theory suggests that the main impact of rain hinterland on political instability is through (future) exit pressures, which implies that the main effect will be driven by lagged values of rain hinterland. To examine this prediction, Column 5 replaces rain hinterland\(_t\) by rain hinterland\(_{t-1}\). This leads to a decrease in the BIC, meaning that under this criterion this model is preferred. The coefficient of rain hinterland\(_{t-1}\) is positive and significant, which suggests that an improvement of weather conditions in the neighboring areas increased ruler instability, although the effect needed some time to become apparent. Column 6 considers rain hinterland\(_{t-2}\) in place of rain hinterland\(_{t-1}\). The coefficient associated with rainfall in the hinterland increases and its significance improves, while the BIC is smaller, which again suggests that using rain hinterland\(_{t-2}\) improves the fit of the model. Column 7 introduces both rain hinterland\(_{t-1}\) and rain hinterland\(_{t-2}\) in the regression simultaneously. Only the latter is significant and, since introducing both variables increases the BIC considerably, from now on we capture the effect of weather conditions in the hinterland by using rain hinterland\(_{t-2}\). Column 8 re-estimates Column 6 by OLS, showing very similar results.

Column 9 introduces dummies for the different periods into which Egyptian chronology is typically divided (Old, Middle and New Kingdoms as well as the first and second intermediate periods).Footnote 23 As the probability of transition from one period to another is likely to be affected by climate changes, these variables can be considered as “bad controls”. However, by including them in the regression we can check whether the results in Table 1 are driven by the highly unstable first and second intermediate periods or whether they still hold when only within-period variation is considered. Introducing period dummies does not change our conclusions. Finally, Column 10 introduces deterministic time trends (a quadratic polynomial of time) to capture systematic trends in the data, obtaining similar results.

Summarizing, the models in Table 1 explain a significant fraction of the variance of pol instability (that reaches 75% when linear models are considered), and support our hypothesis that political stability in Ancient Egypt was higher in times of relatively high circumscription, that is, whenever Nile floods were favourable and when the weather conditions in the neighboring areas were rough.

Table 2 considers additional variations to probe into the robustness of the results in Table 1. Ordered logit specifications are employed unless otherwise noted, and the same controls as in our baseline specification (Column 6 in Table 1) are considered in all columns, but the value of the associated coefficients is not reported to save space.

Table 2 Political instability and climate shocks: variations

Column 1 is identical to Column 6 in Table 1 but captures the non-linear relationship between Nile floods and ruler instability by defining two dummy variables, nile extreme flood and nile extreme drought, that are equal to 1 when the corresponding observation of nile floods is in the upper or bottom 5% of the distribution of nile floods, respectively. The coefficient of nile extreme drought is positive and significant while that of nile extreme flood is close to zero and insignificant, suggesting that droughts had a more severe impact on political instability than extreme floods. Column 2 considers whether there is an interaction effect between the two weather shocks. To examine whether the effect of rain hinterland is heterogeneous depending on whether Nile floods are extreme or not, we define a new dummy, nile extreme, that is 1 whenever Nile floods are extremely low or extremely high (i.e., whenever either nile extreme drought or nile extreme flood is equal to 1). Column 2 shows that both nile extreme\(_t\) and rain hinterland\(_{t-2}\) are positive and significant, as expected, but the interaction of the two is negative and also significant. As interpreting interaction terms in non-linear models is not as direct as in linear ones, Column 3 re-estimates Column 2 by OLS, obtaining very similar results. Since nile extreme is a dummy, a negative and significant value of the interaction term implies that the slope of rain hinterland\(_{t-2}\) is flatter whenever the behavior of the Nile was extreme. In fact, we cannot reject that this slope is equal to zero whenever nile extreme is one.Footnote 24 This suggests that Nile floods had a first-order effect on ruler instability and that, whenever floods were extreme, instability was more likely to follow, irrespective of the conditions in the hinterland. These conditions, however, became important in “normal” Nile periods (where “normal” here refers to the central 90% of Nile flood values), when the state was not under the effect of such extreme Nile shocks. This result is reasonable, since it is well known that Egypt’s economy hinged heavily on the extent of Nile floods and, whenever those severely failed, famine and unrest followed.
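The construction of the extreme-flood dummies described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the function name, the use of strict inequalities at the 5th and 95th percentile cut points, and the flood values are hypothetical, and the percentile convention used in the paper may differ.

```python
import statistics


def extreme_dummies(floods):
    """Build drought, flood and combined 'extreme' dummies from a flood index.

    An observation is flagged as a drought (flood) when it lies below
    (above) the 5th (95th) percentile of the sample distribution.
    """
    # n=20 yields 19 cut points; the first and last are the 5th and 95th percentiles.
    cuts = statistics.quantiles(floods, n=20)
    low, high = cuts[0], cuts[-1]
    drought = [1 if x < low else 0 for x in floods]
    flood = [1 if x > high else 0 for x in floods]
    # 'extreme' is 1 whenever either dummy is 1; the rest are "normal" periods.
    extreme = [max(d, f) for d, f in zip(drought, flood)]
    return drought, flood, extreme


# Hypothetical flood index with 100 observations.
floods = [float(v) for v in range(1, 101)]
drought, flood, extreme = extreme_dummies(floods)
n_normal = len(floods) - sum(extreme)
```

The observations with `extreme` equal to 0 correspond to the central 90% of flood values, the "normal" Nile periods in which hinterland conditions matter.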

The remaining columns in Table 2 provide additional variations. Column 4 uses an alternative procedure to filter the weather data. In particular, the Baxter-King filter is employed in a specification otherwise identical to that in Column 6, Table 1. Column 5 considers a longer time period than in our baseline regressions, more specifically from 2686 to 760 BCE.Footnote 25 Column 6 considers a binary dependent variable, pol instability(dummy), and a logit specification is employed. Column 7 re-estimates Column 6 using a linear specification estimated by OLS. Finally, Columns 8 and 9 replicate our baseline specification (Column 6 in Table 1) using different units of analysis, namely 10-year and 1-year periods, respectively. In general, the results in Table 2 provide support for the circumscription hypothesis, i.e., favorable weather conditions in the core, together with rough ones in the hinterland, contributed to higher political stability in Ancient Egypt.

Figure 6 reports the marginal effect of nile floods on pol instability, which has been computed using the coefficients in Column 7 from Table 2 (corresponding, for simplicity, to the case where the dependent variable is binary and the equation has been estimated by OLS). It shows that increases in nile floods have a negative and statistically significant effect on political instability for moderate values of nile floods, but the marginal effect becomes positive for large values of nile floods.
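In a linear specification with nile floods and its square, the marginal effect of floods is the linear coefficient plus twice the quadratic coefficient times the flood level. The sketch below illustrates this calculation. The coefficient values are hypothetical placeholders, not the estimates from Table 2, and the function names are introduced here only for illustration.

```python
def marginal_effect(beta_linear, beta_square, flood):
    """Derivative of beta_linear * x + beta_square * x**2 with respect to x."""
    return beta_linear + 2.0 * beta_square * flood


def turning_point(beta_linear, beta_square):
    """Flood level at which the marginal effect changes sign."""
    return -beta_linear / (2.0 * beta_square)


# Hypothetical coefficients with the signs reported in the text:
# negative on nile floods, positive on its square.
b1, b2 = -0.10, 0.05
effect_moderate = marginal_effect(b1, b2, flood=0.0)  # negative
effect_large = marginal_effect(b1, b2, flood=2.0)     # positive
x_star = turning_point(b1, b2)
```

With a negative linear term and a positive quadratic term, instability falls with flood levels below `x_star` and rises above it, which is the U-shaped pattern plotted in the figure.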

Fig. 6 Marginal effect of Nile floods on Pol Instability. Notes: This graph plots the marginal effect of nile floods on pol instability (dummy). Estimates correspond to Table 2, Column 7

Tables 1 and 2 show that political instability is associated with reductions in the degree of circumscription in Ancient Egypt, and that this conclusion is robust to a number of variations. A common concern associated with these tables, however, is the fact that rulers can be replaced for reasons that do not necessarily involve social unrest and political instability, such as natural death or disease. To address this concern, Table 3 considers an alternative proxy of political instability, pol instability (dynasties), which only considers those ruler replacements that also led to the removal of a dynasty. More specifically, pol instability (dynasties) is equal to 1 if there is a dynasty replacement in period t and equals 2 in periods where there are two or more dynasties in power, as those periods reflect lack of central rule and a high degree of political instability. Otherwise, it takes a value equal to zero.
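The coding rule for pol instability (dynasties) can be written as a small function. This is a sketch under one explicit assumption: when a period has both a dynasty replacement and two or more dynasties in power, the value 2 takes precedence, which the text does not state. The function and argument names are hypothetical.

```python
def pol_instability_dynasties(dynasty_replaced, n_dynasties_in_power):
    """Code a period as 0, 1 or 2 following the definition in the text.

    2: two or more dynasties in power (lack of central rule).
    1: a dynasty replacement occurred in the period.
    0: otherwise.
    Assumption: the value 2 takes precedence when both conditions hold.
    """
    if n_dynasties_in_power >= 2:
        return 2
    if dynasty_replaced:
        return 1
    return 0


# Hypothetical periods: stable rule, a replacement, and competing dynasties.
stable = pol_instability_dynasties(False, 1)
replaced = pol_instability_dynasties(True, 1)
fragmented = pol_instability_dynasties(False, 3)
```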

Table 3 Political instability (dynasties) and climate shocks

This variable presents some advantages with respect to pol instability, the most important one being that the fall of a ruler and a dynasty is likely to be the consequence of highly turbulent events and, thus, it could be a better proxy for political instability. On the negative side, however, the classification of rulers into dynasties dates back to Manetho’s Aegyptiaca (3rd century BCE) and the criteria used to compile that list are not always transparent, as explained in more detail in Appendix B.4. With this limitation in mind, we have explored whether our climate proxies have any power in predicting dynasty replacement.

Table 3 presents the results, obtained in a set-up otherwise very similar to that in the previous tables. In particular, ordered logit models are estimated unless otherwise stated. Column 1 in Table 3 regresses pol instability (dynasties) on nile floods and lags of the dependent variable. As in previous columns, two lags of the dependent variable were enough to avoid residual autocorrelation. This column shows that there is not a statistically significant (linear) relationship between the two variables. However, the results change if the square of nile floods is added to the former specification. Column 2 shows that too low or too high Nile floods are associated with higher dynastic instability.Footnote 26 Column 3 introduces tenure dynasty, which measures the number of years a dynasty has been in office up to the first year of the current time period.Footnote 27 As in the case of tenure, we also allow for a squared term of tenure dynasty to capture potential nonlinear effects, as well as 1 lag of these variables (whose coefficients are not reported to save space). The BIC decreases after introducing these controls, suggesting an improvement in model fit, but otherwise the results are unaffected.

Column 4 introduces rain hinterland\(_{t-2}\) in the model.Footnote 28 The coefficient of this variable is positive and statistically significant. Column 5 captures aberrant Nile floods by employing two dummies reflecting extreme floods and droughts. Both variables are positive and significant in the specification, suggesting that serious political instability was more likely in periods of extreme Nile behavior. Column 6 re-estimates Column 4 by OLS in a linear specification. Similar results are found but the coefficient associated with rain hinterland\(_{t-2}\) is now smaller and not significantly different from zero. Column 7 introduces additional controls, more specifically tenure and its square, as well as 1 lag of these variables, as in Tables 1 and 2, which does not alter the conclusions from previous columns. Column 8 adds period dummies, which slightly increase the p-values but have little impact on the estimated coefficients. Finally, Columns 9 and 10 employ a binary dependent variable that equals 1 whenever there is a dynasty replacement or whenever there were two or more dynasties in power. Column 9 considers a logit specification and obtains very similar results. Column 10 re-estimates the same model using a linear model. Similar results are also found, except for the coefficient of rain hinterland\(_{t-2}\), which is now smaller and becomes insignificant.

Summarizing, the results in Table 3 show that political stability in Ancient Egypt is associated with high productivity in the Nile valley together with low productivity in the nearby areas. If anything, weather shocks seem to be more strongly linked to pol instability (dynasties) than to the previous variables, suggesting that they are associated with more severe forms of political instability. The following section presents additional evidence supporting our claim that periods of high circumscription helped the consolidation of the state.

6.2 Additional evidence: state capacity

The theory outlined in Sect. 2 (and Appendix A.1) posits that periods of high circumscription are associated with a higher tax collection and a greater state capacity. This section explores the impact of changes in circumscription on our proxies for the latter.

Our main proxy for the strength of the state is the area under state control. We consider the variable area, which is the log of the state-controlled area (in km\(^2\)).Footnote 29 Although admittedly this variable is likely to be a rough proxy for the true area, Fig. B.3 shows that the area under state control varied widely over time. The largest surface was achieved during the New Kingdom, when the state controlled territories as far as present-day Syria. On the other hand, the minimum area was reached during the intermediate periods, when central authority collapsed. In our view, conquered areas were primarily used for resource extraction and hence served to increase state consumption.

Our main hypothesis is that good harvest years should lead to an increase in the size of the area under the control of the state. In addition, in line with the circumscription hypothesis, wetter conditions in the surrounding areas would weaken the power of the state in the medium-run, leading therefore to a reduction in the area under the state’s control.

Table 4 examines these predictions. As mentioned in subsection C.3, we could not reject the null hypothesis of a unit root in area (see Table C.2 in Appendix C). Therefore, Table 4 mainly focuses on the first difference of the log of the area under state control, denoted by area (growth), which can be interpreted as the area’s growth rate.Footnote 30 Column 1 regresses area (growth) on nile floods and its square as well as on similar controls as in previous tables.Footnote 31 No lags of the dependent variable have been introduced, as the residuals do not exhibit autocorrelation and the BIC increases after introducing them. The coefficient of nile floods is positive while that of nile floods\(^2\) is negative, suggesting that area (growth) is positively associated with intermediate levels of Nile floods.
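The first difference of the log of a series approximates its growth rate, which is why area (growth) can be read in this way. A minimal sketch of the transformation follows, using hypothetical area values in km\(^2\) that are not taken from the data.

```python
import math


def log_growth(areas):
    """First difference of the log of a positive series.

    Each element equals ln(A_t) - ln(A_{t-1}), which is close to the
    proportional change (A_t - A_{t-1}) / A_{t-1} when changes are small.
    """
    logs = [math.log(a) for a in areas]
    return [logs[t] - logs[t - 1] for t in range(1, len(logs))]


# Hypothetical state-controlled areas in km^2: a 5% expansion, then a 10% loss.
areas = [100000.0, 105000.0, 94500.0]
growth = log_growth(areas)
```

Note that the differenced series has one observation fewer than the original series.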

Table 4 Area controlled by the state and climate shocks

Column 2 introduces the contemporaneous value of rainfall in the hinterland. It has a negative and significant coefficient, implying that more rainfall in the areas surrounding the Nile is associated with a decrease in area (growth). Columns 3 and 4 lag rain hinterland by 1 and 2 periods, respectively. The size of the coefficient tends to increase (moderately) in absolute value with the lag and the value of the BIC decreases, suggesting a similar behavior as in previous tables. Column 5 considers period dummies and the results are similar. Finally, Column 6 regresses the variable in levels, i.e., area, on the same variables as Column 4 and a lag of the dependent variable (needed to capture autocorrelation in the residual term), obtaining very similar results.

Our second proxy for state capacity captures the intensity of public monument construction activity in the form of pyramids. Recent research on Ancient Egypt suggests that water transportation of building materials along special channels, derived from the Nile during inundations, was essential for the construction of the largest pyramids at Giza (Sheisha et al., 2022). These findings imply that the construction of at least some pyramids depended on the presence of relatively high flood levels compared to more recent times. A complementary hypothesis is that in periods when the tax revenue was large, more resources could be devoted to pyramid construction. To examine these hypotheses we use data on pyramid construction from Lehner (1997), who provides information on the total volume in cubic meters constructed over the lifetime of a pharaoh. We define the variable pyramids as the log of total pyramid construction over the lifetime of a pharaoh. An obvious drawback of this variable is that there are only 39 pharaohs up to IMP II, too few data points for a proper statistical analysis. Any conclusion drawn from these data therefore has to be taken with caution.

In spite of these limitations, we have regressed pyramids on the average over the lifetime of the pharaoh of nile floods and its square and rain hinterland, controlling for the number of years that the pharaoh was in power. As in previous sections, we find that pyramid construction is largest when Nile floods are not too high or too low (the coefficient of nile floods is positive and that of nile floods\(^2\) is negative, and both are significant at the 10% level).Footnote 32 Panel (a) in Fig. 7 shows the non-linear relationship linking pyramid construction and Nile behavior. This result is in line with Sheisha et al. (2022) who show that pyramid construction at Giza was at a peak during moderate levels of Nile floods.

Fig. 7 Binscatter plots between Nile floods and alternative proxies. Note: The figure shows the fitted binned scatter plot relationship between pyramids, our Nile floods index (a) and the rain in the hinterland index (b), where both indices have been averaged over the lifetime of each pharaoh up to the IMP II

However, in contrast to the previous analyses, we find that the relationship between rain hinterland and pyramids seems to be positive, as Panel (b) in Fig. 7 shows. In the regression above, the coefficient of rain hinterland is positive and significant at the 10% level.Footnote 33 A potential explanation for this is that the materials employed to build the pyramids (limestone, granite and minerals) were often found in the desert surrounding the Nile valley. Thus, favorable contemporaneous weather conditions in those areas could facilitate mining expeditions and, in turn, have a positive effect on pyramid construction.

In summary, the results in this section are broadly consistent with the importance of the circumscription hypothesis as they show that the area under state control and pyramid construction deteriorated in years of aberrant Nile floods (particularly so when periods were abnormally dry). The area under state control also deteriorated in times following wet conditions in the hinterland. The linear positive effects of rainfall in the hinterland on pyramid construction were possibly due to the fact that favorable weather conditions in the surrounding areas could facilitate mining expeditions which provided the key elements for construction.

7 Conclusions

In this paper we have outlined a dynamic model of environmental circumscription in which productivity shocks in a core and a hinterland affect the effective level of circumscription for a farming population. In periods of beneficial weather in the core and poor weather in the hinterland, people are pulled towards the core, implying a greater population to tax, higher revenues and political stability. When the reverse situation holds, people are pushed towards the hinterland or switch to non-transparent food subsistence activities, tax revenues fall, and the ruling elite is more likely to be overthrown.

The predictions from the model are then applied to an empirical investigation of weather shocks and political stability in Ancient Egypt during 2686–1140 BCE. Developing novel paleoclimatic proxies for Nile floods and hinterland rainfall, we show that political instability is highest when floods are either extremely high or extremely low. With a time lag, favorable conditions in the hinterland will cause political instability in the core, as predicted by the model.

In future research, we would like to explore whether our dynamic environmental circumscription model has external validity for other time periods and settings. For instance, it would be interesting to investigate further whether rainfall shocks might explain the sudden disappearance of the Maya culture on the Yucatan peninsula around 900 CE. Is it even possible that lessons from such historical cycles might have some bearing on the risk of political collapse in our contemporary societies? We hope to address these issues in future work.