Elite violence and elite numeracy in Europe from 500 to 1900 CE: roots of the divergence

Our research expands earlier studies on elite human capital by widening the geographic scope and tracing the early roots of the European divergence. We present new evidence of elite numeracy in Europe since the sixth century CE. During the early medieval period, Western Europe had no advantage over the east, but the development of relative violence levels changed this. After implementing an instrumental variable strategy and a battery of robustness tests, we find a substantial relationship between elite numeracy and elite violence, and conclude that violence had a detrimental impact on human capital formation. For example, the disparities in violence between Eastern and Western Europe helped to shape the famous divergence movement via this elite numeracy mechanism and had substantial implications for the economic fortunes of each region over the following centuries.


Introduction
In this study, we assess the joint evolution of elite violence and elite numeracy across Europe (including Asia Minor and the Caucasus) over 1400 years. New evidence on elite numeracy is presented for the first time, allowing the long-term relationship between elite violence and elite numeracy to be examined. This study uses a variety of econometric techniques, from panel and first-difference regressions to spatial methods and instrumental variable estimation. We find that declines in violence determined growth in elite numeracy in certain European countries since the medieval period, such as England and the Netherlands. Conversely, higher levels of elite violence corresponded to lower elite numeracy in Eastern and South-Eastern European countries, for example, leading to Europe's famous divergence movement (van Zanden 2009; De Pleijt and van Zanden 2016; Broadberry 2013).
Additionally, we contribute to a modestly sized but growing literature on elite numeracy. To demonstrate that the upper tail of the knowledge distribution mattered for growth, Squicciarini and Voigtländer (2015) use the example of the industrial revolution in France. Inspired engineers and bold entrepreneurs were able to establish firms using recently developed technologies and subsequently developed various technologies further. Baten and van Zanden (2008) studied advanced human capital using book consumption and drew parallels with growth in the sixteenth century, when several European countries managed to set up growth-promoting institutions due to human capital. This resulted in a system of trading cities and merchants who coordinated world trade as far back as the sixteenth century.
In general, the debate around explanations for the Great Divergence, which saw Western Europe become the world's chief economic force during the modern era, has produced advocates for geography, institutional design, gender equality, human capital and a host of other explanatory factors as key elements of Western Europe's ascent (Bosker et al. 2013; Allen 2001; Diebolt et al. 2017; Diebolt and Perrin 2013; Broadberry 2013). In contrast, the role of violence has not received much attention, except in the studies on the 'war generates states' hypothesis, which goes back to Tilly et al. (1975): while many influential studies traditionally focused on the strong state as an obstacle to development (Acemoglu et al. 2005), a recent strand of the literature picked up the Tilly et al. hypothesis, arguing that the experiences of war and conflict allowed tax capacities to develop, most notably during the Hundred Years' War in France, which stimulated innovations in tax collection to finance standing armies (North 2000; Hoffman 2012). A wider set of related studies focused on war as the basis of a state's capacity to tax, arguing that after wars generated taxation states, the resulting state capacity subsequently allowed for more stable development (see, for example, Dincecco 2015; O'Brien 2011; Hoffman 2015). This set of hypotheses is clearly a point of departure for our study.
Our strategy for approaching these questions relies on proxy indicators, as standard indicators of violence and human capital are not available for early periods of European history. Hence, we establish a new indicator that is able to trace the development of elite numeracy over the very long term: the share of rulers for whom a birth year is reported in conventional biographical sources. We reason that a ruler's birth year was regularly reported and entered into historical chronologies only if the elite bureaucracies around those rulers were capable of processing numerical information with ease; otherwise, birth years were simply forgotten and left unrecorded. Below, we discuss a number of potential biases and reason that they do not invalidate our proxy indicator for elite numeracy. We also report correlations with other indicators of elite numeracy in medieval societies for which both metrics were simultaneously available in the same location. highly numerate, but economic elites were not. However, these social groups were usually highly connected (Mokyr 2005).
As more traditional indicators of education such as literacy rates, school enrolment or age heaping-based numeracy are not available for most medieval European countries, the 'known ruler birth year' proxy allows us to trace elite numeracy in periods and world regions for which no other indicators currently exist.
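As an illustration, the proxy can be computed as a simple share per country and period. The following is a minimal sketch rather than the code used for this study: the record layout (`country`, `period`, `birth_year`), the function name `birth_known_rate` and the toy records are hypothetical, and applying a minimum of ten rulers per cell as a general filter is an assumption borrowed from the threshold we mention for the validity comparisons.

```python
from collections import defaultdict


def birth_known_rate(rulers, min_rulers=10):
    """Share (in %) of rulers with a recorded birth year per (country, period).

    `rulers` is an iterable of dicts with the hypothetical keys
    'country', 'period' and 'birth_year' (None if unknown).
    Cells with fewer than `min_rulers` rulers are dropped (assumed threshold).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [known, total]
    for ruler in rulers:
        cell = (ruler["country"], ruler["period"])
        counts[cell][1] += 1
        if ruler["birth_year"] is not None:
            counts[cell][0] += 1
    return {
        cell: 100.0 * known / total
        for cell, (known, total) in counts.items()
        if total >= min_rulers
    }


# Hypothetical toy records: ten rulers in 'xx' (six with a known birth year)
# and three rulers in 'yy' (none known, so the cell falls below the threshold).
rulers = (
    [{"country": "xx", "period": 1100, "birth_year": 1050 + i} for i in range(6)]
    + [{"country": "xx", "period": 1100, "birth_year": None} for _ in range(4)]
    + [{"country": "yy", "period": 1100, "birth_year": None} for _ in range(3)]
)
rates = birth_known_rate(rulers)
print(rates)
```

Keeping the denominator explicit makes it easy to see which cells rest on few rulers and would be sensitive to a single missing record.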
We assess the validity of this measurement by using insights from alternative sources, only including cases where information for at least ten rulers is available. Most notably, Buringh and van Zanden (2009) traced elite European education through the number of monastery manuscripts that were kept between 700 and 1500 CE, using them to construct a per capita indicator. In Fig. 1, we document the substantial correlation between their proxy measure of elite numeracy and ours for eleven European countries. Although there is naturally a certain amount of variation resulting in some observations deviating from the trend line, the correlation remains highly significant (correlation coefficient ρ = 0.67).
Likewise, we compare our indicator to the rate of 'birth year heaping' in Cummins' (2017) database of European noblemen from 800 to 1800 CE and again find a highly significant correlation (Fig. 2; here the correlation coefficient is ρ = −0.58).
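The comparisons above rest on a correlation coefficient between two indicators measured for the same units. The sketch below assumes a Pearson coefficient, which the text does not specify, and the function name `pearson` and the toy values are purely hypothetical and are not the data behind Figs. 1 and 2.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values: manuscripts per million inhabitants and the
# birth known rate (in %) for four country-period cells.
manuscripts = [5.0, 20.0, 40.0, 80.0]
birth_known = [30.0, 45.0, 70.0, 90.0]
rho = pearson(manuscripts, birth_known)
print(round(rho, 2))
```

A negative coefficient, as in the comparison with birth year heaping, is expected whenever the second indicator measures a lack of numeracy.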

Fig. 1
Manuscripts versus birth known rate (11 European countries, 700-1500 CE). Note Number of monastery manuscripts per million inhabitants (correlation coefficient ρ = 0.67; or ρ = 0.71 where the birth known rate is less than 100%). Source: Buringh and van Zanden (2009)

Similar comparisons with another indicator can also be made for China. As another large and fairly stable world region, it can also provide broadly applicable insights into long-run development processes. An early indicator of numeracy and human capital used for China concerns the number of 'literati' among the population. During certain phases of Chinese history, most notably after nomadic invasions, the literati system was of reduced importance. These periods were also characterised by lower elite numeracy rates, as measured by the known ruler birth year proxy and shown in Fig. 3. In sum, the Chinese evidence allows us to complement our comparisons of European monastery manuscripts and 'birth year heaping' with elite human capital in another world region.

Fig. 2
Birth year heaping versus birth known rate (7 European regions, 800-1800 CE). Note Birth year heaping calculated from Cummins' (2017) sample of 115,650 European noblemen (correlation coefficient ρ = −0.58; or ρ = −0.54 where the birth known rate is less than 100%). Source: Cummins (2017)

To estimate elite numeracy via the known birth year rate for medieval Europe, we had to make certain methodological decisions. For practical reasons, we assign modern country names to the geographic units we study, using the location of historical capitals within modern boundaries as our assignment criterion, as the kingdom's elite mostly lived in these capitals. A large number of studies in economic history have used modern countries as their cross-sectional units of analysis because this approach allows the tracing of long-run determinants, even if it invites a certain degree of measurement error. For example, Maddison (1998) traced post-Soviet economic growth and population counts in former Soviet states back into Soviet times. The Clio-Infra database also allows us to study historical country units using their modern boundaries. If boundaries change, then using modern countries may seem somewhat anachronistic, but analysing the long-term development of these territorial units still provides valuable insights. In any case, for most European countries, such as France, the UK and Spain, modern country borders are broadly compatible with historical boundaries.

Fig. 3
Elite numeracy and the 'Literati' (China, 0-1800 CE). Note By 605 CE, China had introduced an unusual system for appointing their bureaucratic elites (Deng 1993). If a candidate succeeded in passing the exam, they became a member of the educational nobility, the 'literati', with considerable social status and a substantial income. Economically, China fared exceptionally well under this system during the medieval period (Baten 2016)

If there were concurrent rulers within the borders of modern countries (in smaller principalities, for example), we also assigned them to a modern country according to where their capital was located. The alternative, assigning elite numeracy values to grid cells across Europe, also leads to measurement error because we do not have measurements for all grid cells, only for those containing each capital city. Thus, we cannot measure any difference between grid cells containing capitals and those without. In fact, we could more precisely call our unit of observation the average elite numeracy of each capital situated in the territory of each modern country. For simplicity, we abbreviate this with the name of each modern country. The main explanatory variables that we assess below also relate to the same modern geographical units described here.

Potential biases of the 'known birth year' indicator
It is conceivable that the 'known birth year' indicator may suffer from potential biases that capture information unrelated to elite numeracy. We discuss these biases below and consider whether or not they are substantial.
1. Ruler biographies, for example, were often only recorded many years after a ruler's death, and the exact sources on which these were based are often unknown. Therefore, factors such as strong research traditions may have contributed to more detailed and complete chronologies of ruler birth years, with chronologists perhaps even calculating them based on significant events that occurred closer to the birth of an earlier ruler. Specifically, countries with strong university traditions such as England, France or Germany might have boasted scholars who created detailed accounts of the medieval histories of their countries, leading to more accurate approximations, made centuries later, of birth years. However, somewhat surprisingly, many of these countries actually had lower known birth year rates in the Middle Ages than, for example, the principalities in today's Iraq, Turkey or Greece (see below and in Baten 2018). Consequently, this pattern is incompatible with the view that the research intensity of the last few centuries might have biased the elite numeracy estimates of medieval times.
2. A second potential source of bias is the destruction of city archives, which might have resulted in the loss of previous records. However, royal chronologies were traditionally copied (Hanawalt and Reyerson 1994: 39). Even if one city archive were destroyed, prominent information such as that concerning a ruler would likely have been preserved in other libraries, books and supplementary written media. Moreover, we observe that the proportion of known ruler births often declined over time (Fig. 4). If the destruction of city archives were a core determinant of this indicator, we would have expected near zero values for the earlier centuries, which would suddenly reach high values in later centuries. This does not occur in any of our series.

Fig. 4
Examples of decreasing elite human capital (vertical axis 0 to 100; periods from the 8th-9th to the 14th-15th century)

Clearly, we should not assume a linear loss, but if some loss occurred due to the destruction of archives, one would expect some downward bias for known birth years to have occurred. However, we argue that since ruler lists were considered highly important pieces of information, they were usually kept by different people in different places and were therefore not lost after the destruction of one or even several city archives. Victorious invaders were also not necessarily interested in burning all written records, because keeping information about their newly conquered territories was vital. Hence, the burning of city archives was usually isolated and accidental. Even during the famously brutal Tamerlane⁴ invasions, not all cities and their archives were destroyed, because certain cities surrendered. Gaining power over cities and territories was Tamerlane's main aim, not destroying them, though destruction did occur in several cases to generate terror (Kunt and Woodhead 1995: 857).

⁴ Tamerlane was the founder of the late fourteenth and early fifteenth-century Timurid Empire, a short-lived empire that emerged from the remnants of the Mongolian Empire and conquered much of Central Asia as well as the vast area between today's Pakistan and Turkey. Tamerlane, famed for his brutality, described himself as the heir of Genghis Khan (although he was not a direct relative) and sought to re-establish the Mongolian Empire (Chaliand 2004).

3. Third, and more relevantly for South-Eastern Europe, rulers who assumed the throne after an invasion might have been different from rulers born in the countries that they later ruled. For example, some rulers originated from less numerate, nomadic societies in Central Asia, such as the first of the early Bulgarian rulers. Here, we have to distinguish between a truly lower level of elite numeracy among these rulers and their elites, which is what we want to measure, and a bias that stems from a lack of information about their births in foreign and possibly distant lands. Being born elsewhere might imply less knowledge about the first generation of settlers, but the second generation should have already undergone a catch-up period in which to learn and record the second ruler's birth year. Therefore, using a sufficient number of cases per period should mitigate any degree of bias that could potentially lead to concern. One famous example of a new political entity formed after a migration movement was the Bulgarian Empire (on the following, see Shepard 2017). Originating on the plains of West Asia, the semi-nomadic Bulgars moved to the Balkans in several stages. Asparuh was the first ruler of the Bulgarian Empire after settling north of the Byzantine Empire. No birth year is known for him and it seems plausible that the human capital of his early imperial elite was modest, consistent with the above hypothesis. In contrast, his successor, Tervel, reorganised the empire. He cooperated with the Byzantines at first, before conflict later took place. Correspondingly, a birth year is known for him. These are individual examples and, hence, are of only limited representativeness, but they aptly illustrate the considerations above.
4. A fourth possible bias could be that rulers who spent more time on the throne could have better established themselves and their policies, giving chronologists more reason and more time to document their birth years. We control for this potentially biasing effect by including the length of the ruler's reign as a control variable, finding no relationship with the proportion of known ruler birth years (see Table 3).
5. Finally, and possibly the most challenging potential bias to alleviate, the birth years of more famous rulers might have been better recorded. It is conceivable that events in the lives of lesser rulers, who were placed under the suzerainty of an emperor, for example, would be less diligently documented. However, birth years for several of the most famous rulers in world history, such as Charlemagne, were not documented; this is a first hint that 'fame bias' may not have been so crucial. Nevertheless, we can also control for this 'fame bias' to a certain extent by controlling for whether the rulers of each kingdom were under the suzerainty of an overlord.
Rulers with a more dependent, governor-type function most likely attracted less attention from chronologists. We find, in Table 3, that rulers who served this governor-type function were not significantly different to their overlords in terms of elite numeracy, after controlling for country and century fixed effects. In conclusion, these results speak against any fame bias under the assumption that fame and suzerainty are related.
Furthermore, we include the area of each kingdom as a second control variable against more famous or powerful rulers being better documented. Although not all powerful rulers held large territories, rulers of powerful kingdoms such as the Holy Roman Empire, the Ottoman Empire, Poland-Lithuania and the Kievan Rus certainly did. Nevertheless, like our indicator for suzerainty, kingdom area does not exhibit any relationship with the proportion of known ruler birth years. Throughout the paper, we compare our regression specifications both with and without these 'elite controls'.

Measuring potential determinants of elite violence
Elite violence could potentially be an important determinant of elite numeracy. Cummins (2017) argues that a substantial share of noblemen in the medieval period, including kings, died through acts of violence, particularly on the battlefield. Given that lifespans and the prevalence of violence are negatively correlated (though not perfectly, as other factors also influence lifespans), we argue that part of the underinvestment in elite human capital during this early period was caused by shorter lifespans. Individuals had fewer incentives to invest in numerical human capital if they expected to die early. While we measure the murders of rulers, external effects on the kingdom's elite are very likely. The wider elite is also affected by the fear of becoming victims of violence if the ruler is killed: murder, particularly of a central figure, creates an atmosphere of fear in society (on recent evidence of the external effects of murder, see OECD 2011; Baten et al. 2014). Moreover, after the repeated killing of rulers, both in battle and in non-battle situations, specific value systems often developed, typically related to 'cultures of revenge' (Pust 2019). While most inhabitants of wealthy modern societies consider 'blood revenge' outdated and unimaginable, contemporaries in the fourteenth century, for example, considered it imperative. It was closely related to the 'culture of honour', which led aristocrats to die in duels even as late as the nineteenth century, attempting to exact revenge for insults or violence against their relatives. The persistence of these cultures of honour has also been studied for the Southern United States. Elias (1939) described a long-term process in which societies and elites in particular became less violent over time, adopting and accepting greater state capacities and a culture of increasingly civil, non-violent behaviour. He termed this humankind's 'civilising process'.
In societies of high state capacity (or even a widely accepted monopoly of the state to execute violence), returns to investments in education by meritocratic elites were certainly higher. Eisner (2014) argued that the complex interaction between more education and less violence in a society sets a 'swords to words' process in motion, in which potential conflicts were increasingly solved through negotiation rather than violence (Gennaioli and Voth 2015; Pinker 2011). Cummins (2017) finds that increasingly fewer European nobility were killed in battles after 1550 CE. Baten et al. (2018) also studied the history of interpersonal violence in Europe by tracing the proportion of cranial traumata cases among 4738 skeletons that cover the period 300-1900 CE, finding that interpersonal violence remained very high until the late Middle Ages before rapidly declining. Eisner (2011) also collected evidence on 45 European kingdoms, documenting a decline in the rate of regicide over time, regicide being the assassination of kings and other rulers. If killed, rulers were usually the victims of their own families or competing nobility. The rates of regicide and of rulers killed in battles declined strongly between the early medieval period and the modern era (see Keywood and Baten 2018 for an econometric analysis with a strongly expanded European sample and Fig. 5 on regional regicide rates).

To crosscheck the plausibility of our own evidence of declining violence over time, as well as the relationship between elite and population-wide violence, we compare evidence on regicide and homicide for a number of European countries for which Eisner (2014) presented early evidence of homicide rates. In Fig. 6, we can see that both series showed very similar trends across the countries where data are available. Moreover, deviations from the general downward trend also often occurred at similar times (one exception being Italy during the nineteenth century). This strong relationship also validates our use of regicide as a proxy for interpersonal elite violence (discussed in more depth in Keywood and Baten 2018).
Although these subfigures all display strong declines, the panel unit root tests that we run in the Appendix (Table 8) lead us to conclude that regicide, over the whole panel, is a stationary process. Nevertheless, we include time fixed effects as a measure against non-stationarity in our empirical analysis. Finally, temporal autocorrelation does not play a strong role because our main results also hold in first differences (see Appendix, Tables 13 and 14).
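To make the first-difference check concrete, the sketch below differences a small country panel between consecutive periods and fits a bivariate slope to the changes. It is only an illustration under stated assumptions: the panel layout, the function names `first_differences` and `ols_slope` and all numbers are hypothetical, consecutive observations are treated as adjacent periods, and the specifications reported in the Appendix include further controls that are omitted here.

```python
def first_differences(panel):
    """Return (d_regicide, d_numeracy) pairs for consecutive periods per country.

    `panel` maps a country code to a list of (period, regicide, numeracy)
    tuples (hypothetical layout). Differencing removes any time-invariant
    country effect from both variables.
    """
    diffs = []
    for rows in panel.values():
        ordered = sorted(rows)  # sort by period
        for (_, v0, n0), (_, v1, n1) in zip(ordered, ordered[1:]):
            diffs.append((v1 - v0, n1 - n0))
    return diffs


def ols_slope(pairs):
    """Slope of a bivariate least-squares fit of y on x (with intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


# Hypothetical toy panel: regicide share and birth known rate, both in %.
panel = {
    "aa": [(1000, 30.0, 40.0), (1100, 20.0, 60.0), (1200, 10.0, 80.0)],
    "bb": [(1000, 20.0, 50.0), (1100, 25.0, 40.0)],
}
diffs = first_differences(panel)
slope = ols_slope(diffs)
print(diffs, slope)
```

In this toy panel, falling regicide goes together with rising numeracy, so the fitted slope on the changes is negative.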
For the Middle East, Baten (2018) adopted a similar strategy by analysing the number of rulers who were killed in battles and by other forms of regicide, mostly due to conflicts over who should rule. Interestingly, we found that Europe tends to display diametrically opposite trends to the Middle East. For a large portion of the period that Baten (2018) studied, both battle deaths and murder rates within the ruling houses increased, whereas they declined in Europe, as we describe in detail below.
For the remainder of this paper, we use regicide as our indicator of elite violence. Our regicide data set was initially built using the rulers found in Eisner's (2011) original regicide study, comprising 1513 rulers from across 45 kingdoms. We then strongly expanded this data set with an array of supplementary sources, chiefly Morby's (1989) 'Dynasties of the World' and Bosworth's (1996) 'The New Islamic …

[Figure: vertical axis from 0% to 30%; period label 'Early medieval (6th-9th Century)']

We exclude cases of deaths in battle from 'ordinary' regicide because battle deaths are likely to reflect violence driven by external forces rather than the local interpersonal elite violence that we estimate. The concept of battle deaths allows us to take into account these external influences. Admittedly, the two variables are not always perfectly distinguishable, but we define a battle death as a ruler being killed in a battle.
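The distinction between 'ordinary' regicide and battle deaths can be sketched as two separate shares over the same set of rulers. The coding of the cause of death (`'natural'`, `'regicide'`, `'battle'`), the function name `violence_shares` and the toy records are hypothetical, and using all rulers of a cell as the denominator for both shares is an assumption of this sketch.

```python
from collections import Counter


def violence_shares(rulers):
    """Regicide and battle-death shares per (country, century) cell.

    `rulers` is an iterable of dicts with the hypothetical keys 'country',
    'century' and 'death' ('natural', 'regicide' or 'battle').
    Battle deaths are counted separately and never as regicide.
    """
    totals, regicide, battle = Counter(), Counter(), Counter()
    for ruler in rulers:
        cell = (ruler["country"], ruler["century"])
        totals[cell] += 1
        if ruler["death"] == "regicide":
            regicide[cell] += 1
        elif ruler["death"] == "battle":
            battle[cell] += 1
    return {
        cell: {
            "regicide": regicide[cell] / n,  # share of rulers killed outside battle
            "battle": battle[cell] / n,      # share of rulers killed in battle
            "rulers": n,                     # denominator (assumed: all rulers)
        }
        for cell, n in totals.items()
    }


# Hypothetical toy records: eight rulers, two murdered, one killed in battle.
toy = (
    [{"country": "xx", "century": 9, "death": "regicide"}] * 2
    + [{"country": "xx", "century": 9, "death": "battle"}]
    + [{"country": "xx", "century": 9, "death": "natural"}] * 5
)
shares = violence_shares(toy)
print(shares)
```

Reporting the number of rulers alongside each share keeps the denominator visible, which is the main advantage of a ruler-based measure over simple conflict counts.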

Fig. 6
Regicide versus homicide: evidence for the plausibility of the regicide indicator (Germany, Italy, Spain, UK, 1300-1900 CE). Note The figure shows declines in violence and the relationship between elite violence (regicide, defined as the share of rulers who were killed) and interpersonal violence (homicide per 100,000 population). The grey circles indicate periods during which both homicide and regicide rose simultaneously. Sources: homicide data from Eisner (2014)

Finally, our regicide evidence covers all states, for almost all periods (Table 1). This is not possible for other indicators such as conflict counts. Pinker (2011) studied conflicts over time, arguing that both overall and interpersonal elite violence declined despite the number of conflicts in some countries seeming to increase over time. Accordingly, Pinker criticised simple conflict counts as uninformative due to three different biases. First, the number of casualties per capita needs to be measured accurately, which is not often done. Secondly, the number of conflict victims per capita needs to be quantified, particularly because simple conflict counts are higher in more densely populated countries with larger populations. Thirdly, and perhaps most importantly, psychologists have identified a strong perception bias: we know much more about minor conflicts in Northern France or Germany than, for example, in Ukraine or in the Balkans during the fifteenth century. Conflicts between neighbouring Ukrainian cities during the late medieval period would probably not have been documented, whereas similar conflicts between two Western German cities, for example, might have indeed been recorded. Our regicide measure has the important advantage that the ruler biographies were systematically available and the denominator is known.

Regional patterns of elite numeracy
When considering regional trends in elite numeracy (Fig. 7; see Appendix Table 7 for regional classifications), we see that North-Western Europe did not always lead the way. Rather, South-Western Europe led with Iberia and Italy, while South-Eastern Europe had the highest levels of numeracy during the early Middle Ages, driven by the East Roman Empire, although it fell back after the empire's collapse. North-Western Europe was on a more stable growth path, however, taking the lead in the tenth-thirteenth centuries. By the fourteenth and fifteenth centuries, Iberia and Italy had caught back up to North-Western Europe, as described by Broadberry (2013). By then, however, the UK had already reached full elite numeracy under our indicator.
Eastern Europe began the sixth century with approximately 20% of its ruler birth years known, or just slightly lower. Its developmental path for numeracy would occur at a much slower rate, particularly in Romania, where the proportion of known ruler birth years was lower than 5% when its kingdoms began to emerge in the twelfth century. Only later does Romania exhibit a strong growth rate in elite numeracy. In the period between the twelfth and eighteenth centuries, other Eastern European countries lagged significantly behind their North-Western counterparts.
South-Eastern Europe is an interesting case in which we can clearly see the impact of historical developments. Admittedly, we have few observations for the East Roman Empire in the first period (with its capital located in today's Turkey), but our figure (Fig. 7) shows a clear deterioration of elite numeracy during the decline of the Byzantine Empire, followed by stagnation in the years that followed. This stagnation also coincided with various invasions from Central Asia. Finally, South-Eastern Europe exhibited strong growth in elite numeracy after the Great Plague, catching up to both groups of Western European countries by the eighteenth century, a lag of approximately 400 years. Central European trends are not shown here because they have a very high starting point and quickly reach 100%. However, they are presented as a group in Fig. 8, which plots elite numeracy for broader regions in a single figure.

Fig. 7
Notes The year is the middle year of each two-century period, 600 for the sixth and seventh century, etc. Abbreviations refer to the following: Benelux (ben: Belgium, Netherlands, Luxembourg); France and Monaco (fra); Scandinavia (sca: Denmark, Iceland, Lithuania, Latvia, Norway, Sweden); UK and Ireland (uki); Caucasus (cau: Armenia, Georgia); Romania (rom); Russia, Belarus and Ukraine (rua); Iberia (ibe: Portugal, Spain); Italy (ita); Greece and Cyprus (gre); Turkey (tur); Balkans (bal: Albania, Bosnia and Herzegovina, Bulgaria, Croatia, Montenegro, Serbia)

In Fig. 8, two clear patterns emerge within Europe's regional development in elite numeracy. Although it is difficult to confidently assert initial positions in the sixth century, it seems that all regions aside from Central Europe had roughly similar levels of elite numeracy (ca. 40%) around the tenth century, before diverging drastically. While Central, North-Western and South-Western Europe (with a small lag) exhibit strong increases from this point onwards, Eastern and South-Eastern Europe display stagnant or even declining series that only begin to increase during the period 1500-1700 CE. Eastern Europe only catches up to Central and Western Europe towards the end of the study period.
Elite numeracy reaches a high plateau in the period 1600-1800 in North-Western, Central and South-Western Europe as all values move close to 100%. If we compare these elite numeracy trends with the general numeracy figures that were recently published by various authors using age-heaping-based numeracy estimates, we observe that this period was not characterised by overall numeracy being close to 100%. For example, Baten et al. (2014) find an overall numeracy far below 100%, even for the UK, and Pérez-Artés and Baten (2020) find a much lower one for Spain.
Moreover, the similarity in elite numeracy trends of neighbouring regions makes our estimates more plausible. For the remainder of our analysis, we will revert to country-level units instead of the regional level used in the figures above. The advantage of using more aggregated units for figures is that we obtain smoother trends, while this is less important for regression analysis. When using regional units, we find the same overall regression results, but they are less robust due to smaller sample sizes (see Appendix Table 16 for a robustness check at the regional level).

Fig. 8
Inter-regional trends in elite numeracy. Note The legend refers to Central Europe (ce), Eastern Europe (ee), North-Western Europe (nw), South-Eastern Europe (se) and South-Western Europe (sw)

Fig. 9
Elite numeracy and non-violence (sixth-eighth century). Note Scatterplot weighted by observations. Labels refer to countries (see Appendix Table 7 for country codes) and centuries

Fig. 10
Elite numeracy and non-violence (ninth-tenth century). Note Scatterplot weighted by observations. Labels refer to countries (see Appendix Table 7 for country codes)

Fig. 11
Elite numeracy and non-violence (eleventh-twelfth century). Note Scatterplot weighted by observations. Labels refer to countries (see Appendix Table 7 for country codes)

Fig. 12
Elite numeracy and non-violence (thirteenth-fourteenth century). Note Scatterplot weighted by observations. Labels refer to countries (see Appendix Table 7 for country codes)

We study a very long time frame of elite violence and elite numeracy in this paper and it is quite likely that the relationship between the two variables may have changed, especially as military technology transformed, state organisation developed and the intensity of nomadic invasions varied. Hence, we look at a series of scatterplots, first separating out the first three centuries of the study period (sixth to eighth centuries) and then bicentennial periods thereafter (ninth and tenth centuries, eleventh and twelfth centuries, etc.; Figs. 9, 10, 11, 12 and 13). We invert violence into 'non-violence', as this makes the graphic easier to read. The relationship between elite non-violence and elite numeracy is already clearly visible in the eighth century, with Spain (es) holding one of the highest elite numeracy values when Al-Andalus had reached its peak (Fig. 9). In contrast, Spain had some of the worst values in terms of elite violence and elite numeracy under the Visigothic rulers of the sixth century. The decline of the East Roman Empire (tr) is also apparent here. Russia (ru) and Ukraine (ua) were more extreme, with all rulers in Russia being killed violently.
The following period was characterised by the Hungarian invasions, which affected large parts of Europe, as well as by more localised conflicts: the Arab and Bulgarian invasions of the East Roman Empire (tr) in South-Eastern Europe, for example, and the Viking raids in the north-west (Fig. 10). Muslim Spain (es) was still among the low-violence and high-numeracy cases, as was the Holy Roman Empire (de): although its population suffered terribly from the Hungarian invasions, the Emperors were not killed. Ukraine (ua) and the states and principalities in the area of modern Turkey (tr) suffered the most.
The eleventh and twelfth centuries saw no major nomadic invasions (rather, European states invaded the Middle East), and European principalities reached a greater stage of feudal development (Fig. 11; also see Hehl 2004). We observe that the relationship between elite violence and elite numeracy was weaker during this 'high medieval peace period', meaning that violence was less detrimental for overall development. This changed during the thirteenth and fourteenth centuries with the arrival of the nomadic Mongolian invasions, when the impact of violence was larger again, as can be seen from the slope of the regression line (Fig. 12). At the same time, state organisation continued to develop, and France made particularly strong progress in tax institutions during the Hundred Years' War (North 2000). 7 Finally, during the fifteenth and sixteenth centuries, we observe an even stronger east-west disparity. A cluster of Western and Central European countries had almost no elite violence at this time, along with near-complete rates of elite numeracy. In contrast, Ukraine (ua), Albania (al) and other Eastern and South-Eastern countries lagged far behind. Some outliers combine high violence and low numeracy (see Cyprus [cy], Luxembourg [lu], etc., in Fig. 13), but these were small principalities with lower observational densities. In these centuries, the new, resource-intensive city fortifications of the 'Trace Italienne' began to require ever greater tax resources for military success (Gennaioli and Voth 2015). Western powers such as Britain (uk), France (fr) and the Netherlands (nl) were better suited to develop these tax capabilities, and the evidence from regicide and battle deaths suggests that this resulted in a decline of violent deaths among the elite.

8 As a precaution, the full fixed effects specification from Sect. 6.1 is repeated using the predicted values for the known-birth indicator in Appendix 4. Although some of the coefficients change marginally, all of our conclusions remain the same.

Empirical analysis
The independent variables used in this analysis fall into two distinct groups: those that control for potential biases that may cause the known ruler birth year indicator to diverge from a 'true' measurement of elite numeracy, and those that constitute explanatory variables-variables that help to assess the potential impact of elite violence on elite numeracy (Table 2).
Because a longer reign may provide greater opportunity for chronologists to record a ruler's birth year, we control for the average length of reign across each country and century. To control for the power and influence of each kingdom, we use their areas in square kilometres (Nüssli 2010) as well as whether rulers had the freedom to act and set policy autonomously, as opposed to being under the suzerainty of an overlord. Table 3 shows that neither reign length nor autonomy significantly affects the likelihood of a ruler's birth year being recorded, although kingdom area becomes marginally significant when other explanatory variables and controls are included. 8

Our first explanatory variable, apart from regicide, is the 'proportion of rulers killed in battle'. This variable provides information on civil wars and external military pressures on each kingdom, which may have affected elite numeracy through the destruction of educational infrastructure or reduced incentives to invest in elite numeracy due to lower life expectancies (Cummins 2017). Moreover, battle deaths and regicide are correlated, meaning that excluding battle deaths as a control variable could lead to an overstatement of any effect of regicide on elite numeracy.

Fig. 13
Elite numeracy and non-violence (fifteenth-sixteenth century). Note Scatterplot weighted by observations. Labels refer to countries (see Appendix Table 7 for country codes)
Urbanisation rates are widely used in the economic history literature and act as a broad control variable for factors that could confound the relationship between elite violence and elite numeracy. They have also been employed as a proxy indicator for income among early societies in which other income proxy data are unavailable (Bosker et al. 2013;De Long and Shleifer 1993;Acemoglu et al. 2005;Nunn and Qian 2011;Cantoni 2015). Bosker et al. (2013) hypothesise that part of this relationship works through agricultural productivity, because a productive agricultural sector is required to support a large urban centre and urban areas cannot produce their own agricultural goods. We admit that, as urbanisation may be endogenous, there may be a trade-off between including an endogenous control and allowing omitted variable bias to enter the model. Therefore, we only include urbanisation in a subset of regression models. We also introduce a measure of institutional quality as a potential determinant of elite numeracy. Our indicator for this is the mode of ruler succession, as this captures a certain preference for the division and limitation of dynastic power. 9 We use a three-category indicator to describe whether a ruler obtained their position through inheritance, partial election or full election by the aristocracy (as in Venice, for example). 10 The differences in institutional quality between states, seen through modes of succession, are not as large as those between democracy and autocracy, but evidence on democratic structures does not exist for the earlier periods under study here. However, a preference for the division of power reduces the likelihood of unconstrained totalitarianism. Again, we expect this aspect of institutional quality to be positively correlated with elite numeracy. Next, we use estimates of pastureland area from Goldewijk et al. (2017). 
We transform the variable to pastureland per square kilometre per capita and then standardise it to a [0, 1] index. The first motivation for including this control is that pastureland provides nutritional advantages, and improved nutrition is known to have positive implications for human capital (Schultz 1997; Victora et al. 2008). Second, numerous studies have used pastureland and pastoral productivity as means of estimating female labour force participation, providing information on female autonomy and gender inequality, and perhaps elite human capital as a result (Alesina et al. 2013; Voigtländer and Voth 2013; Baten et al. 2017). This mechanism functions through women's comparative physical disadvantage, relative to men, when ploughing fields and performing other tasks required for crop farming. Over time, this tendency developed into a social norm that saw men work in the fields while women took care of 'the home' (Alesina et al. 2013). However, when cattle and other domestic animals were present, their care became the task of women, boosting female labour participation and the contributions of women to household income. With increased income contributions, female autonomy increases and gender inequality is reduced, allowing women to develop their own human capital and contribute to economic development (Diebolt and Perrin 2013).
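The transformation of the pastureland variable can be illustrated with a short sketch. It assumes that 'per square kilometre per capita' means the land share of total area divided by population, and that the [0, 1] index is a linear min-max rescaling; both are assumptions made for illustration, and the function names and numbers below are hypothetical rather than taken from the data.

```python
def per_area_per_capita(land_km2, total_area_km2, population):
    """Land share of total area, divided by population (assumed definition)."""
    return (land_km2 / total_area_km2) / population


def min_max_index(values):
    """Rescale a list of numbers linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All observations identical: no variation to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical country-century observations:
# (pasture area in km2, total area in km2, population)
observations = [
    (1000.0, 10000.0, 50000.0),
    (4000.0, 20000.0, 50000.0),
    (9000.0, 30000.0, 60000.0),
]

raw = [per_area_per_capita(*obs) for obs in observations]
pasture_index = min_max_index(raw)
print(pasture_index)
```

The smallest observation maps to 0 and the largest to 1, so coefficients on the index describe the move from the least to the most pasture-intensive observation in the sample.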
Fourth, as a counterweight to the pasture variable, we also use cropland. Like pastureland, cropland should describe agricultural and nutritional development but should also emphasise gender inequality for the reasons just mentioned. Therefore, its coefficient should be positive if nutrition, in terms of calories, is more important for elite numeracy, and negative if gender inequality is. The cropland variable is also transformed into per square kilometre per capita terms and then standardised (Goldewijk et al. 2017).
Last, we include a variable for the second serfdom to assess whether the inequality that it wrought had any impact on elite numeracy in Eastern Europe. This is coded as a dummy variable for all of Eastern Europe from the sixteenth until the eighteenth century and until the nineteenth century in Russia, where serfdom was only officially abolished under Tsar Alexander II in 1861.

Fixed effects specification
We undertake an empirical analysis that consists of two parts. We first employ a fixed effects specification to test the existence and robustness of the relationship between elite violence and elite numeracy, before implementing an instrumental variable strategy that endeavours to identify a causal effect of elite violence on elite numeracy. 11 The fixed effects specification is set up as follows:

$$\text{elite human capital}_{it} = \alpha_i + \gamma_t + \beta_1\,\text{regicide}_{it} + \beta_2\,\text{battle deaths}_{it} + \beta_k\,\psi_{it} + \varepsilon_{it} \qquad (1)$$

where $\alpha_i$ are country fixed effects, $\gamma_t$ are century fixed effects, $\psi_{it}$ is a vector of the control variables described above and $\varepsilon_{it}$ is an error term that captures time-variant unobservables. We also cluster at the country level, as it would be unrealistic to assume that within-country observations are entirely independent of one another, and estimate robust standard errors. In addition, we use bootstrapped standard errors by employing the wild bootstrap procedure of Cameron et al. (2008; see notes to Table 4).

11 We also conduct spatial regressions (Appendix 3) to uncover the effects of spatial autocorrelation.

Table 4 Fixed effects regressions
Note Numbers in bold text indicate coefficients that are significant at least at the 10% level. The reference category for institutional factors is hereditary succession; for 'second serfdom', it is the regions and periods not affected. Since there are 36 clusters when clustering by country, we also crosschecked our results using Cameron et al.'s (2008) wild bootstrap procedure (using 1000 replications). We find very similar results to Table 4, and regicide and battle always remain significant, at least at a 98% confidence level (t-statistics from −2.58 to −3.64 and corresponding p values from 0.019 to 0.000). Standard errors clustered by country. Robust standard errors in parentheses. ***p < 0.01; **p < 0.05; *p < 0.1
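The two-way fixed effects specification in Eq. 1 can be sketched as a least-squares regression with country and century dummies. The example below is only an illustration on synthetic data: the panel dimensions, the 'true' coefficients of −0.45 and −0.70 and the function name `two_way_fe` are assumptions, and the clustered and wild-bootstrapped standard errors used in the paper are omitted for brevity.

```python
import numpy as np


def two_way_fe(y, X, unit_ids, time_ids):
    """OLS of y on X plus unit and period dummies; returns the coefficients on X."""
    units = np.unique(unit_ids)
    times = np.unique(time_ids)
    unit_dummies = (unit_ids[:, None] == units[None, :]).astype(float)
    # Drop one period dummy to avoid perfect collinearity with the unit dummies.
    time_dummies = (time_ids[:, None] == times[None, 1:]).astype(float)
    design = np.column_stack([X, unit_dummies, time_dummies])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[: X.shape[1]]


# Synthetic balanced panel: 30 hypothetical countries observed over 10 centuries.
rng = np.random.default_rng(1)
n_units, n_periods = 30, 10
unit_ids = np.repeat(np.arange(n_units), n_periods)
time_ids = np.tile(np.arange(n_periods), n_units)
alpha = rng.normal(size=n_units)[unit_ids]    # country effects
gamma = rng.normal(size=n_periods)[time_ids]  # century effects

# Regicide is correlated with the country effect, so pooled OLS would be biased.
regicide = rng.uniform(0, 1, size=unit_ids.size) + 0.3 * alpha
battle = rng.uniform(0, 0.5, size=unit_ids.size)
noise = rng.normal(scale=0.1, size=unit_ids.size)
numeracy = alpha + gamma - 0.45 * regicide - 0.70 * battle + noise

coefs = two_way_fe(numeracy, np.column_stack([regicide, battle]), unit_ids, time_ids)
print(coefs)  # estimates for regicide and battle deaths
```

Including the dummies absorbs any time-invariant country characteristic and any Europe-wide century shock, so the coefficients are identified from within-country deviations from common trends.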
We immediately see that both the regicide and battle death indicators enter into each regression model significantly and with a negative coefficient (Table 4). These coefficients are also fairly stable across our specifications, implying that our control variables are less important for elite numeracy than violence. The coefficient for regicide remains between approximately − 0.41 and − 0.50, which can be interpreted as a one percentage point increase in regicide being associated with a 0.41-0.50 percentage-point decrease in the rate of known birth years. Alternatively, a one standard deviation increase in elite violence is associated with a 7.2-8.8 percentage point decrease in elite numeracy, which is a substantial effect. However, in the same way that violence could have acted as a restraining factor on the growth of elite numeracy over time, it is also possible that causality runs in the other direction.
Like regicide, the battle indicator also yields significant and negative coefficients that are robust to the introduction of control variables. These coefficients are approximately one third larger than those for regicide (in absolute terms) and fall between approximately − 0.69 and − 0.73. However, the distribution of battle death frequency is narrower than that for regicide, meaning that a one standard deviation increase in battle deaths is associated with a 5.6-6.0 percentage point decline in elite numeracy.
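The comparison between the two indicators rests on simple arithmetic: a standardised effect is the coefficient multiplied by the standard deviation of the regressor. The sketch below back-computes the standard deviations implied by the figures quoted above. These implied values (roughly 17.6 percentage points for regicide and roughly 8.2 for battle deaths) are derived here for illustration and are not statistics reported in the text.

```python
def implied_sd(effect_pp, coefficient):
    """Standard deviation implied by a standardised effect and a coefficient."""
    return abs(effect_pp / coefficient)


def standardised_effect(coefficient, sd_pp):
    """Change in the outcome (percentage points) for a one-SD increase."""
    return coefficient * sd_pp


# Figures quoted in the text: (effect of a one-SD increase, coefficient).
regicide_sd = [implied_sd(7.2, -0.41), implied_sd(8.8, -0.50)]
battle_sd = [implied_sd(5.6, -0.69), implied_sd(6.0, -0.73)]

print([round(s, 1) for s in regicide_sd])
print([round(s, 1) for s in battle_sd])

# Forward check: a larger coefficient can still give a smaller standardised
# effect when the regressor varies less.
print(round(standardised_effect(-0.73, battle_sd[1]), 1))
```

Because the implied spread of battle deaths is less than half that of regicide, the larger battle coefficient translates into the smaller one-standard-deviation effect.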
None of the control variables appears to have a significant impact on elite numeracy on its own after including both country and century fixed effects, although the results for pastureland and cropland (proportions per square kilometre, per capita) are still interesting. In isolation, neither of these variables enters into any of the regressions significantly; together, however, they yield strikingly different results. If either the cropland or the pastureland variable had entered significantly and positively into regressions four and five, this would have provided evidence for the hypothesis that nutrition improves numeracy and human capital. This is not the case here, but because the coefficient for pastureland is significantly positive while the coefficient for cropland is significantly negative when the variables enter together in regressions six to eight, this may have implications for gender inequality in accordance with the Alesina et al. (2013) hypothesis. Consequently, this result also hints that improved gender equality may have fostered elite numeracy in Europe. In Appendix 9, we also report the results of a corresponding random effects specification. A notable extension relative to the fixed effects model is that Jewish minorities also had an impact on elite numeracy.
Residual scatterplots allow us to compare our dependent variable and independent variable of interest more directly. We first run our standard fixed effects regression from Table 4 while omitting elite violence and save the residual elite numeracy; we then regress elite violence on the other explanatory variables (not including elite numeracy) and save the residual violence. 12 Figure 14 shows the relationship between the residuals of both regressions, allowing us to conclude that the controlled relationship between elite numeracy and elite violence is indeed strongly negative. It also suggests that the results are not driven by a small number of outliers.
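The residual procedure described above is an application of the Frisch-Waugh-Lovell partialling-out result: the slope between the two sets of residuals equals the coefficient on violence in the full regression. A minimal sketch on synthetic data follows; the variable names, sample size and coefficients are hypothetical, and the fixed effects are represented only by a generic matrix of controls.

```python
import numpy as np


def residualise(y, X):
    """Residuals of y after an OLS regression on X (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


rng = np.random.default_rng(0)
n = 200
# Constant plus two synthetic control variables.
controls = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
violence = controls @ np.array([0.2, 0.5, -0.3]) + rng.normal(size=n)
numeracy = (
    1.0
    - 0.45 * violence
    + controls @ np.array([0.0, 0.1, 0.2])
    + rng.normal(scale=0.5, size=n)
)

# Step 1: numeracy on the controls only; step 2: violence on the controls only.
res_numeracy = residualise(numeracy, controls)
res_violence = residualise(violence, controls)

# Slope of the residual scatterplot.
slope_partial = (res_violence @ res_numeracy) / (res_violence @ res_violence)

# Coefficient on violence in the full regression, for comparison.
full_design = np.column_stack([violence, controls])
beta_full, *_ = np.linalg.lstsq(full_design, numeracy, rcond=None)

print(slope_partial, beta_full[0])
```

Plotting `res_numeracy` against `res_violence` gives the kind of scatterplot described in the text, with each point showing how far an observation departs from what the controls alone would predict.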
We must acknowledge the role that spatial autocorrelation may have played (see also the maps in Figs. 15, 16 and 17). Kelly (2019) recently argued that many results can be misleading if spatial autocorrelation is not controlled for. Therefore, in Appendix 3, we make use of spatial econometric techniques first formalised by Paelinck and Klaasen (1979). These spatial regressions yield results remarkably similar to those from the fixed effects model (Eq. 1). Hence, spatial autocorrelation does not seem to be a notable source of endogeneity in this study.

Elite numeracy (1600-1900 CE, adjusted bin widths). Note The known ruler birth year measurement means that elite numeracy was consistently high by the early Modern Period (most countries are dark in Fig. 16, panel d). This bin-width adjustment merely allows for a clearer distinction between countries. The darker colours exhibit greater elite human capital
13 Pinker (2011) reanalysed White's (2011) list of "death tolls of wars, massacres, and atrocities" by deflating the number of victims of each event by the population of each respective century. Pinker argued that with a larger population, more victims are likely. Deflating by population, the wars of the twentieth century are still among the most terrible atrocities, but are less exceptional. The Mongolian invasions were the most influential of all nomadic invasion-related events (ranked second of all atrocities in human history). Other events related to nomadic invasions included the end of the Ming dynasty in China (and the Manchurian invasion related to it) as well as the end of the West Roman Empire (and the Hunnic and Germanic invasions related to it; see Pinker 2011).

14 The division of the Mongolian Empire that had offered Yury military aid.

Instrumental variable specification
Although the fixed effects regressions (and spatial regressions) provide a robust assessment of the conditional correlations between elite violence and elite numeracy, endogeneity in the form of simultaneity could still exist. Accordingly, we use an instrumental variable analysis to circumvent this endogeneity issue and assess whether any causal effects exist. Clearly, finding suitable instruments for the medieval period is a substantial challenge, but certain events that took place had the characteristics of 'natural experiments'. We use the nomadic invasions from Central Asia because their origins were determined by climatic forces-mainly droughts in Central Asia (Bai and Kung 2011)-and by military capacity.
Pinker (2011) found that the major nomadic invasions represented three of the six most violent and victim-intensive events in all of human history. 13 For European history during our sixth to nineteenth century timeline, the Hungarian and Mongolian invasions were the most influential. Although other invasions (the Arab-Berber invasions of Spain, the Bulgarians, the Vikings, and the Seljuks/Ottomans and others) were also relevant, they were more localised. Here, we analyse how these invasions affected European elites.
First, some of the nomadic invaders created new vassal states in their newly conquered territories, often leading to additional conflicts because local elites disputed the legitimacy of their regimes (Fennell 1986). For example, the Mongolians set up client rulers and partially dependent rulers in Eastern and South-Eastern Europe. Yury, the prince of Moscow, even received military support from the Mongolians when trying to conquer Tver, Russia, in 1317 (see Fennell 1986 on the following): after being defeated, Yury was called to the 'Golden Horde' 14 to be put on trial for his failure. Before any inquiry could take place, he was killed by Dmitry 'the Terrible Eyes', the son of Mikhail of Tver. Dmitry was later executed by the 'Horde' himself. In sum, the behaviour of the rulers under Mongolian suzerainty was unusually violent (Fennell 1986).
Secondly, after the nomadic invaders had killed several European rulers, the psychological hurdles for Europeans to assassinate their own rulers had been lowered. Previously, particularly during the High Middle Ages, the lives of rulers were accepted as sacrosanct more widely than before or after (see Hehl 2004; there were exceptions, of course). During the thirteenth and fourteenth centuries, rulers were often killed by their own knights or other personnel, and not only by competing nobility or neighbouring rulers. For example, Richard Orsini, the count of Cephalonia, was killed in 1303 by one of his own knights (Nicol 1984).
Thirdly, the manner of killing rulers changed dramatically after the nomadic invasions. In the medieval period, death by sword was considered more honourable and appropriate for rulers, whereas many other ways of killing were reserved for criminals. That rulers were subjected to alternative means of killing was initially inconceivable. For example, the Byzantine historian and chronicler Leo the Deacon describes the death of Igor I of the Kievan Rus with some horror: 'They [a neighbouring nomadic tribe] had bent down two birch trees to the prince's feet and tied them to his legs; then they let the trees straighten again, thus tearing the prince's body apart' (Kane 2019). As another example, Aleksandr of Tver was quartered in Sarai in 1339 (Fennell 1986).
Fourthly, and with a long run impact, taking revenge rose in cultural value. The traumatic impact of the additional frequency of violence against rulers produced psychological responses from the upper classes, forming a 'culture of revenge' which was applied if they felt that their honour had been violated (Pust 2019). This 'culture of revenge' phenomenon was most persistent in Eastern and South-Eastern Europe. One act of revenge spurred the next, and the increase in the cultural value of taking revenge became a strong hurdle against development. In societies that favour revenge, trust of foreigners also develops at a slower rate (Pust 2019).
In conclusion, this 'natural experiment' of nomadic invasions first increased the existing levels of violence, as many individual examples show. Several mechanisms were at work and not all of these examples took place on the battlefield. Even more effectively, the trauma from violence had a relatively persistent effect via the development of a 'culture of revenge', particularly in Eastern and South-Eastern Europe.
The Hungarians, Mongols, Huns and other equestrian nomads had a distinctive style of warfare. The secret to their success was the combination of horsemanship, mounted archery and the incitement of terror against civilian populations (Adshead 2016). Their military efficacy was often so superior that even Europe's strongest empires were unable to protect their constituents. For example, the Holy Roman Empire was helpless against Hungarian raids for more than a century, and it took them almost two centuries to defeat the Hungarian armies at the Battle of Lechfeld in 955 CE (Bowlus 2006). Likewise, in the thirteenth century, the powerful and by then European Kingdom of Hungary offered little resistance to Mongol invasions (Sinor 1999). 15

How did these nomadic invaders succeed against Europe's strongest empires? Military historians agree that their equestrian-based military tactics were the most critical factor (Sinor 1999). Central Asia was the world's equine capital at the time. It has been estimated that by approximately 1200 CE, half of the world's horse population was based between what is today Eastern Russia, Mongolia and the Ural mountains, whereas only a tiny fraction of the world's human population resided there (Adshead 2016: 61). Each Central Asian warrior could therefore possess up to 15 or 20 horses (Adshead 2016: 61), providing easy remounts each time a horse was wounded. In addition, these nomads were expert archers and military strategists. For example, they employed the 'Parthian shot', a Parthian military tactic in which mounted archers fired at their enemies while in actual or staged retreat. The manoeuvre became famous when used against the Roman Empire in the first century BCE, a particularly noteworthy example being the defeat of the Romans by the Parthians in 53 BCE at the Battle of Carrhae in South-Eastern Turkey, on the border of the Roman and Persian Empires (Mattern-Parkes 2003).
The innovative equestrian strategies and the bowmanship of the Asian nomads were impressive and could have been emulated by European armies, but the strength of their cavalry, with 15-20 horses per warrior, could not be provided by Europeans at the time.
Inciting terror was also a tactic used by many armies before then, but only in combination with the speed of horses was it so exceptionally effective. On the other hand, the unique military supremacy provided by their horsemanship and the sheer number of horses that they possessed resulted in geographic constraints that we use for our instrumental variable strategy. Short campaigns to Italy, France or North-Central Europe were possible, but Central Asian invaders quickly returned to the sparsely populated regions of Eastern Europe or to Central Asia itself. For example, the Mongols suddenly left for the Russian Steppe in 1242 after conquering most of East-Central Europe (Sinor 1999). As a consequence, the closer a European territory was to Central Asian and Eastern European horse bases, the larger an 'import of violence' it experienced. As a reaction to frequent raids and terror, Eastern and Central European societies militarised and favoured power and values such as loyalty over mercantile activities or trade. Hence, we can use the distance to Central Asia as an instrument for the additional violence that was imported through these nomadic invasions.
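The logic of using proximity to Central Asia as an instrument can be sketched with a two-stage least squares estimate on synthetic data. Everything in the example is an assumption made for illustration: the distances are random draws rather than measured distances, proximity is taken to be the logged inverse distance, an unobserved confounder is built in so that ordinary least squares is biased, and the 'true' effect of −0.45 is arbitrary.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, endog, instrument, exog):
    """2SLS with one endogenous regressor; exog should contain a constant."""
    Z = np.column_stack([instrument, exog])
    fitted = Z @ ols(endog, Z)               # first stage: predicted violence
    X_hat = np.column_stack([fitted, exog])
    return ols(y, X_hat)                     # second stage; element 0 is the IV estimate


rng = np.random.default_rng(2)
n = 2000
distance_km = rng.uniform(500, 6000, size=n)  # synthetic distances
proximity = np.log(1.0 / distance_km)         # logged inverse distance
confounder = rng.normal(size=n)               # unobserved, affects both variables

# Synthetic violence rises with proximity; numeracy falls with violence.
regicide = 8.0 + 1.0 * proximity + 0.5 * confounder + rng.normal(scale=0.3, size=n)
numeracy = 1.0 - 0.45 * regicide + 0.8 * confounder + rng.normal(scale=0.3, size=n)

const = np.ones((n, 1))
iv_beta = two_stage_least_squares(numeracy, regicide, proximity, const)
ols_beta = ols(numeracy, np.column_stack([regicide, const]))

print(iv_beta[0], ols_beta[0])
```

In this setup the instrumented estimate recovers a value close to the assumed effect, while the naive regression is pulled away from it by the confounder. The second-stage standard errors from this manual procedure would not be valid and are therefore not computed.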
Clearly, the Hungarians and Mongols were not the only groups that spread violence over such large distances. 16 The Viking raids of the ninth and tenth centuries, the Arab-Berber invasions of Iberia and parts of Italy, as well as the Ottoman invasions in the Balkans-to name just a few-added to European violence too. However, we argue that these activities were more localised, whereas Central Asian nomads affected almost all of Europe. Moreover, it is unclear that the Muslim rulers of Spain were more violent than Spain's earlier Gothic rulers (Pérez Artés and Baten 2020). Likewise, although the Vikings were far more violent than the incumbent inhabitants of the lands that they conquered, historians have explained that their reputation was, to a degree, overstated by monks in Western European monasteries who sought to disseminate propaganda against the 'mighty heathens of the north' (Winroth 2014). Winroth (2014) adds that since the victims were from societies more literate than themselves, Viking raids constitute a rare historical case where history was not written by the 'victors'. Additionally, the Vikings began to settle in the UK and Normandy well before 1050 and ceased their tradition of raiding (Griffiths 2010).
Because we use these nomadic invasions from Central Asia as an instrumental variable, endogeneity could result from heterogeneous levels of economic development along the east-west gradient. However, we observe that this gradient is a feature of the last few centuries and does not exist for the early medieval period. We have seen, in Fig. 8, that elite numeracy was highest in South-Eastern Europe during the sixth to seventh centuries, when the East Roman Empire was the gravitational centre of European development. The second highest levels at the time were found in South-Western Europe, particularly in Italy. The economic dominance of Europe's north-west only arose later, during the period when Eastern and Central Europe were affected by the Hungarian invasions. Indeed, the East Roman Empire was not overwhelmed by the Hungarian invasions, although much of its economic base in the Balkans was devastated. Furthermore, the Roman occupations of Gaul and Britain did not cause an east-west divergence in the early medieval period, according to our evidence. Figure 18 supports this line of reasoning through the coefficients from regressions of elite numeracy on longitude over time. 17 Here, we see that being further east was actually associated with higher elite numeracy during the early Middle Ages and that the traditional, negative gradient effect is reduced (and insignificant) during the high medieval peace period.
In sum, a strong east-west gradient did not exist before the period of the Hungarian invasions but developed thereafter. The strongest emergence of an east-west gradient arose after the Mongolian invasions ceased during the fourteenth century. From this point on, our instrument loses its econometric value, as the gradient would have become correlated with factors associated with the stronger economic development of the west. Therefore, we argue that for much of the formative period of Europe's path-dependent processes in the Middle Ages, the nomadic invasions from Central Asia are a suitable instrument for violence.

European history offers a placebo test for studying the exclusion restriction of our instrument: the period between the respective episodes of invasions by the Hungarians and Mongolians, namely the High Middle Ages of the eleventh and twelfth centuries. Europe did not experience any major invasions at this time (instead, it acted as an aggressor by invading the Middle East during the Crusades). Cummins (2017) provides some initial evidence for the high medieval peace period when analysing his database of noblemen. He shows a small but clear decline in battle deaths as well as a corresponding increase in average lifespans at the time, which sharply reversed as the Mongol invasions began and again as the Great Plague took effect. Hence, the proximity to Central Asia should be unimportant for violence during this high medieval peace period, given the absence of nomadic invasions, which would also provide additional evidence against any simple east-west effect.

17 Longitude measured by geographic centroids for modern countries from Donnelly (2012).

Fig. 18
Note When the coefficient is positive, being further east was associated with higher levels of numeracy. When the coefficient is negative, being further west was associated with higher levels of numeracy. Panel A refers to regressions for each century, whereas Panel B uses two-century time periods to show smoother trends
Before we execute our IV regressions, we need to consider other potential factors that could prevent our instrument from meeting the exclusion restriction. Specifically, our instrument becomes invalid if any characteristics of the nomadic invasions that are not associated with military or interpersonal violence affected elite numeracy in Europe. Such characteristics are not immediately apparent, but, for example, any diseases that the nomads brought with them could have influenced numeracy and human capital through demographic channels. However, we find no evidence of this. The Justinian Plague ravaged much of South-Eastern Europe and parts of the Middle East from the sixth to the early eighth century, but this was clearly before the period of the Hungarian invasions. Likewise, the Great Plague erupted in the mid-fourteenth century, approximately 150 years after the Mongols had begun invading Europe. Therefore, the spread of diseases from Central Asia can have had only a very indirect effect on elite numeracy at most. Another potential factor that could violate the exclusion restriction is the transfer of technological ideas from Central Asia to Europe, brought by the nomads. Again, we cannot find any obvious examples. As discussed earlier, the horse and bow were already widely used throughout Europe by the time of the first nomadic invasions, and military tactics such as the 'Parthian shot' had already been known in Europe for centuries.
In Table 5, we treat the three periods 800-1000, 1000-1200 and 1200-1400 CE separately and run the following instrumental variable specification, restricting our sample to each of the three periods in turn:

First stage:

$$\text{regicide}_{it} = \alpha + \beta_1\,\text{proximity}_{it} + \varepsilon_{it} \tag{2}$$

Second stage:

$$\text{elite human capital}_{it} = \alpha + \beta_1\,\widehat{\text{regicide}}_{it} + \beta_2\,\text{battle deaths}_{it} + \beta_k\,\psi_{it} + \varepsilon_{it} \tag{3}$$

where $\text{proximity}_{it}$ is the logged inverse distance to Central Asia, $\psi_{it}$ is a vector of control variables, $\alpha$ is a constant and $\varepsilon_{it}$ is an error term that captures the effects of any unobservables. Admittedly, the number of cases in each period is small, but this should bias the tests towards insignificance. Instrumented regicide exhibits a significantly negative effect on elite numeracy during the two invasion periods of the Hungarians and Mongols, circa 800-1000 CE and 1200-1400 CE, respectively. During the High Middle Ages, when no Central Asian invasions occurred, the relationship between elite numeracy and the invasions from Central Asia becomes insignificant. Although the absence of significance does not rule out the existence of a relationship, this result hints that our IV only influences elite numeracy through violence during the invasion periods. Additionally, this result disputes the possible criticism that our IV only captures the east-west development gradient of more modern times. As such, it provides tentative evidence (despite the small N) of a causal impact of elite violence on elite numeracy.
In Table 6, we pool all evidence on nomadic invasions from Central Asia in the periods 800-1000 and 1200-1400 as an instrument, including all explanatory variables that have been identified before, finding negative and significant coefficients for regicide. We again find a positive and significant coefficient for more participative political systems as well as our pasture variable, while we find a negative and significant coefficient for our crop variable.

Conclusion
In this study, we provide a 1400-year overview of elite numeracy in European history, using the share of rulers for whom a birth year was recorded as a new indicator. We carefully evaluate this measure, finding high correlations with other proxies for elite numeracy as well as dramatic shifts in elite numeracy throughout Europe.
The south-east was the first region to undergo transformation from a low elite numeracy state, led by the East Roman Empire (Fig. 8). Shortly afterwards, the south-west was slightly superior. All European regions displayed comparable rates of elite numeracy around the year 1000, while North-Western and Central Europe did not begin to exhibit their divergent patterns before the High Middle Ages. After this period, both the east and south-east entered into decline, and by 1400, a development path was firmly established that divided the east and the west of the continent. Iberia and Italy grew to similarly high levels as the north-west during the Renaissance period.
This study has substantially expanded our knowledge about elite numeracy, which Squicciarini and Voigtländer (2015) and Baten and van Zanden (2008) found to have had a strong impact on the little divergence in terms of European incomes. While Squicciarini and Voigtländer concentrated on French regions, Baten and van Zanden could only include Western Europe (before 1750) and had to leave the Eastern European landmass for later studies. In contrast, this study extends our knowledge to the Ural Mountains and the Caucasus. Moreover, the beginning of the little divergence in elite human capital can be traced back to the High Middle Ages, while European values were relatively similar before this period, with Europe's south-east and then its south-west leading in the earliest periods.
We also assessed a number of potential explanatory variables that might either determine or interact with elite numeracy. For example, the existence of a substantial Jewish minority is associated with greater elite numeracy; what we observe might be external human capital effects from Jews to the Christian elite. Finally, regions that specialised in cattle farming developed greater elite numeracy than grain-intensive regions, although this variable only becomes significant when both agricultural specialisations (cattle and crops) are included in our estimations simultaneously. A growing body of literature finds a relationship between agricultural specialisation in animal husbandry and the relatively strong economic position of women, which might also have influenced the upper tail of numeracy and human capital.
A consistent and significant negative correlation is observable with violence, both violence during battles and 'ordinary', interpersonal violence among the elite. We also employ a relatively exogenous import of violence from the Central Asian nomadic invasions of ca. 800-1000 and 1200-1400 as an instrumental variable because these invasions acted contagiously and motivated additional intra-European violence. Interestingly, Europe did not experience invasions from Central Asia during the High Middle Ages, and European numeracy did not follow any east-west pattern at this time (Fig. 16, panel b). By using the 'natural experiment' characteristics of the nomadic invasions, we observe causal effects from violence to elite numeracy. This is a crucial finding for understanding the divergence movement in Europe's developmental history.
Our research is related to a number of studies that focused on war as the basis of a state's capacity to tax, and Tilly et al.'s (1975) 'war-generates-states' hypothesis in particular (see, for example, Dincecco 2015; O'Brien 2011; Hoffman 2015). As our study finds that elite violence was rather a development hurdle during the medieval and early modern periods, a certain tension arises. How can these seemingly contrasting views be reconciled? Can we gain additional theoretical insights from this incongruity? We agree that state capacity had positive effects, in general, as Dincecco and Katz (2014) have shown. However, three facts were crucial. Firstly, wars might have been the trigger rather than the underlying reason for developing tax capabilities. The famous example of France's development of tax capacity during the Hundred Years' War first took place in a country that had already developed low elite violence and high elite numeracy in earlier periods, as we showed above, preparing a more serviceable environment for state capacity. The trigger of the devastating war with England convinced the French nobility that permanent taxation would be necessary, but this would not have been possible in another setting with a similarly devastating war, in Bulgaria during the thirteenth and fourteenth centuries, for example. Secondly, as tax-financed military expenditure increased the defensive abilities of states, they became able to avoid military conflicts on their own soil. For example, Britain did not experience many invasions after 1066 and most of its interstate conflicts were fought on foreign territory. Similarly, France had many military conflicts on German soil and in other countries between the Hundred Years' War, ending in the fifteenth century, and the late nineteenth century. The Netherlands mostly initiated maritime wars after building the capacity to tax during the sixteenth century. Hence, the general population of these states with high tax capacities arguably did not suffer as much from war, nor did the local elites. Thirdly, the changes in military technology that took place during the early modern period, emphasising gunpowder and the 'trace italienne' style of city fortification, required tax capacity, but they also protected both the general population and elites better than medieval styles of warfare ever did (Gennaioli and Voth 2015). For these three reasons, the results of our study expand and partly resolve the 'war-generates-state-which-then-allows-development' paradox.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix 1: Regional classifications
Since there are no universal standards for assigning countries to European sub-regions, some of our classifications may seem unorthodox. However, in these cases their allocations follow historical narratives. For example, some may suggest that Lithuania and Latvia be defined as Eastern European countries because of their shared histories with the Russian Empire and the Soviet Union, or else Central Europe because of their participation in the Kingdom of Prussia or the Polish-Lithuanian Commonwealth. However, being countries that were heavily influenced by Baltic trade and by the Swedish Empire in the seventeenth and eighteenth centuries, we assign them to Scandinavia as a compromise. Moreover, they exhibit trends that are more in line with Scandinavia than either Eastern or Central European countries. These include high rates of regicide in the High and late Middle Ages before exhibiting a sharp decline, as well as early development in elite numeracy (Table 7).

Appendix 2: Unit root tests
Although all of our regression specifications include time fixed effects, the presence of non-stationary series may mean that our regressions capture spurious relationships and invalidate our inferences. Since we have an unbalanced panel with gaps in certain individual time series, a unit root meta-analysis, such as a Fisher-type test, needs to be carried out. We use both the Augmented Dickey-Fuller and the Phillips-Perron tests before conducting our Fisher-type meta-analysis. Table 8 shows that, among our variables of interest, only elite numeracy and battle deaths display any kind of non-stationarity, and only with a lag of 200 years or longer. Since we use century fixed effects, unit roots should not have affected our results. Of course, variables like urbanisation rates are non-stationary by nature, but these are only used as control variables in this study.
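To illustrate how a Fisher-type meta-analysis combines unit root tests, the sketch below takes one p-value per country series (for example from an ADF or Phillips-Perron test) and forms the statistic P = -2 Σ ln p_i, which follows a chi-squared distribution with 2N degrees of freedom under the null hypothesis that every series contains a unit root. The function name and the example p-values are hypothetical; this is a minimal sketch of the combination step only, not the routine used to produce Table 8.

```python
import math

def fisher_combined(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p). The statistic is chi-squared with
    2 * len(p_values) degrees of freedom under the joint null.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    n = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom 2n the chi-squared survival function has
    # the closed form exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!.
    half = stat / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return stat, math.exp(-half) * total

# Hypothetical per-country p-values, for illustration only.
stat, p = fisher_combined([0.04, 0.20, 0.01, 0.35])
print(f"Fisher statistic = {stat:.3f}, combined p = {p:.4f}")
```

A small combined p-value rejects the null that all series are non-stationary; it does not show that every individual series is stationary.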

Appendix 3: Spatial regressions
As mentioned in the main text, while the results from our fixed effects specification provide a solid point of departure for our co-evolution hypothesis, we must acknowledge the role that spatial autocorrelation may have played. Kelly (2019) recently argued that many results in the persistence literature could have arisen from random spatial patterns and that the likelihood of this problem is greater if the effects of spatial autocorrelation are not controlled. Our study is less affected by this issue because our explanatory and dependent variables are coded for contemporaneous time units, but we still need to control for spatial autocorrelation. Spurious relationships may form due to numeracy or violence spillovers rather than as a result of truly economic interactions. Here, we make use of spatial econometric techniques, first formalised by Paelinck and Klaasen (1979), to combat these effects, which may be particularly important in our study because disparities in levels of development between Eastern and Western Europe could conceivably have driven our earlier results. We first constructed an inverse distance weighting matrix based on the coordinates of the geographic centroids of our geographical units from Donnelly (2012). In this way, our models control for spatial effects in a linear manner, with neighbouring countries having a greater weight than those further away, as opposed to only capturing the effects of immediate neighbours or using an alternative system with an unequal weighting mechanism that reflects historical characteristics, for example.
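The construction of an inverse distance weighting matrix can be sketched as follows. The example computes great-circle distances between centroids and sets each off-diagonal weight to the reciprocal of the distance, so nearer units receive larger weights. The centroid coordinates and unit names are illustrative placeholders rather than values from Donnelly (2012), and the optional row-normalisation is an assumption on our part, since the text does not state whether the matrix was normalised.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(min(1.0, a)))

def inverse_distance_weights(centroids, row_normalise=True):
    """Build W with w_ij = 1 / d_ij for i != j and w_ii = 0.

    `centroids` maps a unit name to (latitude, longitude). Coincident
    centroids are not handled and would raise ZeroDivisionError.
    """
    names = list(centroids)
    n = len(names)
    w = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i != j:
                w[i][j] = 1.0 / haversine_km(*centroids[a], *centroids[b])
        if row_normalise:  # assumption: each row rescaled to sum to one
            s = sum(w[i])
            if s > 0:
                w[i] = [v / s for v in w[i]]
    return names, w

# Placeholder centroids (latitude, longitude), for illustration only.
example = {"Unit A": (50.0, 0.0), "Unit B": (50.0, 10.0), "Unit C": (50.0, 40.0)}
names, W = inverse_distance_weights(example)
for name, row in zip(names, W):
    print(name, [round(v, 3) for v in row])
```

In the printed matrix, Unit A places more weight on Unit B than on the more distant Unit C, which is the linear distance decay described above.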
Because spatial methods require a weighting matrix to link each observation of the dependent variable to every contemporaneous observation from a different geographical unit's dependent and independent variables, they require strongly balanced panels. Unfortunately, as with most studies in social science, we do not have a perfectly balanced panel and must resort to an alternative strategy. This is a common problem in the spatial econometrics literature, with researchers either having to drop all panels with any missing data whatsoever or having to revert to imputation (for sources on multiple imputation in spatial econometrics, see Griffith and Paelinck 2011; Griffiths et al. 1989; Bihrmann and Ersbøll 2015; Stein 1999; LeSage and Pace 2004; Baker et al. 2014; among others).
To perform our imputation, we used Stata's mi command with its multivariate regression option, using this statistical simulation technique to effectively create 50 new data sets of predicted values for each panel. The following analysis is then performed on each simulated data set separately before the results are pooled using Rubin's Rule (1987).
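The pooling step can be sketched as follows. Under Rubin's rules, the pooled coefficient is the mean of the estimates from the m imputed data sets, and its total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and the numbers below are hypothetical (the study uses m = 50, whereas the example uses three values to stay readable), and the sketch covers a single coefficient only.

```python
from statistics import mean, variance

def rubin_pool(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: the m point estimates of the coefficient
    variances: the m squared standard errors of those estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 divisor)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total

# Hypothetical regicide coefficients from three imputed data sets.
est, var = rubin_pool([-0.70, -0.62, -0.75], [0.04, 0.05, 0.04])
print(f"pooled estimate = {est:.3f}, pooled s.e. = {var ** 0.5:.3f}")
```

The between-imputation term is what carries the extra uncertainty due to the missing values into the pooled standard error.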
According to Rubin (1987), these estimates afford valid inferences despite the increased sample size of the underlying analysis, provided that data are missing at random. Because the availability of our data improves over time and is itself associated with development in numeracy, as discussed above, we cannot make this claim. Therefore, before proceeding with our imputed spatial analysis, we first run the following models on the two panels where we have the most observations, 1300 and 1400 (Tables 9, 10), observing results that are remarkably analogous and lead us to believe in the validity of our imputed spatial results.
Our spatial analysis utilises the three simplest spatial econometric models: the spatial autoregressive model (SAR model; Eq. 4, Table 11), the spatially lagged X model (SLX model; Eq. 5, Table 12) and the spatial error model (SEM; Eq. 6, Table 11):

$$y_{it} = \rho W y_{it} + X_{it}\beta + a_i + u_{it} \tag{4}$$

$$y_{it} = X_{it}\beta + W X_{it}\Theta + a_i + u_{it} \tag{5}$$

$$y_{it} = X_{it}\beta + a_i + \varepsilon_{it}, \qquad \varepsilon_{it} = \lambda W \varepsilon_{it} + u_{it} \tag{6}$$

where $y_{it}$ is a vector for the elite numeracy variable in time period t; $X_{it}$ is a matrix of all time-varying regressors for time period t; $a_i$ is a vector of country fixed effects; $\varepsilon_{it}$ is a vector of spatially lagged errors; $u_{it}$ is a stochastic error term; $W$ is an inverse distance weighting matrix constructed using the coordinates of modern geographic country centroids; $\beta$ is a vector of ordinary regression coefficients; and $\rho$, $\Theta$ and $\lambda$ are coefficients of the spatial characteristics described below. The SAR model controls for the direct effect that variation in the dependent variable of other countries may have on country i (measured by ρ), i.e. the effect of elite numeracy spillovers from neighbours. Likewise, the SLX model controls for spillover effects from the independent variables of other countries (measured by Θ), such as the effect of neighbouring elite violence on elite numeracy in country i. Last, the SEM controls for any effect that unexplained variation from other countries may have on elite numeracy in country i (measured by λ), such as the effect of an omitted variable. While more complex models can be estimated, these often suffer from multicollinearity, or else fail to converge (Burkey 2017). Additionally, our estimates of ρ, Θ and λ from each of these simpler specifications indicate that spatial correlation is not very influential in our analysis (Tables 11, 12).

Table 9
Spatial regression without interpolation (cross section: 1300)
Numbers in bold text indicate coefficients that are significant at least at the 10% level. Although the regicide coefficients in the first five specifications are imprecisely measured due to a very small sample, the sign remains negative and the coefficient is nevertheless quite substantial. Standard errors clustered by country. Robust standard errors in parentheses. ***p < 0.01; **p < 0.05; *p < 0.1

Table 11
Spatial fixed effects regressions: spatial autoregressive (SAR) and spatial error (SEM) models
Dependent variable: Birth known. Numbers in bold text indicate coefficients that are significant at least at the 10% level. The reference category for institutional factors is hereditary succession; for 'second serfdom', it is the regions and periods not affected by the experience. Robust standard errors in parentheses. ***p < 0.01; **p < 0.05; *p < 0.1

Table 12
Spatial fixed effects regressions: spatially lagged X model (SLX)
Dependent variable: Birth known. Numbers in bold text indicate coefficients that are significant at least at the 10% level. The theta (Θ) columns indicate the coefficients for each spatially lagged independent variable. This shows that the spatial independent variable spillovers from other countries are insignificant, while the direct effect of the regressors from within countries can be interpreted as usual from the columns labelled slx. The reference category for institutional factors is hereditary succession; for 'second serfdom', it is the regions and periods not affected by the experience. Robust standard errors in parentheses. ***p < 0.01; **p < 0.05; *p < 0.1
Our results show similar coefficients for regicide and battles, although these are surprisingly somewhat larger (in absolute terms) than those from the fixed effects specification in Sect. 6.1 (Eq. 1, Table 4): between approximately −0.6 and −0.8 for regicide, and −0.75 to −0.9 for battles. Further, the coefficient for urbanisation is positive and significant, between 0.5 and 1.0, and while no other coefficients are significant in the SAR and SEM models, additional coefficients in the SLX model turn out significant. The SLX model shows a positive and significant coefficient of approximately 0.05 for more participative succession systems, while the coefficients for pasture and crop areas fall in line with the fixed effects results, although they are only approximately half as large. The regicide and battle coefficients may be larger, partially because none of the spatial models converged when time fixed effects were also included, leading to their unfortunate omission. However, in order to ensure that the omission of time dummies is not driving our results, we run all three spatial models in first differences, bringing our results more in line with those from the fixed effects specification from Eq. 1. Under first differences, each of the models yields regicide and battle coefficients that are approximately 30-40% smaller than under Eq. 1, while pasture and crop areas provide similar trends. In addition, the SLX model shows a negative and significant coefficient of approximately −0.15 for the second serfdom dummy.
Although the results from these spatial regressions provide undoubtedly interesting interpretations, they are remarkably similar to those from the fixed effects model (Eq. 1). Additionally, the Θ parameter is never significant, and the ρ and λ parameters are insignificant in all but a few specifications. This leads us to believe that despite limited evidence of dependent variable and error term spillovers across countries, spatial autocorrelation is not a notable source of endogeneity in this study (Tables 13, 14).

Table 13
Spatial fixed effects in first differences: spatial autoregressive (SAR) and spatial error (SEM) models
Dependent variable: ΔBirth known. Numbers in bold text indicate coefficients that are significant at least at the 10% level. The reference category for institutional factors is hereditary succession; for 'second serfdom', it is the regions and periods not affected by the experience. Robust standard errors in parentheses. ***p < 0.01; **p < 0.05; *p < 0.1

Table 14
Spatial fixed effects in first differences: spatially lagged X model (SLX)
Dependent variable: ΔBirth known. The theta (Θ) columns indicate the coefficients for each spatially lagged independent variable. This shows that the spatial independent variable spillovers from other countries are insignificant, while the direct effect of the regressors from within countries can be interpreted as usual from the columns labelled slx. The reference category for institutional factors is hereditary succession; for 'second serfdom', it is the regions and periods not affected by the experience. Robust standard errors in parentheses. ***p < 0.01; **p < 0.05; *p < 0.1

Appendix 4: Using predicted values
To test whether collinearity between our variables that could potentially alleviate bias (from Table 3) and our variables of interest has any effect on the relationships we obtained, we run a regression specification using predicted values for elite numeracy. We first regress elite numeracy on our variables that could potentially alleviate bias before regressing the predicted values from this regression on our variables of interest. Here, we see that our core results concerning elite violence, battle deaths, crop area and pasture area remain intact, and that no changes in signs or significance occur (Table 15).

Appendix 5: Changing the spatial unit of observation
Next, we implement another robustness test by changing our spatial unit of observation from modern countries to the broader regions specified in Table 7. Again, our key findings remain largely unaffected, although neither the pasture nor the crop variable is significant, while the second serfdom now has a negative and significant impact (Table 16).

Tables 18 and 19 show the first stage regressions for the IV regressions from Tables 5 and 6, respectively.

Appendix 9: Random effects specification with time-invariant factors
As a further robustness test, we also apply a random effects specification because, unlike fixed effects, it does not absorb time-invariant factors and so allows us to control for them explicitly. These controls first include variables concerning religion. Although religion is not perfectly time invariant, there are not many examples of major religious changes within European kingdoms that occur on a mass scale after the collapse of the Roman Empire. Major religious changes that occurred include the Great Schism between the Catholic and Orthodox Churches in the eleventh century, the Protestant Reformation, the spread of Islam under the Ottoman Empire, and the Arab-Berber conquest and Reconquista in Spain. We coded the majority religion using the ruler's religion from our regicide sources and the summaries of historical religion in the Encyclopaedia Britannica.
Our first additional variable for the random effects specification is an indicator of the most prominent religion in each country during each century: Islam, Orthodoxy, Protestantism, Catholicism (the reference group) and an 'other' category comprising Pagan, tribal and pre-Christian religions. This indicator variable was included to capture the effects of cultural characteristics that are associated with religion. We find similar levels of numeracy across Catholicism, Protestantism and Islam, with some evidence of lower levels for Orthodoxy and our 'other' category. Surprisingly, despite numerous results from previous literature, Protestantism is not associated with higher levels of numeracy (see Woessmann (2009, 2010) for an alternative expectation).

Fig. 19
Regicide versus nobilicide (nobilicide from battles). Note: Centuries are rounded up and abbreviated, i.e. 15 refers to the fifteenth century. Regional disaggregation follows Cummins (2017).
We also include a dummy for religious diversity (Baten and van Zanden 2008). This could either have had a positive effect on numeracy, perhaps via competition (stimulating book consumption, for example), or a negative effect via conflict through social fractionalisation (Easterly and Levine 1997). However, we find no evidence of an effect at all.
Our final religious variable is a dummy for the presence of a substantial Jewish minority, which we include because Jews were, on average, better educated than other religious groups among whom they lived. These data are from a combination of Anderson et al. (2017), Botticini and Eckstein (2012) and the Encyclopaedia Judaica. This dummy provides a positive and significant association with elite numeracy of approximately 7-13%.
The rest of our new controls for the random effects model are geographic and wholly time invariant. We use ruggedness because numerous studies have associated it with violence and lower economic development in a broader sense. For example, Mitton (2016) finds flatter landscapes to be associated with higher GDP per capita, while Bohara et al. (2006) and Idrobo et al. (2014) all describe different situations where rugged terrain provides advantages for instigators of violence. In contrast, Nunn and Puga (2012) describe how ruggedness protected parts of Africa from the adverse effect of the slave trade between 1400 and 1900. The ruggedness data that we use come from Nunn and Puga (2012). As spatial controls, we again include latitude and longitude for each country. Next, we use the percentage of each country that is covered by fertile soil and the percentage of each country that lies within 100 km of ice-free coast. Both variables come from Nunn and Puga (2012) and control for any additional agricultural effects or the effects that maritime trade may have had on elite numeracy, respectively.
The random effects regressions also show largely similar results to the initial fixed effects specification, although the sizes of the coefficients differ modestly. The coefficients for elite violence are approximately 10-20% smaller under random effects, whereas those for battle deaths are between 5% and 15% larger. These variables both remain consistently negative and significant across specifications. Likewise, the coefficients for pasture and crop areas are approximately 40% smaller, though this is somewhat due to multicollinearity after the inclusion of the soil fertility variable. The soil fertility variable is frequently significant at the 10% level, though it is negative like the crop area variable. The fertile soils of Southern and Eastern Europe were often used for grain production, whereas the less fertile Northern European soils were more often used for cattle farming. During later periods, higher elite numeracy developed in Northern Europe (Table 20).

Elite numeracy
In order to estimate elite numeracy, we employ the share of rulers for whom a birth year is reported in conventional biographical sources. We propose that for the birth year of a ruler to be entered into a kingdom's historical records, a certain level of numerical sophistication is required among the ruling elite. This evidence does not necessarily estimate the numerical ability of the rulers themselves but rather that of the government and bureaucratic elite around them and, by implication, the elites of the polity in general.
As more traditional indicators of education such as literacy rates, school enrolment, or age heaping-based numeracy are not available for most medieval European countries, only the 'known ruler birth year' proxy allows us to trace elite numeracy in periods and world regions for which no other indicators are available.
The data for the elite numeracy measure come from our regicide data set, which was initially built using the rulers found in Eisner's (2011) original regicide study, comprising 1513 rulers from across 45 kingdoms. We then strongly expanded this data set with an array of supplementary sources, chiefly Morby's (1989) 'Dynasties of the World' and Bosworth's (1996) 'The New Islamic Dynasties' as well as many other individual biographies and encyclopaedia entries. The expanded data set consists of 4066 rulers from 92 kingdoms across the period 500-1900 CE and comprises all of Europe (see Keywood and Baten 2018 for more details).
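The indicator itself is simple to compute, as the sketch below suggests: for each kingdom and century, it takes the share of rulers whose birth year is recorded. The record layout (the keys `kingdom`, `century` and `birth_year`) and the sample rulers are hypothetical, and the sketch assumes each ruler has already been assigned to a single century, which is a simplification of whatever rule the data set actually applies.

```python
from collections import defaultdict

def birth_known_share(rulers):
    """Share of rulers with a recorded birth year per (kingdom, century).

    Each ruler is a dict with keys 'kingdom', 'century' and 'birth_year',
    where 'birth_year' is None if no birth year is reported.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [known, total]
    for ruler in rulers:
        key = (ruler["kingdom"], ruler["century"])
        counts[key][1] += 1
        if ruler["birth_year"] is not None:
            counts[key][0] += 1
    return {key: known / total for key, (known, total) in counts.items()}

# Hypothetical rulers, for illustration only.
sample = [
    {"kingdom": "Kingdom A", "century": 9, "birth_year": None},
    {"kingdom": "Kingdom A", "century": 9, "birth_year": 845},
    {"kingdom": "Kingdom A", "century": 13, "birth_year": 1207},
    {"kingdom": "Kingdom B", "century": 9, "birth_year": None},
]
for key, share in sorted(birth_known_share(sample).items()):
    print(key, share)
```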

Elite violence
Elite violence could potentially be an important determinant of elite numeracy. If the risk of being killed were high, elite families would likely have substituted some of their children's education for military training or instruction in self-defence. Similarly, elites surrounding the ruler would have been selected based on criteria concerning strategic combat and defence rather than on sophisticated skills in negotiation and trade. Additionally, violence may have prevented students from travelling to educational facilities, and these institutions may even have been destroyed through violent acts.
We use the regicide rate as our indicator for elite violence after comparing evidence on regicide and homicide for a number of European countries for which Eisner (2014) presented early evidence of homicide. The data for the elite violence variable come from our regicide data set.

Battle violence
Battle violence provides information on civil wars and external military pressures on each kingdom, which may have affected elite numeracy through the destruction of educational infrastructure or lowered incentives to invest in elite numeracy due to lower life expectancy (Cummins 2017). Moreover, battle deaths and regicide are correlated, meaning that not including them as a control variable could lead to an overstatement of any effect of regicide on elite numeracy. Consequently, because we aim to use regicide as a proxy for interpersonal violence, we must differentiate between it and violence stemming from external sources. The data for the battle violence variable come from our regicide data set.

Urbanisation
Urbanisation rates are widely used in economic history literature, and act as a broad control variable for factors that could confound the relationship between elite violence and elite numeracy. They have also been employed as a proxy indicator for income among early societies in which other income proxy data are unavailable (Bosker et al. 2013; De Long and Shleifer 1993; Acemoglu et al. 2005; Nunn and Qian 2011; Cantoni 2015). Bosker et al. (2013) hypothesise that part of this relationship works through agricultural productivity because a productive agricultural sector is required to support a large urban centre, and urban areas cannot produce their own agricultural goods. We constructed our urbanisation variable using Bosker et al.'s (2013) estimates of urban populations and calculated urbanisation rates using McEvedy and Jones' (1978) measurements of country populations by century.

Institutional quality
We also introduce a measure of institutional quality as a potential determinant of elite numeracy. Our indicator is the mode of succession of rulers, as this captures a preference for the division of power and the willingness to forego executive decision-making in the interests of democracy. We use a three-category indicator to describe whether a ruler obtained their position through inheritance, partial election or full election by the nobility or a business aristocracy (as in Venice, for example). The differences in institutional quality between states, seen through modes of succession, are not as large as those between democracy and autocracy, of course, but evidence on democratic structures does not exist for the first centuries under study here. However, a preference for the division of power reduces the likelihood of unconstrained totalitarianism. We expect institutional quality to be positively correlated with elite numeracy. The data for the institutional quality variable come from our regicide data set.

Pastureland
Next, we use estimates of pastureland area from Goldewijk et al. (2017). We transform the variable to pastureland per square kilometre per capita. The first motivation for including this control is that pastureland provides nutritional advantages, and improved nutrition is known to have positive implications for human capital (Schultz 1997; Victora et al. 2008). Second, numerous studies have used pastureland and pastoral productivity as means of estimating female labour force participation, which is linked to female autonomy, gender inequality and, as a result, human capital and numeracy (Alesina et al. 2013; Voigtländer and Voth 2013; Baten et al. 2017). This mechanism functions through women's comparative physical disadvantage relative to men when ploughing fields and performing other tasks required when crop farming. Over time, this tendency developed into a social norm that saw men work in the fields while women took care of 'the home' (Alesina et al. 2013). However, when cattle and other domestic animals were present, their care became the task of women, boosting female labour participation and their contributions to household income, thereby increasing female autonomy and reducing gender inequality, and allowing women to develop skills in human capital and contribute to economic development (Diebolt and Perrin 2013).

Cropland
As a counterweight to the pastureland variable, we use cropland as a comparative indicator. Like pastureland, cropland should describe agricultural and nutritional development, but it should also emphasise gender inequality for the reasons given above. Therefore, its coefficient should be positive if nutrition, in terms of calories, is more important for elite numeracy, and negative if gender inequality is more important. The cropland variable is also transformed into per square kilometre per capita terms and comes from Goldewijk et al. (2017).

Second serfdom
We include a variable for the second serfdom to assess whether the inequality that it wrought had any impact on elite numeracy in Eastern Europe. This is coded as a dummy variable for all of Eastern Europe from the sixteenth until the eighteenth century and until the nineteenth century in Russia, where serfdom was only officially abolished under Tsar Alexander II in 1861.
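The coding rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the boolean region flags and the use of an integer century index (16 for the sixteenth century, and so on) are hypothetical and are not taken from the original data set, and the assignment of kingdoms to Eastern Europe is left to the caller.

```python
def second_serfdom(is_eastern_europe: bool, is_russia: bool, century: int) -> int:
    """Return 1 if the second-serfdom dummy applies, else 0.

    Assumed rule (from the text): Eastern Europe, 16th to 18th century;
    Russia, 16th to 19th century. The flags and century index are
    hypothetical conventions chosen for this sketch.
    """
    if not (is_eastern_europe or is_russia):
        return 0
    # Serfdom in Russia was only abolished in 1861, so the dummy
    # extends one century further there.
    last_century = 19 if is_russia else 18
    return int(16 <= century <= last_century)


# Example: a hypothetical Eastern European kingdom in the 17th century.
print(second_serfdom(True, False, 17))
```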

Nomadic invasions
We use the nomadic invasions of Europe from Central Asia as an instrument for elite violence because they resulted in an import of violence to Europe from outside. Additionally, nomadic invasions meet the exclusion restriction because their origins were determined by climatic forces, such as droughts in Central Asia (Bai and Kung 2011), and by military capacity. To estimate the impact of these invasions, we use the logged inverse distance of each kingdom's capital to Avarga, Mongolia, the location of the first capital of the Mongolian Empire.
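As an illustration of the instrument's construction, a logged inverse distance could be computed as in the minimal sketch below. Several details are assumptions rather than the procedure actually used: the distance is taken as the great-circle (haversine) distance in kilometres, the coordinates for Avarga are approximate, and the function names are hypothetical.

```python
import math

# Approximate coordinates of Avarga, Mongolia (assumption, for illustration only).
AVARGA_LAT, AVARGA_LON = 47.1, 109.2
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def log_inverse_distance(capital_lat: float, capital_lon: float) -> float:
    """log(1 / d), where d is the distance from a capital to Avarga in km."""
    d = haversine_km(capital_lat, capital_lon, AVARGA_LAT, AVARGA_LON)
    return math.log(1.0 / d)


# Capitals closer to Avarga receive a larger (less negative) value.
kyiv = log_inverse_distance(50.45, 30.52)
london = log_inverse_distance(51.51, -0.13)
print(kyiv > london)
```

Because log(1/d) equals -log(d), the measure declines monotonically with distance, so kingdoms nearer to Central Asia are assigned greater exposure to nomadic invasions.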

Length of reign
The next three variables are used to control for ruler-specific characteristics, labelled 'elite controls' in the text. First, rulers who spent more time on the throne could have better established themselves and their policies, giving chronologists more reason and more time to document their birth years. We control for this potentially biasing effect by including the length of the ruler's reign as a control variable. The data for the reign length variable come from our regicide data set.

Fame of ruler
Second, the birth years of more famous rulers might have been better recorded. It is conceivable that events in the lives of lesser rulers, who were placed under the suzerainty of an emperor, for example, would be less diligently documented. We can also control for this 'fame bias' to a certain extent by controlling for whether the rulers of each kingdom were always under the suzerainty of an overlord, whether this applies to a part of each period, or whether it was never the case. Rulers with a more dependent, governor-type function most likely attracted less attention from chronologists than those who had the freedom to act and set policy autonomously. The data for the ruler fame variable come from our regicide data set.

Power of ruler
We include the area of each kingdom in square kilometres as a third control variable against more famous or powerful rulers being better documented. Although not all powerful rulers held large territories, rulers of powerful kingdoms such as the Holy Roman Empire, the Ottoman Empire, Poland-Lithuania and the Kievan Rus certainly did. The data for the ruler power variable come from Nüssli (2010).

Religion
As an additional variable for the random effects specification, we use the most prominent religion in each country during each century: Islam, Orthodoxy, Protestantism, Catholicism (our reference group) and an 'other' category comprising pagan, tribal or pre-Christian religions. This indicator variable was included to capture the effects of cultural characteristics that are associated with religion. We coded the majority religion by using the ruler's religion from our regicide sources and the summaries of historical religion in the Encyclopaedia Britannica.

Religious diversity
We also include a dummy for religious diversity from Baten and van Zanden (2008). This could have either a positive effect on numeracy, for example via competition that stimulates book consumption, or a negative effect via conflict arising from social fractionalisation (Easterly and Levine 1997).

Jewish minority
Our final religious variable is a dummy for the presence of a substantial Jewish minority, which we include because Jews were, on average, better educated than