Human Nature, Volume 25, Issue 4, pp 517–537

Surnames and Social Mobility in England, 1170–2012

  • Gregory Clark
  • Neil Cummins


Using educational status in England from 1170 to 2012, we show that the rate of social mobility in any society can be estimated from knowledge of just two facts: the distribution over time of surnames in the society and the distribution of surnames among an elite or underclass. Such surname measures reveal that typical estimates of parent–child correlations in socioeconomic measures, in the range of 0.2–0.6, are misleading about rates of overall social mobility. Measuring educational status through Oxbridge attendance suggests a generalized intergenerational correlation in status in the range of 0.70–0.90. Social status is more strongly inherited even than height. This correlation is unchanged over centuries. Social mobility in England in 2012 was little greater than in preindustrial times. Thus there are indications of an underlying social physics surprisingly immune to government intervention.


Keywords: Social mobility · Intergenerational correlation · Status inheritance

Since the pioneering work of Francis Galton and Karl Pearson, there has been interest in how strongly children inherit parental characteristics—the “Laws of Inheritance” (Galton 1869, 1889; Pearson and Lee 1903). In this paper we tackle this issue afresh, using status information from surnames to estimate the intergenerational correlation of social status across multiple generations. The data we use are for educational status in England from 1170 to 2012, but similar results can be found for other measures of status such as wealth or occupation, and for other countries (see Clark and Cummins 2014b; Clark et al. 2014). By social status we mean the overall ranking of families across such aspects of status as education, income, wealth, occupation, and health.

Conventional estimates put the correlation between parents and children of the components of status at 0.3–0.5 in England, both in recent generations and in the nineteenth century (Atkinson et al. 1983; Corak 2012; Dearden et al. 1997; Ermisch et al. 2006; Harbury and Hitchens 1979; Hertz et al. 2007; Long 2013). The intergenerational correlations of income and education in England fall within the average of those observed internationally (Corak 2012; Hertz et al. 2007). These correlations imply rapid regression to the mean of family socioeconomic characteristics across generations. They also imply that parental characteristics explain only a quarter or less of the variance in child outcomes. These correlations have been assumed to represent overall social mobility rates. If social mobility is a first-order Markov process, the same across each generation, these intergenerational correlations imply that the expected status of most elite and disadvantaged families will converge to the societal mean within 3–5 generations. Class structure will not persist across many generations, at least in modern societies.

Here we estimate from surnames the intergenerational correlation of educational status in England over the course of the years 1170 to 2012, comprising 28 generations of 30 years each. Since the medieval period, surnames in England in any generation were mainly derived from inheritance. Thus, if family statuses quickly regress to the mean, so should average surname statuses. But surnames retain status strongly over many generations. For people grouped by surname, the persistence of educational status is much higher. Surnames reveal the intergenerational correlation of educational status in England to be in the range of 0.73–0.9. Measured in this way, educational status is more strongly inherited even than height.1 Initial status differences in surnames can persist for as many as 20–30 generations. Even more remarkable is the lack of a sign of any decline in status persistence across major institutional changes, such as the Industrial Revolution of the eighteenth century, the spread of universal schooling in the late nineteenth century, or the rise of the social democratic state in the twentieth century. Status persistence measured in this way is just as strong now as in the preindustrial era.

We postulate that the surname correlations are much higher than conventional estimates because families have an underlying social status that is changing slowly across generations in a first-order Markov process. That is, underlying status is regressing to the mean at a constant rate. In practice we observe aspects of status such as education, occupation, and income. These individual aspects of status are linked to underlying status with random components. A family of high underlying social status can for accidental reasons appear high or low in status in terms of individual aspects, such as education. As will be shown, the surname estimates measure the correlation of underlying social status across generations. Because of the random components, aspects of social status have lower intergenerational correlations than underlying social status and give biased estimates of true rates of underlying social mobility. An implication of the postulated structure of social mobility is that social mobility rates measured from surnames will be the same for any aspect of status. We show that the intergenerational correlation of wealth for surnames between 1830 and 1966 is indeed 0.78, similar to that for education.

Measuring Social Mobility Using Surnames

Conventional estimates of social mobility rates measure the correlation between individual parents and children of such aspects of social status as income, wealth, education, occupational status, longevity, or height. Specifically, if we measure an aspect of social status as a cardinal number y, where y is normalized to have a mean of zero and a constant variance across generations, then the intergenerational correlation of this measure of status will be obtained by estimating the value of β in the following equation:
$$ {y}_{it} = \upbeta {y}_{it-1} + {u}_{it} \tag{1} $$

where i indexes the family, t indexes the generation, and \( {u}_{it} \) is a random shock. β will typically lie between 0 and 1, with lower values of β implying more social mobility. β is thus the persistence rate for status, and 1 − β the social mobility rate. β also determines the share of the variance of each status measure in each generation that is explicable from inheritance. This share will be \( {\upbeta}^2 \).
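As a rough illustration of what a given β implies under Eq. (1), the sketch below computes the expected status of a family line after n generations, \( {\upbeta}^n {y}_0 \), and the number of generations needed for an initial advantage to shrink below a chosen threshold. The function names, the starting status of 2 standard deviations, and the 0.1 threshold are illustrative assumptions, not values taken from the paper.

```python
import math


def expected_status(y0: float, beta: float, n: int) -> float:
    """Expected status after n generations when E[y_t] = beta * y_{t-1}."""
    return y0 * beta ** n


def generations_to_threshold(y0: float, beta: float, threshold: float) -> int:
    """Smallest n with |y0| * beta**n <= threshold (assumes 0 < beta < 1)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if abs(y0) <= threshold:
        return 0
    return math.ceil(math.log(threshold / abs(y0)) / math.log(beta))


if __name__ == "__main__":
    # Hypothetical starting point: a family 2 SD above the mean.
    for beta in (0.3, 0.5, 0.75, 0.9):
        n = generations_to_threshold(2.0, beta, 0.1)
        print(f"beta={beta}: {n} generations to fall within 0.1 SD of the mean")
```

The point of the comparison is only qualitative: persistence rates in the conventional range erase an initial advantage within a handful of generations, whereas rates near 0.75–0.9 take far longer.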

The intergenerational correlation estimated using Eq. (1) requires that we know the link between individual parents and children. However, in England most people by 1300 had surnames, and these surnames were inherited mainly through the patriline.2 Men bearing the surname Bigge born in England in 1900–1929, for example, will mostly be descended from someone in the group of men bearing the surname Bigge in 1870–1899. Thus using surnames to group people we can identify groups of sons who collectively descended from a group of fathers, without knowing the exact descent relationships. Thus instead of estimating β from
$$ {y}_t = \upbeta {y}_{t-1} + {u}_t $$
we can use
$$ {\overline{y}}_{kt} = \upbeta {\overline{y}}_{kt-1} + {\overline{u}}_{kt} $$

where k indexes surname groups and \( \overline{y} \) indicates averages across surname groupings. We can, for example, compare the average status of everyone born with the surname Bigge in 1800–1829 with those born with this surname in 1830–1859. The 30-year interval between the time periods represents the assumed average length of a generation.

This averaging across surnames would be expected to produce an attenuated estimate of the β linking fathers and sons for several reasons. First, we have to take all those born with a class of surnames in a time interval (t, t + n) and compare them with those born in the time interval (t + 30, t + n + 30), the 30 years representing the average interval between generations. This introduces error in that some children of the generation born in the interval (t, t + n) will not be born in the interval (t + 30, t + 30 + n). And some of those born in the interval (t + 30, t + 30 + n) will have fathers not born in (t, t + n). Second, the surname method counts those in (t, t + n) who have no children equally with those who have large numbers of children. Third, there will potentially be some adopted children among the younger generation, as well as those who changed surnames from their birth surname. For all these reasons the surnames can provide only an imperfect estimate of the average of the actual parent–child status linkages. This imperfection should bias the surname estimates of status persistence toward zero.

However, when we make such estimates grouping people by surnames of initial average high or low status, we always find that the intergenerational correlation of status measured in this way is much higher than that measured at the level of individual families (Clark et al. 2014). For example, in England for 1858–2012 we have measures of wealth at death for a set of people with rare surnames who on average have high or low wealth in the first generation (1858–1887). In this case, because the surnames are rare, we can also establish many of the individual links between parents and children. The individual correlation averages 0.43 across these years. But the correlation of wealth at death across generations measured through groups of surnames is 0.74 (Clark and Cummins 2014b).

We reconcile this discrepancy in the following way. We must distinguish between measures of a family’s surficial or apparent social status and their deeper social competence, which is never observed directly.3 What is observed for families is their attainment on various partial indicators of social status: earnings, wealth, occupation, education, residence, health, longevity. Each of these derives from underlying status, but with a random component. Thus the proposed model of status transmission is
$$ {y}_t = {x}_t + {u}_t \tag{3} $$
$$ {x}_t=b{x}_{t-1}+{e}_t \tag{4} $$

where \( {x}_t \) is the family’s underlying social competence, \( {u}_t \) is the random component, and b is the intergenerational correlation of underlying status.

The random component of aspects of social status exists for two reasons. First, there is an element of luck in the status attained by individuals. Second, people sometimes sacrifice aspects of status, such as income and wealth, for other aspects, such as education or occupational prestige. University professors are a classic example of this trade-off.

The above implies that the conventional studies of social mobility, based on estimating the intergenerational correlation β in the relationship
$$ {y}_{t+1}=\upbeta {y}_t+{v}_t $$

for various partial measures of status (earnings, wealth, education, occupation, and so on) underestimate the true intergenerational correlation b that links underlying social status across generations. In particular, the expected value of the conventional estimate β is not the underlying b but instead θb, where \( \uptheta = \frac{\sigma_x^2}{\sigma_x^2 + {\sigma}_u^2} \) is less than one. Further, the greater the random components of any measured aspect of status, the smaller θ will be.
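The attenuation can be derived in a few lines from the model of status transmission. The derivation below is a sketch under assumptions not all stated explicitly in the text: the random components \( {u}_t \) are uncorrelated with underlying status, with \( {e}_t \), and across generations, and the variances \( \sigma_x^2 \) and \( \sigma_u^2 \) are constant over time.

$$ \mathrm{Cov}\left({y}_{t+1},{y}_t\right) = \mathrm{Cov}\left(b{x}_t+{e}_{t+1}+{u}_{t+1},\ {x}_t+{u}_t\right) = b\,\sigma_x^2 $$
$$ \mathrm{Var}\left({y}_t\right) = \sigma_x^2+\sigma_u^2 $$
$$ E\left(\widehat{\upbeta}\right) \approx \frac{\mathrm{Cov}\left({y}_{t+1},{y}_t\right)}{\mathrm{Var}\left({y}_t\right)} = b\,\frac{\sigma_x^2}{\sigma_x^2+\sigma_u^2} $$

The multiplier on b is the share of the variance of the measured aspect of status that comes from underlying status, so it falls as the random component grows.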

Equations (3) and (4) involve a number of strong simplifying assumptions. Equation (3) assumes, for example, that social mobility rates are the same across the whole of the status distribution, from top to bottom. But the empirical evidence is that this assumption is not too far from reality. Also, if (3) and (4) correctly describe the inheritance of social status in any society, then in the long run any measure of status will show a normal distribution across the population.

Since we have these two measures—b for underlying social mobility and β for partial measures of status—why is it that the underlying b is the true rate of social mobility? The reason is that if we were to measure families’ status by an average of the various observed aspects of status, \( {\overline{y}}_t \), then
$$ {\overline{y}}_t = {x}_t+{\overline{u}}_t $$

where \( {\overline{u}}_t \) indicates an average of the various random components. But as we average status across many aspects (earnings, wealth, residence, education, occupation, health, longevity) the average error component shrinks toward zero. Thus the intergenerational persistence of average measured social status lies somewhere between β and b. The underlying b potentially gives us a better measure of the persistence of status on average for families, as opposed to the persistence of any particular aspect of status. Also, if we want to predict the correlation of any particular measure of status over n generations, its expected value will be \( \uptheta {b}^n \).

When we consider the social mobility of large groups of people identified by race, religion, national origin, or even surnames (where whether a surname belongs to a high- or low-status surname grouping has been identified in some earlier generation), the measure b will unambiguously be the one that reveals their rate of social mobility. For now, at the group level, \( \overline{y}=\overline{x} \).

Now the \( \overline{y} \) accurately tracks \( \overline{x} \) without the intrusion of the errors, and we can correctly estimate underlying social mobility. When we look at such groups of individuals, the underlying, slow rate of social mobility becomes apparent even when we can observe only the usual partial indicators of underlying social competence. This is why the surname groups provide a measure of underlying rates of social mobility. But any grouping that is independent of the current random elements determining a partial measure of status will do the same. That is why it will always seem that racial, ethnic, and other minorities within societies experience slower than expected social mobility.
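A small simulation can make this argument concrete. The sketch below is not the authors' procedure. It assumes normally distributed components, unit variance of underlying status in both generations, and groups formed by ranking families on underlying status in the first generation (a stand-in for surname groups whose status was identified in an earlier generation, independently of the current random components). All parameter values (b = 0.75, a random component with standard deviation 1, 20,000 families, 20 groups) are hypothetical.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(b=0.75, sigma_u=1.0, families=20000, groups=20, seed=1):
    """Return (individual-level slope, group-level slope) for two generations."""
    rng = random.Random(seed)
    # Underlying status x has unit variance in both generations.
    x0 = sorted(rng.gauss(0.0, 1.0) for _ in range(families))
    sd_e = (1.0 - b * b) ** 0.5
    x1 = [b * x + rng.gauss(0.0, sd_e) for x in x0]
    # Observed status adds an independent random component each generation.
    y0 = [x + rng.gauss(0.0, sigma_u) for x in x0]
    y1 = [x + rng.gauss(0.0, sigma_u) for x in x1]
    beta_individual = ols_slope(y0, y1)
    # Groups are fixed by generation-0 underlying status, not by u, so the
    # random components average out within each group.
    size = families // groups
    g0 = [sum(y0[i:i + size]) / size for i in range(0, size * groups, size)]
    g1 = [sum(y1[i:i + size]) / size for i in range(0, size * groups, size)]
    b_group = ols_slope(g0, g1)
    return beta_individual, b_group


if __name__ == "__main__":
    beta_individual, b_group = simulate()
    print(f"individual-level slope: {beta_individual:.3f}")
    print(f"group-level slope:      {b_group:.3f}")
```

With these hypothetical parameters, half the variance of measured status is random, so the individual-level slope should come out near 0.375 while the group-level slope should come out near the underlying 0.75.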

Thus, even though here we examine only educational status in England from 1170 on, the estimates of social mobility will actually be those for the intergenerational correlation of underlying social status, as long as the assumptions of the model above hold.

An important feature of the data used in this paper is that we have for each generation not average educational attainment, but the fraction of each surname group in each generation from 1170 to 2012 that attended Oxford or Cambridge.4 We convert that into an estimated mean educational status by generation by surname group using the following assumptions.
  (a) Oxford and Cambridge represent the top x% of the educational status distribution, where x is measured in each cohort by the share of males in England and Wales who attend these universities. Before 1832 Oxford and Cambridge were the only English universities and, since that time, are the most selective in their admissions.

  (b) Educational status is normally distributed with constant variance.

  (c) Elite or lower-class surname groups have the same variance of educational status among their members as the population as a whole, but a higher (or lower) mean.

These assumptions imply that we can infer the mean status of any group in each generation from the relative share of the surnames of that group among Oxbridge students compared with the percentage among the population as a whole (Fig. 1). The key statistic we focus on is the relative representation of any group of surnames among the elite; this is given for surname group z as
$$ \mathrm{relative}\ \mathrm{representation}\ \mathrm{of}\ z = \frac{\mathrm{Share}\ \mathrm{of}\ z\ \mathrm{in}\ \mathrm{elite}\ \mathrm{group}}{\mathrm{Share}\ \mathrm{of}\ z\ \mathrm{in}\ \mathrm{general}\ \mathrm{population}} $$

Fig. 1

Regression to the mean of surname status. The strength of the intergenerational correlation in status, b, can be measured by the speed of decline of the overrepresentation of initial elite surnames among social elites

Given this relative representation, and an estimate of what population share constitutes the elite, we can estimate an implied mean status for each group given assumptions a–c listed above. To give a concrete example, in 1830–1859, 0.62% of all males reaching age 18 attended Oxford or Cambridge. A group of rare elite surnames constituted 11.9% of Oxbridge students but only 1.2% of the population in that cohort. This implies that these surnames had a relative representation of 10 among Oxbridge students. In turn this implies that the mean educational status of this group was 1.05 standard deviations above the social mean, in order to produce this level of overrepresentation.
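The calculation described above can be sketched as follows, under assumptions a–c. This is one reading of the procedure rather than the authors' own code, and the function names are hypothetical. It uses the fact that a group's own attendance rate equals the elite share times the group's relative representation, and then finds the shift in the group mean that produces that attendance rate under a unit-variance normal distribution.

```python
from statistics import NormalDist


def relative_representation(share_in_elite: float, share_in_population: float) -> float:
    """Share of the group among the elite divided by its population share."""
    return share_in_elite / share_in_population


def implied_mean_status(elite_share: float, rel_rep: float) -> float:
    """Mean status (in SD units) of a unit-variance group, given its relative
    representation in an elite defined as the top `elite_share` of a standard
    normal status distribution."""
    std_normal = NormalDist()
    # Fraction of the group itself that reaches the elite.
    group_attendance_rate = elite_share * rel_rep
    if not 0.0 < group_attendance_rate < 1.0:
        raise ValueError("implied group attendance rate must lie in (0, 1)")
    # Status cutoff above which someone belongs to the elite.
    cutoff = std_normal.inv_cdf(1.0 - elite_share)
    # Shift of the group mean that makes P(status > cutoff) equal the
    # group's attendance rate.
    return cutoff - std_normal.inv_cdf(1.0 - group_attendance_rate)


if __name__ == "__main__":
    # Rounded inputs quoted in the text for 1830-1859.
    rr = relative_representation(0.119, 0.012)
    print(f"relative representation: {rr:.2f}")
    print(f"implied mean status (SD): {implied_mean_status(0.0062, rr):.2f}")
```

Because the inputs here are the rounded figures quoted in the text, and the authors' exact procedure is not shown, the result should be close to, but need not exactly reproduce, the published value.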

Given the estimated mean educational status estimated in this way in each generation, we can estimate the intergenerational correlation of status from
$$ {\overline{y}}_{kt}={\overline{x}}_{kt} = b{\overline{y}}_{kt-1} = b{\overline{x}}_{kt-1} \tag{7} $$

For the example above, in the next generation, 1860–1889, the estimated mean educational status of the rare surnames declined to 0.76 standard deviations above the mean, which implies from Eq. (7) that the underlying intergenerational correlation of status is 0.76/1.05 = 0.72. Note that even though it is estimated from educational status, the estimated intergenerational correlation here is an estimate of the underlying rate of social mobility.

The Surname Data

To measure the average social status of surnames we use as an indicator the relative representation of surnames at Oxbridge in 1170–2012. For the average surname this ratio, the relative representation, will be 1. For high-status surnames it will be greater than 1, and for low-status surnames, less than 1.

We have information on the relative frequency of surnames in the population from 1538 to 2005 from a variety of sources, including censuses and records of births and marriages. These sources are listed in the Appendix.

We also have a database containing the surnames of students who attended Oxford and Cambridge between 1170 and 2012. For the years before 1500 the database includes the names of faculty as well as students. For Oxford in 2010–2012, the structure of the e-mail directory makes it impossible to exclude some faculty names. The incompleteness and informality of records at Oxford and Cambridge in earlier years, and the imperfect sources for later years, such as exam results lists, mean that the database is necessarily just a sample of those attending the universities.

Table 1 shows the people identified as attending Oxbridge in each generation, assumed to be 30 years. For earlier years, this is a sample of those attending the universities. From 1530 to 1892 this is a nearly complete list of all matriculating students; for 1892–2009 the data is once more just a sample of all attendees. The third column shows the estimated total numbers of students in each generation. For 1170–1469 the share attending Oxbridge is assumed to be 0.8% of each male cohort. This is similar to the shares observed for 1470–1499, and it is 4–5 times the observed shares prior to 1440. But the limited data for these years means that only a fraction of attendees were observed.5
Table 1

Surnames at Oxbridge, 1170–2012

Columns, by generation: Oxbridge students observed; Estimated total Oxbridge students; Assumed domestic share; Population students drawn from; Oxbridge cohort share (%)

The fourth data column in Table 1 is the number of those surviving to age 16 in each generation from which the student population was drawn. Before 1870 this population is assumed to be males only. Thereafter an increasing number of females attended the universities, and by 1990 all males and females aged 16 are assumed to be potential Oxbridge attendees.

In later generations, increasing numbers of Oxbridge students have been drawn from outside England and Wales. For 1980–2012 the Oxford University Gazette summarizes the fraction of students drawn from outside England and Wales (annual editions). Cambridge has similar statistics for 2000–2010 (Cambridge University Reporter).

Thus in 2012 only 62.3% of Oxford students were domiciled in England and Wales. In 2010 the equivalent figure for Cambridge was 61.9%. However, many students from outside England and Wales were drawn from populations that contained substantial numbers of immigrants from England and Wales: Scotland, Northern Ireland, the Republic of Ireland, the USA, Canada, Australia, New Zealand, and South Africa. These students constituted 14.4% of the Oxford student population in 2012. The equivalent figure for Cambridge in 2010 was 10.5%.

We thus took the “English” surname share at Oxbridge as 62% in 2010–2012, and 76% in 1980–2009. We project these foreign surname shares backward by measuring the share of typically German, Swedish, Dutch, Spanish, Italian, Chinese, and Indian surnames at Oxbridge in 1800–1979.

The final column in Table 1 shows the implied share of the eligible population attending Oxbridge. From 1470 to 2012 this has varied. At its peak in 1560–1589 it was 2.2%; at its minimum in 1890–1919 it was 0.5%.

A generation is taken to be 30 years. Some studies have assumed a generation as short as 20 years for preindustrial society. But in England from 1538 on the average woman had her first child at age 25 or later, and the average man at age 27 or later, so the average interval for a generation would be around 30 years. If the generation length is actually shorter, then true social mobility rates will be slower.

In England in 1300 surnames varied substantially in average social status. Surnames were first adopted by the upper classes. The Domesday Book of 1086 records surnames for many major landholders, these being mainly the Norman, Breton, and Flemish conquerors of England in 1066. Their surnames derived mainly from the home estates of these lords in Normandy. They have remained a distinctive class of surnames throughout English history. Many are still well known: Baskerville, Darcy, Mandeville, Montgomery, Neville, Percy, Punchard, and Talbot. “Norman” surnames were identified as a sample of the surnames of landlords in the Domesday Book identified by Keats-Rohan as deriving from place names in Normandy, Brittany, or Flanders (Keats-Rohan 1999). All possible derivations from these original surnames were included.

Another, later set of vintage high-status surnames consists of landholders listed in the Inquisitions Post Mortem (IPM) of 1236–1299 (Public Records Office 1904, 1906), which are enquiries into the successors of the feudal tenants of the king. Among these property owners were many with relatively rare surnames of more recent (post-1066) English origin, again mainly deriving from the location of their estates: Berkeley, Pakenham, etc. (“Relatively rare” in this case meant surnames held by fewer than 10,000 people in 1881.)

Lastly, locative surnames, those that identified a person by their place of origin, such as Atherton, Puttenham, and Beveridge, were typically of relatively high status in 1300. At the time of their creation, these locative surnames, such as Roger de Perton (later, Roger Perton), implied the possessor operated in the larger world outside the rural villages that dominated medieval life. They were merchants, traders, attorneys, priests, civil servants, and soldiers. Although such surnames must originally have been a modest share of all surnames, they now constitute at least a quarter of all surnames of English origin. Locative surnames were identified as those ending in -ton, -don, -dge, -ham, -land, -bury, and variants such as -tone, -tonn, -tonne, -tun. Hyphenated surnames containing a locative component were included only if the locative surname was the last component.
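The identification rule just described can be sketched as a simple suffix test. This is a minimal illustration, not the authors' code: the list of endings contains only those named in the text (the phrase "variants such as" suggests the actual list was longer), and the function name is hypothetical.

```python
# Endings named in the text; the authors' full list of variants is not given.
LOCATIVE_ENDINGS = ("ton", "don", "dge", "ham", "land", "bury",
                    "tone", "tonn", "tonne", "tun")


def is_locative(surname: str) -> bool:
    """True if the last hyphen-separated component has a locative ending."""
    last_component = surname.strip().lower().split("-")[-1]
    return last_component.endswith(LOCATIVE_ENDINGS)


if __name__ == "__main__":
    for name in ("Atherton", "Puttenham", "Beveridge", "Smith", "Smith-Perton"):
        print(name, is_locative(name))
```

A pure suffix rule will also catch surnames that merely happen to share an ending (a name such as Stone ends in -tone), so a rule like this would presumably need manual checking before use.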

Surname spelling was not standardized in England before the late eighteenth century. The modern Smith, for example, evolved from one of four medieval spellings (Smith, Smithe, Smyth, and Smythe) in the seventeenth and eighteenth centuries (Fig. 2). But surnames also mutated from their original forms when the earlier meaning was lost. This stems partly from elite surnames moving down the social ladder across generations because of social mobility, with the name eventually being borne by illiterates ignorant of the surname meaning. The occupational surname Arbalistarius, for example, recorded in the Domesday Book and derived from the Latin arcus (bow) and ballista (catapult), has no meaning to those without a Classical education. Thus it mutated into the modern forms Arblaster and Alabaster. So in looking at the frequencies of these medieval surnames across generations, we include spelling variants and derived surnames. From the nineteenth century onward, the English also created new surnames by compounding them. Thus we get Cave-Brown-Cave, Fox-Strangways, and even Baker-Baker and Lane-Fox-Pitt-Rivers. We include in the selected surnames any names derived by such compounding.
Fig. 2

“Smith” variants among marriages, 1538–1859. From the sixteenth century to the nineteenth, names became more homogeneous and standardized, as illustrated here by the disappearance of Smi(y)th(e) variants

The process of social mobility, however, means medieval high-status surnames lost most of their status information over generations. Long-established surnames that occurred frequently in the population by 1800 had average social status. For later periods we can, however, identify rarer surnames that just by chance had on average acquired high or low status. We thus form, for example, a sample of the rare surnames of the socially successful by selecting the names of those matriculating at Oxbridge 1800–1829 that were held by 40 or fewer people in the 1881 census. The surnames in this list appear similar in character and perceived status to those not on the list. Table 2 shows 24 randomly chosen surnames from the beginning of this list of surnames occurring 0–40 times in the 1881 census of surnames and occurring in the records for Oxbridge in 1800–1829, compared with 24 randomly chosen surnames not on the Oxbridge list. Such surnames by themselves would not help determine the social position of bearers. Also, high-status individuals were not selectively adopting these surnames as a more socially fitting appellation.
Table 2

Examples of rare Oxbridge versus non-Oxbridge surnames, 1800–1829

Social Mobility Rates, 1830–2012

We define elite surname groups in 1800–1829 by selecting rare surnames of predominantly English origin found at Oxbridge in 1800–1829.6 Depending on how rare the surnames are, we include progressively more elite surnames. Thus, the most-elite group used was surnames appearing 40 or fewer times in the 1881 census, and the least-elite group used was surnames found 500 or fewer times in the 1881 census.

To illustrate how we derive estimates of social mobility rates from these surname groups, Table 3 details how the intergenerational correlation of status, b, was estimated for the years 1830–2012 from the broadest sample, all surnames appearing at Oxbridge 1800–1829 for which 500 or fewer held the surname in the census of 1881.7 The share of the surnames at Oxbridge was calculated from the assumed share of the students at Oxbridge in each generation from England, as in Table 1, but with an allowance for some share of foreign students coming from countries such as New Zealand, where many surnames are of English origin. From the ratio of their share of Oxbridge graduates to their share of the population we get their relative representation in the Oxbridge elite.
Table 3

Calculating intergenerational correlation for the rare surnames (500 or fewer in the 1881 census)

Columns, by generation: Share of Oxbridge attendees (English surnames); Share of population; Relative representation; Oxbridge elite (%); Implied mean status; Implied b

We also know what share of each eligible cohort attended Oxbridge, which is assumed to be the top of the educational distribution. Given the relative representation, and the size of the Oxbridge elite, we calculate where the implied mean of the educational status of this group lies relative to the population, in standard deviation units. This is shown in the sixth column in Table 3. From this we can calculate a period-by-period implied b value, as is shown in the last column. Here the average b is 0.78.
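The period-by-period calculation follows directly from Eq. (7): each implied b is the ratio of a generation's mean status to that of the generation before. The sketch below assumes a list of implied mean status values in generation order; the function names are hypothetical, and the only figures used are the 1.05 and 0.76 standard deviations from the worked example earlier in the text.

```python
def implied_b_by_period(mean_status):
    """Ratios of successive generation means: b_t = ybar_t / ybar_{t-1}."""
    return [later / earlier
            for earlier, later in zip(mean_status, mean_status[1:])]


def average(values):
    """Simple unweighted mean, as used for the average b."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Worked example from the text: 1830-1859 and 1860-1889.
    ratios = implied_b_by_period([1.05, 0.76])
    print([round(r, 2) for r in ratios])
```

Ratios of this kind become unstable once the group mean approaches zero, since small errors in the mean then produce large swings in the implied b.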

But this simple average weights the observations in the early and later generations equally. Since the implied group mean of educational status is close to the social average, the estimates in later generations have less precision. An alternative procedure to estimate the overall b is to assume that, based on (7),
$$ {\overline{y}}_{t+n}={b}^n{\overline{y}}_t{\epsilon}_{t+n} $$
$$ \ln {\overline{y}}_{t+n} = \ln {\overline{y}}_t + n\ln b + \ln {\epsilon}_{t+n} $$
So just by estimating the coefficient h in the OLS best-fitting relationship
$$ \ln {\overline{y}}_{t+n} = g+hn $$
we can estimate the best-fitting b for the whole set of observations as \( b={e}^h \), assuming it has a constant value. The b estimated in this way is 0.73, with confidence bounds at the 5% significance level of (0.70, 0.76). This shows the estimate is relatively precise. As Fig. 3 shows, the \( {R}^2 \) of this fit (0.98) is good. As witnessed by rates of representation at Oxford and Cambridge, rates of social mobility in England are persistently low, with little variation between generations.
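The log-linear fit can be sketched in a few lines of plain Python. This is an illustration of the technique, not the authors' code: it regresses the log of mean status on the generation index by ordinary least squares and returns the exponentiated slope. By default it assumes equally spaced generations indexed 0, 1, 2, and so on; the optional `ns` argument (a hypothetical name) allows other spacings, such as a shorter final period.

```python
import math


def fit_constant_b(mean_status, ns=None):
    """OLS fit of ln(ybar_n) = g + h*n; returns (b, g, h) with b = exp(h)."""
    if ns is None:
        ns = list(range(len(mean_status)))
    if len(ns) != len(mean_status) or len(ns) < 2:
        raise ValueError("need at least two observations with matching indices")
    if any(y <= 0 for y in mean_status):
        raise ValueError("log fit needs strictly positive mean status values")
    logs = [math.log(y) for y in mean_status]
    n_bar = sum(ns) / len(ns)
    log_bar = sum(logs) / len(logs)
    sxy = sum((n - n_bar) * (l - log_bar) for n, l in zip(ns, logs))
    sxx = sum((n - n_bar) ** 2 for n in ns)
    h = sxy / sxx
    g = log_bar - h * n_bar
    return math.exp(h), g, h


if __name__ == "__main__":
    # Synthetic series decaying at exactly 0.73 per generation (not real data).
    series = [1.0 * 0.73 ** n for n in range(7)]
    b, g, h = fit_constant_b(series)
    print(f"recovered b = {b:.2f}")
```

Because the fit is in logarithms, it only applies while the group mean remains above the social mean; a group that has regressed to or below zero cannot be included.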
Fig. 3

Estimated mean status of all rare surnames, 1830–2012. The mean status is shown in standard deviations from the social mean, in logarithms. The dotted line shows the best fit for a constant b across these 6.5 generations, which is b = 0.73

Figure 3 shows the persistence of social status across generations using a large group of rarer surnames. Some surnames in this group begin with very high average social status, some with a social status only modestly above average. Does the pattern of social mobility vary across levels of social status? Is downward mobility the same for those closer to the social mean as for those from much more elite families? To investigate this we can divide the elite rare surnames of 1800–1829 into those found 0–40, 41–100, 101–200, 201–300, and 301–500 times in the 1881 census. The rarer the surname group, the higher the average educational status.
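The division into frequency bands can be sketched as below. The band boundaries are those given in the text; everything else is an assumption made for illustration, including the function names, the input format (a mapping from surname to 1881 census count), and the treatment of surnames absent from the census as having a count of zero.

```python
import bisect

# Upper bounds and labels of the 1881 census frequency bands named in the text.
BAND_UPPER_BOUNDS = (40, 100, 200, 300, 500)
BAND_LABELS = ("0-40", "41-100", "101-200", "201-300", "301-500")


def frequency_band(holders_1881: int):
    """Band label for a surname's 1881 census count, or None if above 500."""
    if holders_1881 < 0:
        raise ValueError("count cannot be negative")
    idx = bisect.bisect_left(BAND_UPPER_BOUNDS, holders_1881)
    return BAND_LABELS[idx] if idx < len(BAND_LABELS) else None


def group_by_band(census_counts, oxbridge_surnames):
    """Assign each Oxbridge surname to its band; common surnames are dropped."""
    bands = {label: [] for label in BAND_LABELS}
    for name in sorted(oxbridge_surnames):
        # Assumption: a surname missing from the census has zero holders.
        band = frequency_band(census_counts.get(name, 0))
        if band is not None:
            bands[band].append(name)
    return bands
```

Each band can then be tracked separately across generations, which is what allows mobility rates to be compared between groups starting at different distances from the social mean.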

Figure 4 shows the estimated mean status of these surnames for thirty-year student generations 1830–1859, …, 1980–2009, and 2010–2012. As expected, the rarer the surname, the higher the implied average status in the first period, 1830–1859. All surname groups show a steady regression toward mean social status. But four things stand out.
Fig. 4

Estimated mean status of surname groupings, Oxbridge, 1830–2012. Status is measured in terms of where (how many standard deviations above the mean) each group lies in terms of social status. The slope of each line measures social mobility rates

First, the rate of regression to the mean is very slow for all these surname groupings. The last period shows a lot of noise because of the smaller number of observations and the fact that some groups by then have regressed close to the mean. So Table 4 shows estimates of b for the periods 1830–2009. For all these groups the b is much higher than conventionally estimated. This means that even in 1980–2009, 150 years later, all these surname groups have a statistically significantly higher than average representation among Oxbridge students. Social status persists strongly.
Table 4

b estimates, 1830–2009

Columns, by surname group (high status and low status): Surname holders, 1881; b 5% CI; Relative population share, 2010 vs. 1880

* Based on the frequency of surnames in the 1881 census

There is a tendency for the higher-status surnames to show slower rates of regression to the mean. But this may be an artifact of how the relative representation of these surnames at Oxbridge is calculated. The expected baseline representation has to be estimated (because of the increasing presence of foreign students at Oxbridge in recent generations). If estimated too high, it will bias the measured rate of social mobility upward more for those surnames closer to the social average in their status. Certainly the relative social status of the five groups of surnames changes little over these 150 years.

The second striking feature is that the implied intergenerational correlation of status seems constant in 1830–2012. Social mobility does not increase with the emergence after the Industrial Revolution of modern social institutions, such as public education, mass democracy, and redistributive taxation.

The third striking feature is that social mobility seems to be a Markov process. The average status of the next generation depends only on that of the current generation, not on earlier history. These last two features are what led to the simple model of status transmission posited above in Eq. (3) and (4).
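
What a first-order Markov process implies over many generations can be sketched numerically. This is an illustration only: the values 0.4 and 0.8 are picked from inside the conventional and surname-based ranges discussed in the text, the starting status of 2 standard deviations is arbitrary, and both function names are hypothetical.

```python
def expected_status(initial_status, b, generations):
    """Expected mean status after `generations` steps of x_{t+1} = b * x_t + e_t,
    assuming the shocks e_t have mean zero."""
    return initial_status * b ** generations


def generations_to_within(initial_status, b, threshold):
    """Smallest number of generations until the expected status is within
    `threshold` SD of the mean. Requires 0 <= b < 1 and threshold > 0."""
    if not 0.0 <= b < 1.0 or threshold <= 0.0:
        raise ValueError("need 0 <= b < 1 and threshold > 0")
    status, n = abs(initial_status), 0
    while status > threshold:
        status *= b
        n += 1
    return n


if __name__ == "__main__":
    for b in (0.4, 0.8):  # illustrative values, not estimates
        after_five = round(expected_status(2.0, b, 5), 4)
        wait = generations_to_within(2.0, b, 0.1)
        print(f"b={b}: status after 5 generations={after_five}, "
              f"generations to reach 0.1 SD={wait}")
```

Because only the current generation matters, the whole trajectory is governed by the single parameter b, which is why a straight line on a log scale is the signature of this process.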

A simple law of social mobility, \( x_{ijt+1} = b x_{ijt} + e_{ijt} \), seems to operate largely independently of the social institutions of the society. In England between 1830 and 2012 public provision of education expanded greatly. England was a laggard in northern Europe in public provision of education. Publicly provided education was only introduced in 1870, and education to age 10 did not become compulsory until 1880. The age at which children could leave school was raised to 11 in 1893, to 14 in 1918, and to 15 in 1944.

Local schools, however, played little role in Oxbridge entry in earlier years. Entry to Oxbridge was limited by a number of barriers to lower-class students before the 1980s. Oxbridge had its own special entrance exams until 1986. Until 1940, for example, the entry exams for Oxford included a test in Latin. Preparation for these exams was a specialty of a small number of elite secondary schools in England, many of them private, fee-paying institutions. In 1900–1913 nine schools, including Eton, Harrow, and Rugby, supplied 28% of Oxford students (Greenstein 1994). Only in the 1980s did the entry process equalize opportunities to students from all secondary schools.

Another barrier for lower-class students was the lack before 1902 of public support for university education. Oxbridge supplied some financial support, but most scholarships went to students from the elite schools that prepared them to excel in the scholarship exams. From 1920 to the 1980s, state support for secondary and university education greatly expanded.

We would thus expect more regression to the mean for elite surname frequencies at Oxbridge in the student generations of 1950 and later. There is no evidence of this in Figs. 3 and 4. The earlier surname elite persisted just as tenaciously after 1950 as before.

In the preceding analyses, we observed only downward mobility. Another class of surnames consists of those that do not appear at Oxbridge in 1800–1829. For a very rare surname, not appearing at the university in this window reveals little about its average educational status. But for more common surnames, having not even one holder at Oxbridge implies low average educational status.

We thus form a group of surnames held by 2001–5000 people in 1881 that did not appear at Oxbridge 1800–1829. In 1830–1859 these names had a relative representation at Oxbridge of only about one third of the average. Even by 2010–2012 these names had a relative representation of only 0.94. Figure 5 shows the path to the average of these names. Again there is an implied constant rate of regression to the mean across the generations, though with a somewhat lower estimated b of 0.62. Again, however, the estimates of social mobility rates are most robust the farther the social status of those surnames is from the average. And these surnames deviate only modestly from mean social status.
Fig. 5

Regression to the mean of low-status surnames, 1830–2012. Groups of surnames that appear at low rates at Oxbridge rise in status over time in a process that is close to a mirror image of the high-status groups. Both high- and low-status groups regress to the mean at slow rates
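
A constant rate of regression to the mean can be recovered from a series of group mean statuses with a simple log-linear fit. The sketch below is one possible estimator, not necessarily the fitting procedure used for the figures; the function name is hypothetical and the example series is synthetic, generated with b = 0.62 purely to check that the fit returns it.

```python
import math


def estimate_b(mean_status_by_generation):
    """Fit |mean status| = a * b**t by ordinary least squares on logs and return b.

    Taking absolute values lets the same code handle groups above the mean
    (positive status) and below it (negative status). All entries must be
    nonzero, and the list is assumed to be ordered by generation.
    """
    if len(mean_status_by_generation) < 2:
        raise ValueError("need at least two generations")
    if any(m == 0 for m in mean_status_by_generation):
        raise ValueError("mean status of exactly zero cannot be log-transformed")
    logs = [math.log(abs(m)) for m in mean_status_by_generation]
    n = len(logs)
    t_bar = (n - 1) / 2
    y_bar = sum(logs) / n
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in enumerate(logs))
             / sum((t - t_bar) ** 2 for t in range(n)))
    return math.exp(slope)  # slope of log status on generation index is ln(b)


if __name__ == "__main__":
    synthetic_low_status = [-(0.62 ** t) for t in range(6)]  # made-up series
    print(round(estimate_b(synthetic_low_status), 3))
```

As the text notes, such a fit becomes fragile once the group means are close to zero, because small errors in the estimated means are then large in log terms.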

A fourth feature that emerges from Table 4 is that elite surnames have been in relative population decline since 1880. The more elite the name, the greater the decline. Fertility was lower for upper-class families, particularly in 1880–1960. Did upper-level groups maintain their social position by greater family limitation, and consequent greater child investments, than lower-class families? However, the persistence of elite surnames is as strong in 1830–1889, when fertility was as high for social elites as for the lower classes (Clark and Cummins 2014c). Again, changes in the correlation of fertility with social class have no effect on mobility rates.

Social Mobility, 1170–1800

We can estimate surname shares at Oxbridge back to 1170 for the three medieval elite surname groups. To estimate b we also need the population share for each surname. We estimate these from marriage records for 1538–1800. In preindustrial England, elite surnames tended to increase in population share over time as a result of the greater fertility of wealthier families (Clark and Cummins 2014c; Clark and Hamilton 2006). For 1170–1537 we thus project the surname share backward from that of 1538–1559. We assume the same average percentage change by generation as from 1560–1589 to 1650–1689. As Table 5 shows, the population share of these surnames increased between 1560 and 1680. So we are projecting a smaller share for 1290 than for 1530. That projection may be high or low, creating greater uncertainty about the earlier mobility estimates.
Table 5

Population share by surname type. Columns: population share; Locative (%); IPM (%); Norman (%). [Table values not preserved in this copy.]

a Number in parentheses is projected population share based on the rate of growth of the share in 1560–1680
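
The backward projection of population shares can be sketched as follows. This is a hedged illustration: it reads "the same average percentage change by generation" as a constant compound growth rate, which is an assumption, and the shares in the example are invented rather than taken from Table 5. Both function names are hypothetical.

```python
def per_generation_growth(share_start, share_end, generations):
    """Constant compound growth rate per generation that carries share_start
    to share_end over the given number of generations."""
    if share_start <= 0 or share_end <= 0 or generations <= 0:
        raise ValueError("shares and generations must be positive")
    return (share_end / share_start) ** (1.0 / generations) - 1.0


def project_share_backward(anchor_share, growth, generations_back):
    """Share implied `generations_back` generations before the anchor period,
    assuming the same compound growth rate applied throughout."""
    return anchor_share / (1.0 + growth) ** generations_back


if __name__ == "__main__":
    # Invented shares (in %), observed four generations apart.
    growth = per_generation_growth(1.00, 1.20, 4)
    for back in (1, 4, 8):
        print(back, round(project_share_backward(1.00, growth, back), 4))
```

With a positive growth rate the projected share shrinks the farther back the projection runs, which is the pattern described for the elite surnames before 1538.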

Candidate surnames on these lists that showed an unusual increase in frequency between 1881 and 2002, and where the surname was of foreign origin, including in this case Scottish and Irish surnames, were excluded. The aim was to have a set of surnames for which most of the holders in England and Wales in 2012 descended from the holders of 1800–1829.
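
The exclusion rule can be written as a small filter. The thresholds below follow the note on rare surname samples (prefixes "Mc", "Mac", and "O'", and growth of more than 2.5 times for names with more than 40 holders in 1881); the function name, constant names, and example surnames and counts are made up for illustration.

```python
# Thresholds as described in the note on rare surname samples.
EXCLUDED_PREFIXES = ("mc", "mac", "o'", "o\u2019")  # plain and curly apostrophe
MIN_HOLDERS_FOR_GROWTH_RULE = 40
MAX_GROWTH_RATIO = 2.5


def keep_surname(name, holders_1881, holders_2002):
    """Return True if the surname stays in the sample under the two rules."""
    lowered = name.strip().casefold()
    if lowered.startswith(EXCLUDED_PREFIXES):
        return False  # treated as Scottish or Irish in origin
    if (holders_1881 > MIN_HOLDERS_FOR_GROWTH_RULE
            and holders_2002 > MAX_GROWTH_RATIO * holders_1881):
        return False  # unusual growth suggests later immigration
    return True


if __name__ == "__main__":
    # Hypothetical surnames and counts, not data from the census.
    candidates = [("McExample", 10, 12), ("Examplethorpe", 100, 300),
                  ("Samplewick", 100, 250), ("Rarename", 30, 300)]
    print([name for name, old, new in candidates if keep_surname(name, old, new)])
```

A prefix rule of this kind is deliberately blunt: it also drops English names that happen to begin with "Mac", trading a little sample size for a cleaner line of descent.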

For the period 1830–2012, population shares of surname groups for the rare surnames of 1800–1829 were estimated for four benchmark periods: 1837–1857, 1877–1897, 1965–1985, and 1985–1995. The 1837–1857 and 1877–1897 benchmarks were estimated from the national register of marriages for these years, since child mortality was still significant and differed by social class. The 1965–1985 and 1985–1995 benchmarks came from the birth register. The population share for 1830–1859 for Oxbridge was taken as the 1837–1857 benchmark, and that for 1860–1919 from the 1877–1897 benchmark. The population share in 1980–2009 came from the 1965–1985 benchmark, and for 2010–2012 from the 1985–1995 benchmark. Population shares for 1920–1979 were linearly interpolated from the shares in 1877–1897 and 1965–1985.
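
The interpolation step for 1920–1979 can be sketched as below. The text does not say which dates anchor the interpolation, so this example assumes the midpoints of the two benchmark periods (1887 and 1975); the shares are invented and the function name is hypothetical.

```python
def interpolate_share(year, year0, share0, year1, share1):
    """Linearly interpolate a population share between two benchmark years."""
    if not year0 <= year <= year1:
        raise ValueError("year must lie between the two benchmark years")
    weight = (year - year0) / (year1 - year0)
    return share0 + weight * (share1 - share0)


if __name__ == "__main__":
    # Assumed anchors: midpoints of the 1877-1897 and 1965-1985 benchmarks.
    # The shares (in %) are invented for illustration.
    early_year, early_share = 1887, 0.10
    late_year, late_share = 1975, 0.06
    for generation_midpoint in (1935, 1965):  # midpoints of 1920-1949, 1950-1979
        share = interpolate_share(generation_midpoint, early_year, early_share,
                                  late_year, late_share)
        print(generation_midpoint, round(share, 4))
```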

For the earlier surname elites, population shares for 1560–1589, 1680–1719, and 1770–1799 were estimated from parish marriage records as recorded in the International Genealogical Index (IGI 2013). For 1881 the share was estimated from the census (Schurer and Woollard 2000). For 2002 the share was derived from the Office of National Statistics (2002) database of surname frequencies in England and Wales. Population shares were linearly interpolated between these dates. Table 5 shows the resulting implied shares for the medieval surname elites.

Figure 6 shows the estimated relative representation of a set of locative surnames: those ending in “ton,” “ham,” “dge,” “bury,” “land,” and derivatives. At their peak these names represented 7.1% of all English surnames. These surnames rose in relative representation from 1170 to their peak in 1290–1319, when they were five times as common among Oxbridge attendees as in the general population. That representation has declined to the present, and it was within 10% of their population share by 1860–1889.
Fig. 6

Estimates of b, 1170–2012. Relative representation at Oxbridge of Norman surnames of landowners in the Domesday Book (Norman), Inquisition Post Mortem (IPM), and those of locative origin. The suggested underlying intergenerational mobility is consistent with the analysis of the rare names from 1800 to 1829

Assuming a constant intergenerational status correlation for 1290–2012, the best fitting b is 0.83. This is remarkable status persistence by modern standards. Remarkable again is the stability of b across different social eras. It is the same in the Middle Ages, when the universities were dominated by the Catholic Church, as after the English Reformation of 1534–1558, when a new, Protestant theology prevailed. There is no sign of enhanced mobility in the Industrial Revolution era of 1760–1860 despite the rise of new industries, and new wealth. For the modern period, mobility may be greater, but these names are so close to average status by 1860 that we cannot measure this.

A more elite set of medieval surnames is identified from a sample of the rarer surnames held by men who died in 1236–1299 and whose estates were subject to an Inquisition Post Mortem (IPM). Though identified purely through their wealth, these surnames peak in their relative representation at Oxbridge at the same time, in the years 1230–1259, when they are 30 times as common at the universities as their population share. Again, one b fits the IPM group 1230–2012 reasonably well, as Fig. 6 shows, though this one is even higher at 0.90. These surnames are still statistically significantly overrepresented at Oxbridge as recently as 1980–2009, 750 years after their peak.

Figure 6 suggests that b for the IPM surnames may be lower for 1800–2012. Estimated just for these years it is 0.81, which is still higher than the intergenerational correlation estimated for rare surnames at Oxbridge for 1830–2012. However, the IPM surnames declined in relative population share less than expected for elite surnames in 1880–2012. Possibly there has been adoption of these surnames by upwardly mobile families because of their elite connotations. Such adoption by entrants to the elite would slow the measured rate of social mobility. This suggests the more status-neutral locative surnames likely give better estimates of the true rates of social mobility before 1800.

The Norman surname sample shows even stronger persistence. These surnames persisted so strongly at Oxbridge, with a b of 0.93, that even in 2010–2012 they are statistically significantly overrepresented. Again there is sign of less persistence post-1800, with a b of 0.82. Once more, however, there is an unexpected maintenance of population share for these surnames in 1880–2012 (Table 5). Locative surnames’ population share declined 20% over this interval, but Norman surnames declined only 6%. Selective adoption of these surnames by entrants to the elite may have maintained the status of the surnames more than the status of the actual descendants of the original bearers. Again, the more status-neutral locative surnames likely indicate the true rates of persistence in England from 1300 to 1800.

Overall the rate of regression to the mean of these elite surnames suggests that there has been modest improvement in social mobility rates between the medieval era and the modern world, with that change occurring around 1800. But what is remarkable in both periods is the very high implied intergenerational correlation: 0.73 since 1800, 0.83 before 1800.

Why Are Social Mobility Rates So Low?

We can dismiss a couple of possible reconciliations of the high intergenerational correlation of status from surnames with conventional estimates. One is that the high degree of persistence applies only to the most elite families, with most families displaying higher rates of educational mobility. Another is a special barrier to entry in Oxbridge: families and their descendants belonged to an Oxbridge “club.”

The evidence that there is nothing special about the persistence of high-status families, or about Oxbridge as a measure of general status, appears when we look at another, more democratic measure of status—the fraction of people whose estates were probated at death. We used the 1858–2012 data from the national probate register for England. But only a fraction of the population, representing just those estates above a minimum value, was legally obliged to be probated. The fraction of all adults whose estates were probated at death was thus 15% in 1858–1889, rising to 47% by 1950–1966. When we measure wealth mobility using the fraction of surnames of a given type that appear in the probate records we thus measure mobility across a large share of the wealth distribution. If social mobility rates are higher outside elite families, the intergenerational correlation derived from probate records will be lower. If entry to Oxbridge is unusually persistent compared with less “clubby” measures of status, such as wealth, again the wealth correlation will be lower.

Probate frequencies for rare surnames in 1858–1966 were calculated from the Calendar of the Principal Probate Registry. Probate frequencies for the years 1830–1857 were obtained from the Indexes of Wills and Administrative Grants of the Prerogative Court of the Archbishop of Canterbury (Public Record Office). The share of deaths in each generation from the rare surname group was taken to be the same as the share of the population reported in Table 3.

Figure 7 graphs the estimated mean social status of rare surnames (500 or fewer holders of that name in 1881) found at Oxbridge in 1800–1829, based on their relative representation in probate records from 1830 to 1966. For 1858–1966 these are national probate records. For 1830–1857 these are the estates probated in the Canterbury court, which under the ecclesiastical probate system represented the richest estates, with about 4% of all adult males’ estates probated here. People who died in 1830–1858 would include many from the generation who attended Oxbridge in 1800–1829, since life expectancy at 25 in England was then 30 additional years. Figure 7 also shows the best-fitting intergenerational correlation for these five generations. That b is 0.81, and once again shows remarkable stability across these generations. The \( R^2 \) of the fit is 0.96. In a related paper using similar methods and the Canterbury Prerogative Court records from 1710 to 1858 we show that the implied b for wealth mobility in Industrial Revolution England is 0.77–0.82, no higher than for the modern era (Clark and Cummins 2014a).
Fig. 7

Mobility measured by relative probate frequencies, Oxbridge elite in 1800–1829. The group of rare surnames represented at Oxbridge in 1800–1829 regresses to the mean in terms of wealth, as measured by probate frequency, at a rate similar to that seen with educational status

This wealth b of 0.81 shows that the remarkable status persistence found using Oxbridge attendance as the status measure is just as strong for a more general and democratic measure of status such as asset ownership. There is no special persistence at Oxbridge, or in education, or only in the upper reaches of status. The high and stable wealth b shows once again the remarkable irrelevance of institutions to social mobility. Over these generations there were substantial increases in the rate of taxation of wealth and income, especially after 1910. Yet this did nothing to increase rates of wealth mobility (Clark and Cummins 2014b).

The similar magnitude of the estimated b for educational status and for wealth is consistent with the hypothesis that a deeper, latent social status of families correlates much more highly across generations than any individual status component. This implies also that if we find surname groupings with high status on any aspect of social status at one time, they will be equivalently high status on any other measure of social status. What is being measured in this way is generalized social mobility.
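
One way to make this hypothesis concrete is the following sketch. It assumes, as an illustration rather than a restatement of the paper's equations, that an observed status component \( y_t \) equals latent status \( x_t \) plus an independent random component \( u_t \) with constant variance, and that latent status follows the simple law of mobility with persistence b. The symbols \( u_t \), \( \theta \), and \( \rho_y \) are introduced here for the example.

\[
y_t = x_t + u_t, \qquad x_{t+1} = b\,x_t + e_t
\]

\[
\operatorname{Cov}(y_{t+1}, y_t) = \operatorname{Cov}(b\,x_t + e_t + u_{t+1},\; x_t + u_t) = b\,\sigma_x^2
\]

\[
\rho_y = \frac{b\,\sigma_x^2}{\sigma_x^2 + \sigma_u^2} = \theta\, b,
\qquad \theta = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2} \le 1
\]

Under these assumptions, b = 0.8 with \( \theta = 0.5 \) would give a parent-child correlation of 0.4 for any single component, inside the conventional range, while averages over surname groups, in which the \( u_t \) largely cancel, would still regress to the mean at the rate b.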

The relative constancy of the intergenerational correlation of underlying social status across very different social environments in England from 1800 to 2012 suggests that it stems from the nature of inheritance of characteristics within families. Strong forces of familial culture, social connections, and genetics must connect the generations. There really are quasi-physical “Laws of Inheritance.” This interpretation is reinforced by the finding of Clark et al. (2014) that all societies observed—including the USA, Sweden, India, China, and Japan—have similar low rates of social mobility when surnames are used to identify elites and underclasses, despite an even wider range of social institutions.

Appendix: Surname Source Materials

Brasenose College (1909) Brasenose College register, 1509–1909. Oxford: Basil Blackwell.

Cambridge University (1954) Annual register of the University of Cambridge, 1954–5. Cambridge: Cambridge University Press.

Cambridge University (1976) The Cambridge University list of members, 1976. Cambridge: Cambridge University Press.

Cambridge University (1998) The Cambridge University list of members, 1998. Cambridge: Cambridge University Press.

Cambridge University (1999–2010) Cambridge University reporter. Cambridge: Cambridge University Press.

Elliott, Ivo (ed.) (1934) Balliol College register, 2nd edition, 1833–1933. Oxford: John Johnson.

Emden, Alfred B. (1957–9) A biographical register of the University of Oxford to AD 1500 (3 vols.). Oxford: Clarendon Press.

Emden, Alfred B. (1963) A biographical register of the University of Cambridge to 1500. Cambridge: Cambridge University Press.

Emden, Alfred B. (1974) A biographical register of the University of Oxford AD 1501 to 1540. Oxford: Clarendon Press.

Foster, Joseph (1887) Alumni Oxonienses: the members of the University of Oxford 1715–1886: their parentage, birthplace and year of birth, with a record of their degrees: being the matriculation register of the university, 4 vols. Oxford: Parker.

Foster, Joseph (1891) Alumni Oxonienses: the members of the University of Oxford 1500–1714: their parentage, birthplace and year of birth, with a record of their degrees: being the matriculation register of the university, 2 vols. Oxford: Parker.

Foster, Joseph (1893) Oxford men and their colleges, 1880–1892, 2 vols. Oxford: Parker.

International Genealogical Index (2013) England marriages, 1538–1973. Index. FamilySearch, accessed 2013.

Office of National Statistics database of surname frequencies in England and Wales, 2002, as listed at

Oxford University (1924, 1972, 1981, 1996, 2000, 2004–2008, 2010) The Oxford university calendar. Oxford: Clarendon Press.

Public Record Office (1904) Calendar of inquisitions post mortem and other analogous documents preserved in the public record office, Vol. 1: Henry III. London: Public Record Office.

Public Record Office (1906) Calendar of inquisitions post mortem and other analogous documents preserved in the public record office, Vol. 2: Edward I. London: Public Record Office.

Public Record Office, Indexes of Wills and Administrative Grants of the Prerogative Court of the Archbishop of Canterbury (series PROB 12).

Venn, John, and Venn, John A. (1922–1927) Alumni Cantabrigienses, a biographical list of all known students, graduates and holders of office at the University of Cambridge, from the earliest times to 1751, 4 vols. Cambridge: Cambridge University Press.

Venn, John, and Venn, John A. (1940–1954) Alumni Cantabrigienses, a biographical list of all known students, graduates and holders of office at the University of Cambridge, 1752–1900, 6 vols. Cambridge: Cambridge University Press.

E-mail Directories

Oxford (2010–2012):


Women at Cambridge, 1860–1900.

Oxford University Gazette (1980–2012),

Cambridge University Reporter, special issue, “Student Numbers”: 1999–2000 to 2009–2010


  1. The intergenerational correlation for height is 0.64 (Silventoinen et al. 2003). Grönqvist et al. (2011), however, estimate that in modern Sweden the intergenerational correlation of cognitive ability is as high as 0.77.

  2. There are exceptions in which children took the surname of the mother, such as illegitimate children, but these can be seen to be a very small share of all births. Until recently more than 99% of children in England had their surname registered as that of the father.

  3. In psychometric terms, underlying status is a latent variable.

  4. We use the English convention of referring to Oxford and Cambridge together as Oxbridge.

  5. Ashton (1977) estimates that students recorded for Oxford in 1170–1500 were only 20–25% of actual numbers.

  6. To eliminate surnames for which most of the holders would be non-English, the rare surname samples excluded as far as possible names whose population concentrations lay outside England. Thus all names beginning with “Mc” or “Mac” or “O’” were removed since they are of Scottish or Irish origin. Also, any surname with more than 40 occurrences in the 1881 census was removed if its frequency in 2002 was more than 2.5 times the earlier frequency (the expected frequency would be 1.85). A check using surnames represented by 500 or fewer holders in 1881 found that even including all surnames in the sample did not change the estimated b by much.

  7. The first period, 1800–1829, in which the elite surnames are identified, cannot be used to estimate b, since in this period we do not have the expectation that \( {\overline{y}}_{kt}={\overline{x}}_{kt} \), unlike in Eq. (7). In this first period the average status of the surnames is overestimated by their relative representation at Oxbridge, since the surnames included will tend in that period to have a positive random component in terms of their educational status.


  1. Ashton, T. S. (1977). Oxford’s Medieval alumni. Past & Present, 74, 3–40.
  2. Atkinson, A., Maynard, A., & Trinder, C. (1983). Parents and children: Incomes in two generations. London: Heinemann.
  3. Clark, G., & Cummins, N. (2014a). Inequality and social mobility in the Industrial Revolution era. In R. Floud, J. Humphries, & P. Johnson (Eds.), The Cambridge economic history of modern Britain. Cambridge: Cambridge University Press.
  4. Clark, G., & Cummins, N. (2014b). What is the true rate of social mobility? Surnames and social mobility in England, 1830–2012. Economic Journal. doi: 10.1111/ecoj.12165.
  5. Clark, G., & Cummins, N. (2014c). Malthus to modernity: England’s first demographic transition, 1760–1800. Journal of Population Economics. doi: 10.1007/s00148-014-0509-9.
  6. Clark, G., Cummins, N., Diaz Vidal, D., Hao, Y., Ishii, T., Landes, Z., Marcin, D., Mo Jung, K., Marek, A., & Williams, K. (2014). The son also rises: 1,000 years of social mobility. Princeton: Princeton University Press.
  7. Clark, G., & Hamilton, G. (2006). Survival of the richest. The Malthusian mechanism in pre-industrial England. Journal of Economic History, 66(3), 707–36.
  8. Corak, M. (2012). Inequality from generation to generation: The United States in comparison. In R. Rycroft (Ed.), The economics of inequality, poverty, and discrimination in the 21st century. Santa Barbara: ABC-CLIO.
  9. Dearden, L., Machin, S., & Reed, H. (1997). Intergenerational mobility in Britain. Economic Journal, 107, 47–66.
  10. Ermisch, J., Francesconi, M., & Siedler, T. (2006). Intergenerational mobility and marital sorting. Economic Journal, 116, 659–679.
  11. Galton, F. (1869). Hereditary genius: An enquiry into its laws and consequences. London: Macmillan.
  12. Galton, F. (1889). Natural inheritance. London: Macmillan.
  13. Greenstein, D. (1994). The junior members, 1900–1990: a profile. In B. Harrison (Ed.), The history of the University of Oxford, volume VIII. Oxford: Clarendon Press.
  14. Grönqvist, E., Öckert, B., & Vlachos, J. (2011). The intergenerational transmission of cognitive and non-cognitive abilities. IFN Working Paper No. 884. Available at doi: 10.2139/ssrn.2050393.
  15. Harbury, C., & Hitchens, D. (1979). Inheritance and wealth inequality in Britain. London: Allen and Unwin.
  16. Hertz, T., Jayasundera, T., Piraino, P., Selcuk, S., Smith, N., & Verashchagina, A. (2007). The inheritance of educational inequality: International comparisons and fifty-year trends. The B.E. Journal of Economic Analysis & Policy, 7, Article 10.
  17. Keats-Rohan, K. S. B. (1999). Domesday people: A prosopography of persons occurring in English documents 1066–1166. Woodbridge: Boydell Press.
  18. Long, J. (2013). The surprising social mobility of Victorian Britain. European Review of Economic History, 17(1), 1–23.
  19. Pearson, K., & Lee, A. (1903). On the laws of inheritance in man, I: inheritance of physical characters. Biometrika, 2, 357–462.
  20. Schurer, K., & Woollard, M. (2000). 1881 Census for England and Wales, the Channel Islands and the Isle of Man (enhanced version) [computer file]. Genealogical Society of Utah, Federation of Family History Societies, [original data producer(s)]. Colchester, Essex: UK Data Archive [distributor]. SN: 4177, doi: 10.5255/UKDA-SN-4177-1.
  21. Silventoinen, K., et al. (2003). Heritability of adult body height: a comparative study of twin cohorts in eight countries. Twin Research, 6, 399–408.
  21. Silventoinen, K., et al. (2003). Heritability of adult body height: a comparative study of twin cohorts in eight countries. Twin Research, 6, 399–408.CrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Economics, University of California, Davis, USA
  2. Department of Economic History, London School of Economics, London, UK
