Heritability of lifetime earnings

Using twenty years of earnings data on Finnish twins, we find that about 40% of the variance of women’s and a little more than half of the variance of men’s lifetime labour earnings is linked to genetic factors. The contribution of the shared environment is negligible. We show that the result is robust to using alternative definitions of earnings, to adjusting for the role of education, and to measurement errors in the measure of genetic relatedness.

environmental factors, such as common family background, neighbourhood and peers, and genetically inherited traits. For example, Björklund and Jäntti (2012) estimated using Swedish data that shared environmental and genetic factors explain 40-60% of inequality in a number of productive traits, including cognitive and non-cognitive skills, schooling and long-run earnings.
Heritability measures the extent to which genetic variation between individuals accounts for differences in a particular outcome, in a particular population, characterized by a particular mix of genetic and environmental influences that prevailed at the time of measurement (Plomin et al. 2014). Earnings can be transferred genetically through several channels. There is a large literature in economics (e.g. Heckman et al. 2006) showing that (heritable) non-cognitive aspects of personality, such as various personality traits or addictions, and cognitive skills can have an influence on, for example, occupational choices, labor supply, work effort, and risk taking, which all influence earnings. Inherited cognitive skills and non-cognitive traits also affect schooling choices and thereby earnings through the returns to education.
Our contribution is fourfold: First, we provide new evidence on the genetic heritability of lifetime (labour) earnings and total lifetime earnings (incl. capital income). In contrast to the existing heritability literature that has mostly relied on relatively short-term proxies for lifetime earnings, our evidence is based on twenty years of data on a large number of monozygotic (MZ) and dizygotic (DZ) twin pairs. Our measure of earnings refers to earnings in the form of wages, salaries and self-employment income, but excludes social transfers, such as unemployment benefits. Total lifetime earnings also include capital income, which consists of taxable dividends, interest income and capital gains. The information on earnings and income comes from tax registers and is therefore not subject to self-reporting error. The twin cohorts that we use are old enough that we can use data on the various types of income in their prime working age to measure lifetime earnings.
Second, we examine heritability of earnings by gender. Analysis separately by gender is important, as it is well known that men's and women's earnings develop differently over their working careers, for example because of women's more frequent career breaks or because of gender differences in risk preferences and in other socio-psychological factors (e.g. Killingsworth and Heckman 1986;Bertrand 2011). Also the influence of shared or nonshared environment vs. heritability may differ by gender e.g. in career choices.
The literature on the heritability of economic outcomes has been criticized in the past (Goldberger 1979) and more recently (Manski 2011) for being not only policy irrelevant but also harmful, as heritability research has been misused for political and other ends. We share the concern of potential misuse, but disagree with the implied suggestion that genetic heritability of economic outcomes ought not to be studied at all. Heritability is a descriptive statistic in genetic research (Plomin et al. 2014). It does not imply immutability. Showing that a policy intervention can reduce economic inequality ("what could be") is not the same thing as learning about the genetic and environmental origins of inequality, as they existed ("what is") in the economy that generated the data researchers use, as Plomin et al. (2014), for example, emphasise. Consistent with this, Björklund and Jäntti (2012) argue that genetic inheritance cannot be neglected if we pursue a deeper understanding of the influences of family background.
We explore the origins of variation in lifetime earnings in Finland, as it existed during the period from 1990 to 2009. This institutional environment is of broader policy interest, because a robust finding in the recent literature is that the relatively equitable Nordic countries have high intergenerational mobility, clearly exceeding that of the UK and US (Black and Devereux 2011). Consistent with this, the correlation of incomes among siblings is much lower in the Nordic countries than in the U.S. (Solon 1999; Jäntti et al. 2002; Black and Devereux 2011). However, the question of the persistence of economic outcomes across generations is far from solved (Lucas and Pekkala Kerr 2013). How much of the variation in lifetime earnings is related to genetic variation is therefore worthwhile to know, not least because it provides a useful benchmark against which other estimates and other (possibly less equitable) countries can be compared. In this spirit, Landersø and Heckman (2017), for example, compare how intergenerational mobility and its determinants differ between Denmark and the US.
Our main findings are as follows: Using accurate administrative data on twins' prime working-age work and capital income and standard behavioural genetics designs, we document that genes explain a reasonably high share of variation in the twins' (age-adjusted) lifetime earnings (54% for men; 39% for women), whereas the shared environment explains very little. Our results thus echo those reported for Sweden by Benjamin et al. (2012), as they also find that the shared environment explains a small fraction of variation in long-term earnings. 2 These findings are in line with the much broader literature on the relative importance of shared and non-shared environment in explaining variation in many kinds of complex traits (phenotypes), suggesting that environmental influence for most traits is typically nonshared (Plomin 2011).
In auxiliary analyses, we also explore how sensitive the estimates of heritability are to the removal of the effect of education on lifetime earnings and total earnings. Education is one channel through which earnings can be transferred genetically. We focus on education, because schooling is known to have high intergenerational persistence, depends on genetic endowments (e.g. Behrman and Taubman 1989; Miller et al. 2001; Branigan et al. 2013), and is a key driver of permanent income. We show that in the relatively equitable economic and institutional environment of Finland, the share of variance of lifetime earnings explained by education is clearly less than a tenth (in our data). This comparison suggests that the variation in lifetime earnings that can be attributed to genetic variation is not negligible and warrants attention. The results of our auxiliary analyses also suggest, but do not conclusively show, that removing the effects of education on the lifetime earnings of the cohorts we study does not change these heritability estimates.
We also provide estimates of group heritability by analysing the importance of heritability of earnings at different points of the earnings distribution. It is possible that e.g. certain personality traits have a particularly strong impact on top (or bottom) earnings, leading to variation in earnings heritability across the earnings distribution. However, if the difference between top (or bottom) earnings and the earnings of the whole sample is heritable, the same genetic factors are related to earnings at all parts of the earnings distribution. Group heritability measures how much of the mean difference in lifetime earnings between those who are at the tails of the earnings distribution and the rest of the population is accounted for by genetics. It hence highlights whether and why individuals with very high or very low earnings differ as a group from the rest of the population (DeFries and Fulker 1985; Plomin et al. 2014). We find that the heritability of mean earnings in the tails of the lifetime earnings distribution broadly follows similar patterns as that of individuals at large. Group heritability therefore suggests that earnings at the extreme parts of and in the rest of the distribution are, at least in part, related to the same genetic factors (Plomin and Kovas 2005; Shakeshaft et al. 2015).
2 Relaxing some of the assumptions of the standard variance decomposition reduces the share of income explained by genetic heritability; see Björklund et al. (2005).
Our findings bear on two ongoing debates. First, they bear on the determinants of intergenerational mobility at the top of the earnings and income distributions in equitable Nordic countries. For example,  show using Swedish data that intergenerational transmission of men's long-term income is quite low in general and similarly modest in the top 10% of the distribution, except in the very top 1 and 0.1%.  argue that the strong intergenerational transmission at the very top percentiles is related to inherited wealth. Second, our analyses also bear on the related debates on the heritability of inequality (e.g., Bowles and Gintis 2002) and on the determinants of changes at the tails of the income distribution (e.g., Piketty and Saez 2003; Chetty et al. 2014). Our findings suggest that the importance of genetic variation in explaining variation in long-term earnings should not be overlooked.
The remainder of this paper is organized as follows. In the next section, we discuss the existing evidence. We then present in section three the Finnish twin and register data and how we measure lifetime labour earnings and total lifetime (market) earnings. The fourth section describes our main results. The final section concludes.

Prior evidence on heritability of earnings and income
The economic literature that uses twin data to analyse the determinants of the variance of earnings and income began with Taubman (1976). A great advantage of twin data is that they allow measuring how genetic, shared environmental and individual-specific (non-shared environmental) factors contribute to the variance of earnings. The relative contributions of these factors can under certain assumptions be identified, because MZ and DZ twins have a shared (family) environment, but unlike MZ twins, DZ twins share, like non-twin siblings, only one-half of their genes, on average. Greater similarity in outcomes between the MZ twins is therefore indicative of the importance of genes.
According to the standard behavioral genetics decomposition (Posthuma et al. 2003), the genetic heritability of an outcome, such as lifetime earnings, is twice the difference of the correlations of the lifetime earnings between MZ and DZ twins, i.e., $h^2 = 2(r_{MZ} - r_{DZ})$, and the fraction of variance explained by the shared environment is $c^2 = r_{MZ} - h^2 = 2r_{DZ} - r_{MZ}$. The fraction explained by non-shared environment is $1 - h^2 - c^2 = 1 - r_{MZ}$. This simple decomposition assumes i) that genes and environment have additive effects, ii) that MZ twins experience environments that are similar to those of DZ twins, iii) that there is no correlation between genetic factors and the shared environment (i.e., within-pair genetic differences are not correlated with the within-pair environmental differences; see e.g. Stenberg (2013), who stresses the importance of this assumption for the interpretation of heritability estimates), and iv) that there is no assortative mating. The last assumption would not hold if the genotypes of the parents were correlated (Posthuma et al. 2003).
Table 1 reports from a number of prior studies the sibling correlations of earnings (or income) for MZ and DZ twins as well as the (implied) heritability estimates that can be obtained from the standard additive variance decomposition. This decomposition requires that the correlation of lifetime earnings within the MZ twin pairs, $r_{MZ}$, should be bigger than that of the DZ twin pairs, $r_{DZ}$, and that $2r_{DZ}$ should be at least as big as $r_{MZ}$. The standard decomposition results in the so-called ACE model, where A, C, and E stand for additive genetic, shared (common) environment, and nonshared environment components, respectively. When $2r_{DZ} < r_{MZ}$, we have in Table 1 set for simplicity the estimate of the variance share of the shared environment ($c^2$) to zero, and subtracted the negative estimate from the heritability estimate.
This effectively gives the so-called ADE model, where D stands for non-additive genetic (dominance) effects (see also section 4.1 below).
Notes to Table 1: $h^2 = 2(r_{MZ} - r_{DZ})$, $c^2 = r_{MZ} - h^2$, and $e^2 = 1 - h^2 - c^2$ refer to the standard additive behavioral genetics variance decomposition. In the cases where this decomposition gives a negative value for $c^2$, it has been set to zero, and the corresponding value has been deducted from $h^2$. Earnings (income) data refer to a cross-section in the US and Australian studies. Ashenfelter and Rouse (1998) average the income over time for those twins (25% of the sample) who were interviewed more than once. They do not show the correlations, but those are reported in Harding et al. (2005, fn. 4). In Miller et al. (1995, 1997) the earnings measure is the average full time income from the occupation of employment, measured at the level of 2-digit, gender-specific occupational groups (i.e., it is not measured at the level of individuals). Johnson and Krueger (2005) use household rather than individual income. Isacsson (1999) and Björklund et al. (2005) use incomes from 3 years over a 7-year period and Cesarini (2010) from 3 years over a 5-year period. Benjamin et al. (2012) use data from consecutive years. They also show the correlations for 10-year and 3-year average log incomes, which are not reported here. Most of the multi-year studies adjust the incomes for age. M = men, W = women.
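The decomposition and the Table 1 convention for negative shared-environment estimates can be illustrated with a short sketch. The function and class names are ours, and the two pairs of correlations are hypothetical values chosen only for illustration; they are not taken from Table 1.

```python
from typing import NamedTuple


class Decomposition(NamedTuple):
    h2: float  # share attributed to genetic factors
    c2: float  # share attributed to the shared environment
    e2: float  # share attributed to the non-shared environment


def standard_decomposition(r_mz: float, r_dz: float) -> Decomposition:
    """Standard additive decomposition from MZ and DZ twin correlations."""
    h2 = 2.0 * (r_mz - r_dz)
    c2 = r_mz - h2  # algebraically equal to 2 * r_dz - r_mz
    if c2 < 0.0:
        # Convention described in the text: set c2 to zero and fold the
        # negative value into h2, which leaves h2 equal to r_mz.
        h2 += c2
        c2 = 0.0
    e2 = 1.0 - h2 - c2  # equal to 1 - r_mz in both branches
    return Decomposition(h2, c2, e2)


# Hypothetical correlations (not from any study cited here).
print(standard_decomposition(0.50, 0.30))  # 2 * r_dz >= r_mz: c2 is positive
print(standard_decomposition(0.50, 0.20))  # 2 * r_dz <  r_mz: c2 is set to zero
```

In the second call the adjusted heritability equals the MZ correlation, which is what the convention implies whenever $2r_{DZ} < r_{MZ}$.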
The following preliminary observations can be made: First, the US estimates for the importance of the genetic component, h 2 , are close to those reported for Sweden. Second, the genetic component accounts for as much as 40% of earnings (or income) variation. Third, consistent with prior behavior-genetic findings (Plomin 2011), the shared environment (c 2 ) accounts for a relatively small fraction, say at most 10% or so, of the variance of the earnings (or income). Fourth, the individual-specific non-shared environmental factors (e 2 ) account for roughly half of the variation in earnings (or income).
A particular challenge in prior work has been that the object of primary interest, lifetime earnings or income, has often been measured using poor proxies. This issue has been widely discussed in the literature on intergenerational mobility. The use of short-run income may lead, for example, to a gross underestimation of the strength of the intergenerational links (e.g., Mazumder 2005; Haider and Solon 2006; Nilsen et al. 2012). As Table 1 shows, most of the prior work uses a single cross-section and short-term income measures, such as annual earnings or hourly salary. Notable exceptions are the studies by Isacsson (1999) and Björklund et al. (2005), which both use three years of earnings data on Swedish twins over a spell of seven years, and Benjamin et al. (2012), who use up to 20 years of Swedish earnings data.
Besides the studies that focus on the heritability of income, there are a number of papers that apply variance decompositions to twin data in order to determine the importance of genetic and environmental factors for the variation of economic outcomes (see also Sacerdote 2011 for a review). This branch of the literature includes Behrman and Taubman (1989) and Miller et al. (2001), who investigate the genetic heritability of education, Miller et al. (1996) and Schnittker (2008), who focus on occupational status and socioeconomic position, and Nicolaou et al. (2008), who examine the effect of genetic heritability on the likelihood of becoming an entrepreneur. More recent work has extended the literature by studying the genetic heritability of the formation of preferences (Cesarini et al. 2009;Simonson and Sela 2011), financial decision-making (Barnea et al. 2010;Cesarini et al. 2010), and savings (Cronqvist and Siegel 2015).

Data sources
Our twin data are based on the Older Finnish Twin Cohort Study (of the Department of Public Health at the University of Helsinki), which was matched to the Finnish Longitudinal Employer-Employee Data (FLEED) of Statistics Finland using personal identification numbers (see also Hyytinen et al. 2013 and the online appendix).
The Finnish Cohort Study was established in 1974 and was initially compiled from the Central Population Registry of Finland. Initial twin candidates were persons born before 1958 with the same birth date, municipality of birth, sex, and surname at birth (Kaprio et al. 1979; Kaprio and Koskenvuo 2002; Kaprio 2013). A questionnaire was mailed to these candidates in 1975 to determine zygosity and to collect baseline data (see also the online appendix). The response rate was 89%. Two follow-up surveys were subsequently conducted in 1981 and 1990.
We linked the twin data to FLEED using personal identifiers (see also Hyytinen et al. 2013). FLEED is constructed from a number of different administrative registers on individuals, firms and establishments that are collected or maintained by Statistics Finland. Importantly for this study, FLEED includes information on salaries and other income, taken directly from tax registers. This means that our earnings data are not biased by underreporting or recall error, nor do the data suffer from top-coding. The earnings data used in this study cover the 20-year period from 1990 to 2009.

Sample
We focus on the youngest cohort of our data, born in 1950-1957. This cohort obtained their primary and secondary schooling in the old, more selective, Finnish school system (for a nice description, see Pekkarinen et al. 2009). In the old system, there was a tracking of students to vocational and academic tracks after the fourth grade at the age of 11. In 1972-1977, the system was reformed so that a comprehensive school was established where all students obtain nine years of common education. The youngest twins in our data were 15 years old when the reform started, so they were not affected by it. Our sample contains nearly all same-sex DZ and MZ twins of this cohort of the Finnish population. Most of the attrition is due to death (e.g., of fatal diseases or accidents) and migration.
We examine men and women separately. There are many reasons to expect that the development of lifetime earnings is different between the genders. For example, women have more career breaks than men due to family reasons, which create bigger variation in earnings across individuals among women than among men. There are gender differences also in many choices that affect earnings, like risk taking or educational and occupational choices. To the extent that these choices are affected by inherited personality traits, also heritability of earnings should differ by gender.
In our estimation sample, male MZ twins lived together, on average, 20.7 years before they moved apart. For male DZ twins, the corresponding average is close, 0.7 years less. The difference is a bit larger for female twins: Female MZ twins lived together, on average, 20.3 years before they moved apart. Female DZ twins moved apart on average 1.8 years earlier.

Variable definitions and descriptive statistics
Our measure for the lifetime earnings (i.e., work income) of an individual is the average of (the logarithm of) the individual's wage and salary earnings and self-employment income, converted to euros, deflated to year 2000 euros using the consumer price index, and calculated over the sample period; see also Böckerman et al. (2017), who use a similar measure of long-run income, calculated from the FLEED data. The findings of Haider and Solon (2006) for the U.S. and those of Böhlmark and Lindquist (2006) for Sweden suggest that this long-term sample average ought to be a reliable measure for the lifetime earnings. In particular, because we use a sample of individuals born between 1950 and 1957, we observe the earnings of individuals who are at their prime working age: The individuals are from 33 to 40 years old at the beginning of our sample period in 1990 and from 52 to 59 years old at the end of the sample period in 2009. This window matches quite well the periods when annual earnings are a good proxy for long-term earnings, especially for men. Total lifetime earnings are calculated in a similar way, except that they also include capital income (i.e., taxable dividends, interest income and capital gains). We use total lifetime earnings in the robustness analysis.
Table 2 reports the means and standard deviations of (unadjusted) earnings and total lifetime earnings, age, and education years based on standard degree times, separately for MZ and DZ twins by gender. As the table shows, the average age (in 1990) is 36 years and the average amount of schooling is twelve years. Average lifetime earnings of men are around 23,000 euros per year, whereas for women, they are 17,000 euros per year.
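As a minimal sketch of how such a measure can be computed, the function below deflates nominal annual earnings to base-year euros with a price index and averages the logarithms over the available years. The function name, the dictionary-based inputs and the handling of years with zero earnings (they are skipped, since the logarithm is undefined) are our assumptions for illustration, not a description of the procedure applied to the register data.

```python
import math


def lifetime_log_earnings(nominal_by_year, cpi_by_year, base_year=2000):
    """Average of log real annual earnings, expressed in base-year euros.

    nominal_by_year: {year: nominal earnings in euros}
    cpi_by_year:     {year: consumer price index level}
    """
    base = cpi_by_year[base_year]
    logs = []
    for year, nominal in nominal_by_year.items():
        real = nominal * base / cpi_by_year[year]  # deflate to base-year euros
        if real > 0:  # assumption: years with zero earnings are skipped
            logs.append(math.log(real))
    if not logs:
        raise ValueError("no years with positive earnings")
    return sum(logs) / len(logs)


# Hypothetical index values and earnings, for illustration only.
cpi = {1999: 90.0, 2000: 100.0, 2001: 110.0}
earnings = {1999: 9000.0, 2001: 11000.0}
print(lifetime_log_earnings(earnings, cpi))
```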
Since we observe the individuals at different stages of their life cycles, we adjust the earnings for age and year for our empirical analysis. We obtain the adjusted income from a regression of the log of annual real earnings on a constant, calendar year dummies and a third order polynomial of age, separately for men and women. The age-adjusted lifetime earnings are then computed as the within-individual average of the residuals from these regressions. Table 2 also reports within-twin pair correlations, using the adjusted lifetime earnings measures. With these correlations, the standard additive variance decomposition can be applied to lifetime earnings, using the formulas presented in section 2 for the shares of genetic heritability, shared environment and non-shared environment. The estimate of shared environment, $c^2$, would be negative for both genders, as $2r_{DZ} < r_{MZ}$. One potential reason for this is the presence of dominant (non-additive) genetic factors, which tend to make outcomes more similar for MZ twins relative to DZ twins. The data are suggestive of dominance effects if $2r_{MZ} - 4r_{DZ} > 0$, which is the case in our data for the lifetime earnings of both genders. A negative estimate of $c^2$ may be added to the estimate of genetic heritability ($h^2$), giving a baseline heritability estimate of 54% for men and 41% for women. These estimates are a bit higher than what we reported in Table 1 for other countries. This observation is consistent with the view that the shorter-term earnings measures lead to lower heritability estimates: A low within-pair correlation suggests that the unshared environmental effects are important, but it may also mirror measurement error at the level of individuals.
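The age and year adjustment described above can be sketched as follows. This is an illustrative implementation under our own assumptions: the function name and array-based inputs are hypothetical, the regression is fitted by ordinary least squares with NumPy, and the function is meant to be called separately for men and for women.

```python
import numpy as np


def age_adjusted_lifetime_earnings(person_id, year, age, log_real_earnings):
    """Within-person mean residual from a regression of log real earnings
    on a constant, calendar-year dummies and a cubic polynomial in age."""
    person_id = np.asarray(person_id)
    year = np.asarray(year)
    age = np.asarray(age, dtype=float)
    y = np.asarray(log_real_earnings, dtype=float)

    years = np.unique(year)
    # One dummy per calendar year except the first, which the constant absorbs.
    year_dummies = (year[:, None] == years[None, 1:]).astype(float)
    a = age - age.mean()  # centring helps conditioning; fitted values are unchanged
    X = np.column_stack([np.ones_like(y), year_dummies, a, a**2, a**3])

    # Least squares; lstsq also copes with a rank-deficient design matrix.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Average the residuals within each individual.
    ids, inverse = np.unique(person_id, return_inverse=True)
    sums = np.bincount(inverse, weights=resid)
    counts = np.bincount(inverse)
    return dict(zip(ids.tolist(), (sums / counts).tolist()))
```

The returned dictionary maps each person identifier to the age-adjusted lifetime earnings measure, which can then be paired within twin pairs to compute the MZ and DZ correlations.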

Method: DeFries-Fulker variance decomposition
We measure the importance of genetic factors and shared environment for lifetime earnings separately for both genders using the regression model proposed by DeFries and Fulker (1985), and further developed by Waller (1994), Kohler and Rodgers (2001) and Rodgers and Kohler (2005), among others. This model and its closely related variants have also been used in earlier economics research (see, e.g., Miller et al. 1996, 2001). The most basic version of the DeFries and Fulker (DF) model is a regression model that relies on the (abovementioned) assumptions of the standard behavioral genetics decomposition, i.e., the ACE model (Posthuma et al. 2003). The ACE model can be written as
$$\mathit{INC}_{ji} = \beta_0 + \beta_1 \mathit{INC}_{ji'} + \beta_2 R_j + \beta_3 (R_j \times \mathit{INC}_{ji'}) + \varepsilon_{ji} \tag{1}$$
where $\mathit{INC}_{ji}$ is a measure of the lifetime earnings of twin $i$ in twin pair $j$, $\mathit{INC}_{ji'}$ is the corresponding measure for co-twin $i'$ from the same twin pair $j$, $R_j$ is the coefficient of genetic relatedness ($R = 1.0$ for MZ twins and $R = 0.5$ for DZ twins), and $\varepsilon_{ji}$ is an error term. If the assumptions of the additive genetic model are satisfied, $\beta_1$ and $\beta_3$ are unbiased estimates of $c^2$ and $h^2$, respectively (DeFries and Fulker 1985; Rodgers and McGue 1994). An alternative way to think about the DF model is that it is a regression-based method to match moments, i.e., to fit the parameters of the decomposition model to the observed MZ and DZ correlations. Alternative versions of the DF regression model can also be considered. One possibility is to drop the shared environmental term from Eq. (1) by imposing $\beta_1 = 0$. The term is often dropped also when the estimate for $\beta_1$ is not statistically significant in the ACE model. The resulting model is called the AE model.
If the estimate for the variance share of the shared environmental factors turns out to be negative, the ACE model is not consistent with the decomposition. One potential reason for the negative estimate for the variance share of the shared environmental factors is that genetic effects are not additive, but of a dominant form. To be more specific, genetic effects on a trait are the sum of all effects of single genes and their interaction. Genes can have different effects due to genetic variation at a single base pair in the genome or to larger genetic structural variation. The variants at a locus in a gene are known as alleles. If the effect of carrying no, one or two alleles (as humans have two DNA strands) is additive on the trait, these are summed as additive genetic effects. Non-linear effects at a single locus are termed dominance, while interactions between loci result in effects that are termed epistasis. Additive effects are transmitted from parents to children, while effects due to dominance are not correlated between generations. Broad sense heritability refers to all kinds of genetic contributions, including additive, dominant, and epistatic. Narrow sense heritability refers solely to the additive genetic factors (see Posthuma et al. 2003). The data are suggestive of dominance effects if $2r_{MZ} - 4r_{DZ} > 0$. Such effects can be accommodated in the DF model by reformulating it as
$$\mathit{INC}_{ji} = \beta_0 + \beta_2 R_j + \beta_3 (R_j \times \mathit{INC}_{ji'}) + \beta_4 (D_j \times \mathit{INC}_{ji'}) + \varepsilon_{ji} \tag{2}$$
where $D_j$ is the coefficient of dominant genetic relatedness (with $D = 1$ for MZ twins and $D = 0.25$ for DZ twins; Waller 1994). This model is the ADE model. In (2), $\beta_3$ estimates narrow-sense heritability, $\beta_4$ the dominance effect, and $\beta_3 + \beta_4$ estimates broad-sense heritability (Waller 1994). As noted in Section 3 and as can be seen from Table 2, our data for lifetime earnings are suggestive of dominance effects, as $2r_{MZ}$ is higher than $4r_{DZ}$ for both genders.
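The reason why $2r_{MZ} - 4r_{DZ} > 0$ points to dominance can be made explicit with a short derivation. The symbols $a^2$ and $d^2$ below are our notation for the additive and dominance variance shares (the quantities that $\beta_3$ and $\beta_4$ estimate in (2)); the derivation uses the relatedness coefficients stated above and assumes no shared-environment component.

```latex
\begin{align}
r_{MZ} &= a^2 + d^2, \\
r_{DZ} &= \tfrac{1}{2}a^2 + \tfrac{1}{4}d^2, \\
r_{MZ} - 2r_{DZ} &= \tfrac{1}{2}d^2
  \quad\Longrightarrow\quad d^2 = 2r_{MZ} - 4r_{DZ}, \\
a^2 &= r_{MZ} - d^2 = 4r_{DZ} - r_{MZ}.
\end{align}
```

Under these assumptions a positive value of $2r_{MZ} - 4r_{DZ}$ is the implied dominance share, and broad-sense heritability $a^2 + d^2$ equals $r_{MZ}$, which matches the convention of adding a negative $c^2$ estimate to $h^2$ in the simple decomposition.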
In (1) and (2), the value for twin $i'$ of a pair of twins is an explanatory variable for twin $i$'s outcome. However, it is not possible a priori to decide which of the twins is twin $i$ and which is twin $i'$. The DF regression analysis is therefore performed in the double entry form, i.e. each twin pair is entered into the data twice: The first observation uses the outcome of the first twin as the dependent variable and that of the co-twin as the explanatory variable. The second observation reverses the roles. This procedure means that standard errors should be clustered at twin pair level for correct inference (see Kohler and Rodgers 2001), which we do.
Table 3 presents the results of DF regressions for the ACE, AE, and ADE models for lifetime earnings for women and men. As can be seen from the table, the estimate for the variance component of the shared environment ($c^2 = \beta_1$) is negative in the ACE model for both genders. This finding and the fact that $2r_{MZ}$ is higher than $4r_{DZ}$ for both genders suggest that alternative models ought to be considered and that dominance effects may be present (Waller 1994; Rodgers et al. 2001). The AE and ADE models suggest a similar degree of genetic heritability. In the ADE model, broad heritability refers to the sum $\beta_3 + \beta_4$ and is 41% for women. The AE model is consistent with this, suggesting that the estimate of $h^2$ ($= \beta_3$) is 39% for women. Based on the Akaike information criterion (AIC), AE is marginally preferred to ADE for women. The 95% confidence interval for the heritability estimate $h^2$ from the AE model is (32%, 47%). For men, the AE and ADE models suggest that the estimate of $h^2$ is about one half: the former suggests that the estimate of $h^2$ is 49% and the latter that the broad heritability is 54%. Based on the AIC, the ADE model is preferred. The 95% confidence interval for $h^2$ from this model is (45%, 64%). These findings are in line with what we found from the simple decompositions based on Table 2.
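The double-entry procedure and pair-level clustering can be sketched on simulated data. Everything in the block is illustrative: the function names, the simulated variance shares (0.5 genetic, 0.1 shared environment) and the sample sizes are our assumptions, the regressors are a constant, the co-twin's outcome, $R$ and their interaction as in the ACE form of the DF model, and the cluster-robust variance is the plain sandwich formula without small-sample corrections.

```python
import numpy as np


def df_regression(y1, y2, r):
    """Double-entry DF regression (ACE form) with pair-clustered SEs.

    y1, y2: outcomes of the two twins in each pair; r: 1.0 (MZ) or 0.5 (DZ).
    Returns (beta, se) for [constant, co-twin, R, R x co-twin].
    """
    y1, y2, r = (np.asarray(v, dtype=float) for v in (y1, y2, r))
    n = len(y1)
    # Double entry: each pair appears twice, with the twins' roles reversed.
    dep = np.concatenate([y1, y2])
    cot = np.concatenate([y2, y1])
    rel = np.concatenate([r, r])
    pair = np.concatenate([np.arange(n), np.arange(n)])

    X = np.column_stack([np.ones(2 * n), cot, rel, rel * cot])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ dep
    resid = dep - X @ beta

    # Sandwich variance with scores summed within twin pairs (clusters).
    scores = X * resid[:, None]
    cluster_scores = np.zeros((n, X.shape[1]))
    np.add.at(cluster_scores, pair, scores)
    cov = xtx_inv @ (cluster_scores.T @ cluster_scores) @ xtx_inv
    return beta, np.sqrt(np.diag(cov))


def simulate_pairs(n, r, h2, c2, rng):
    """Simulate n twin pairs with genetic relatedness r (illustration only)."""
    e2 = 1.0 - h2 - c2
    g = rng.normal(size=n)  # genetic component common to both twins
    a1 = np.sqrt(r) * g + np.sqrt(1 - r) * rng.normal(size=n)
    a2 = np.sqrt(r) * g + np.sqrt(1 - r) * rng.normal(size=n)
    c = rng.normal(size=n)  # shared environment
    y1 = np.sqrt(h2) * a1 + np.sqrt(c2) * c + np.sqrt(e2) * rng.normal(size=n)
    y2 = np.sqrt(h2) * a2 + np.sqrt(c2) * c + np.sqrt(e2) * rng.normal(size=n)
    return y1, y2


rng = np.random.default_rng(0)
mz = simulate_pairs(20000, 1.0, h2=0.5, c2=0.1, rng=rng)
dz = simulate_pairs(20000, 0.5, h2=0.5, c2=0.1, rng=rng)
y1 = np.concatenate([mz[0], dz[0]])
y2 = np.concatenate([mz[1], dz[1]])
r = np.concatenate([np.full(20000, 1.0), np.full(20000, 0.5)])
beta, se = df_regression(y1, y2, r)
print("c2 estimate:", beta[1], "h2 estimate:", beta[3])
```

Because the model has a separate intercept and slope for each zygosity group, the co-twin slope is the within-group correlation, so the coefficient on the co-twin's outcome recovers the shared-environment share and the interaction coefficient recovers the genetic share in large samples.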
In sum, we find that genes explain a reasonably high share of variation in the twins' lifetime earnings (54% for men; 39% for women), whereas the shared environment explains very little. Even if no preferred model were used, all models suggest qualitatively similar heritability: heritability estimates for women are in the range 39%-50% and for men in the range 50%-70%.
When interpreting the estimates in Table 3, it is useful to recall that heritability measures the extent to which genetic variation between individuals accounts for differences in a particular outcome, in a particular population, characterized by a particular mix of genetic and environmental influences that prevailed at the time of measurement (Plomin et al. 2014). Thus, to put the heritability numbers into perspective, we can consider a simple univariate regression in which lifetime earnings are regressed on schooling years. In these simple models (not reported), the coefficient of determination ($R^2$) is 0.05 for women and 0.06 for men in the pooled sample of DZ and MZ twins. These numbers are much smaller than the share of lifetime earnings explained by genes, suggesting that genes explain a rather significant part of population variation. The reason is that earnings can be transferred genetically through several channels, such as aspects of personality and cognitive skills. These traits and skills are partly genetically inherited and influence earnings because of their association with work effort, risk taking, schooling choices, occupational choices, and labor supply.

Robustness
We have checked the robustness of the results displayed in Table 3 in seven ways. We consider each of them in turn without reporting the results in tables. First, we ran the DF regressions using a larger sample that included both twin pairs born between 1945 and 1949 and those born between 1950 and 1957. The results for earnings were very similar to those obtained with our baseline sample based on the younger cohort.
Second, the main results reported in Table 3 are robust to not doing the age adjustment, i.e. using mean of log real earnings as the dependent variable.
Third, we used the (logarithm of) total lifetime earnings as the income measure. This includes, in addition to earnings, also capital income, which consists of taxable dividends, interest income and capital gains. Information on capital income, and hence on total lifetime earnings, is available from 1993 to 2009. This income measure gave almost the same results as earnings. The preferred model for women was again AE, indicating heritability estimate 40% for total lifetime earnings. For men, the ADE model produced heritability estimate 53%. Fourth, we considered how measurement error in additive genetic relatedness, R, affects our results. This variable includes some measurement error, as it is equal to 0.5 only in expectation for the DZ twins. Visscher et al. (2006) report that the standard deviation of genetic relatedness of (non-MZ) siblings is 0.036. Using 0.0013 as the variance of the noise in R for the DZ twins (and zero for MZ twins), the reliability of the R variable is 1 − s DZ 0.013/Var(R), where s DZ is the share of DZ twins and the variance of R is calculated over both MZ and DZ twins. It turned out that this reliability measure (and a corresponding measure calculated for the interaction of R and earnings, assuming no measurement error in earnings) is very close to one. What this high reliability means is that the standard OLS estimation gave the same results as a method that accounts for errors-in-variables. 6 Fifth, we used alternative definitions of the outcome variable, using earnings for years when the individuals were close to 40. It has been argued that for men (but not necessarily for women) annual earnings in the age interval from early 30s to early 50s are a good proxy for lifetime earnings (Haider and Solon 2006;Böhlmark and Lindquist 2006). In our data, the twins are mostly in this interval, as they are 33-40 years old in 1990 and 52-59 years old in 2009. 
To investigate the issue further, we estimated the models successively for those at age 40, those at ages 39-41, those at ages 38-42, and those at ages 35-45. In each case, the earnings variable was the average of the logarithm of real earnings over the corresponding age interval. The results showed that heritability h² was lower for the narrower age intervals and increased as the interval widened. For men the h² estimates for the four intervals were 43%, 45%, 47%, and 53%, respectively, for the preferred ADE model, and for women 27%, 34%, 34%, and 37%, respectively, for the preferred AE model.
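Returning to the fourth check, the reliability formula for R can be illustrated with a short calculation. This is only a sketch: it assumes that R is coded as 1 for MZ pairs and 0.5 for DZ pairs, so that Var(R) follows from the DZ share alone, and the DZ share of 0.6 is a hypothetical value chosen for illustration, not the share in our data.

```python
def reliability_of_r(share_dz, noise_sd_dz=0.036):
    """Reliability of R: 1 - s_DZ * Var(noise) / Var(R).

    Assumes R is coded 1 for MZ and 0.5 for DZ pairs (an assumption of
    this sketch), so Var(R) = s_DZ * (1 - s_DZ) * (1 - 0.5)**2.
    noise_sd_dz is the standard deviation of realized relatedness among
    non-MZ siblings reported by Visscher et al. (2006).
    """
    noise_var = noise_sd_dz ** 2          # about 0.0013
    var_r = share_dz * (1.0 - share_dz) * (1.0 - 0.5) ** 2
    return 1.0 - share_dz * noise_var / var_r


# Hypothetical DZ share of 0.6, for illustration only.
rel = reliability_of_r(0.6)
print(f"reliability of R: {rel:.3f}")
```

Under these assumptions the reliability is roughly 0.987, which illustrates why an errors-in-variables correction changes little.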
Sixth, as an additional check we estimated the DF-regressions separately for each year in our data. The results showed that, as in Benjamin et al. (2012, see Table 1 above), the heritability estimate is on average lower if annual data are used. The average heritability estimate for men was 41.9% when calculated from the annual data, using the ADE model. There was considerable variation over the years: the standard deviation of the annual estimates was 4.3%. For women, the average of the annual estimates (from the AE model) was 26.8%, which is also below the corresponding long-term estimate. The standard deviation of the women's annual estimates was 3.2%.
Finally, we tested formally whether the difference in heritability between women and men is statistically significant. We re-estimated the earnings models on the pooled sample of women and men, adding a dummy for females and the interactions of all other variables with the female dummy. The difference between men and women was statistically significant at the 10% level in the (preferred) AE and ADE models.
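The pooled specification can be sketched on simulated data. The code below is a minimal illustration and not our estimation code: it uses the basic DF regression (own outcome on co-twin outcome, R, and their interaction) with double-entered pairs instead of the preferred AE and ADE variants, it simulates standardized outcomes under a purely additive model, and the sample sizes and the true heritabilities (0.5 for men, 0.4 for women) are hypothetical. The coefficient on the interaction of co-twin outcome and R estimates h² for men, and its interaction with the female dummy estimates the gender gap. Standard errors, which would need to account for double entry, are omitted.

```python
import numpy as np


def simulate_pairs(n_pairs, h2, rng):
    """Simulate standardized twin outcomes; half the pairs MZ, half DZ."""
    half = n_pairs // 2
    r = np.concatenate([np.ones(half), np.full(n_pairs - half, 0.5)])
    shared = rng.standard_normal(n_pairs)
    # Additive genetic values with unit variance and correlation r within pair.
    a1 = np.sqrt(r) * shared + np.sqrt(1 - r) * rng.standard_normal(n_pairs)
    a2 = np.sqrt(r) * shared + np.sqrt(1 - r) * rng.standard_normal(n_pairs)
    y1 = np.sqrt(h2) * a1 + np.sqrt(1 - h2) * rng.standard_normal(n_pairs)
    y2 = np.sqrt(h2) * a2 + np.sqrt(1 - h2) * rng.standard_normal(n_pairs)
    return y1, y2, r


def design(y_co, r, female):
    """Columns: 1, co-twin y, R, y*R, and each of these times the female dummy."""
    base = np.column_stack([np.ones_like(y_co), y_co, r, y_co * r])
    return np.hstack([base, female[:, None] * base])


rng = np.random.default_rng(0)
blocks = []
for female, h2 in [(0.0, 0.5), (1.0, 0.4)]:      # hypothetical true values
    y1, y2, r = simulate_pairs(40_000, h2, rng)
    own = np.concatenate([y1, y2])               # double entry: each twin
    co = np.concatenate([y2, y1])                # appears once as "own"
    rr = np.concatenate([r, r])
    blocks.append((own, co, rr, np.full(own.shape, female)))

own, co, rr, fem = (np.concatenate(parts) for parts in zip(*blocks))
X = design(co, rr, fem)
beta, *_ = np.linalg.lstsq(X, own, rcond=None)

h2_men = beta[3]   # coefficient on (co-twin outcome x R)
h2_gap = beta[7]   # female interaction: h2 of women minus h2 of men
print(f"h2 men: {h2_men:.2f}, women minus men: {h2_gap:.2f}")
```

Because the specification is fully interacted, the point estimates coincide with those from separate regressions by gender; pooling only serves to provide a direct test of the gap.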

Auxiliary analyses
In this subsection, we summarize two auxiliary analyses that we have conducted (see the online appendix for details). In the first, we analyze how sensitive the estimates of heritability are to the removal of the effect of education on lifetime earnings. In the second, we explore group heritability, i.e., how much of the mean difference in lifetime earnings between those who are in the upper or lower tail of the earnings distribution ('probands') and the whole population can be attributed to genetics.
Education and heritability of lifetime earnings
To "net out" the effect of education on earnings, we deduct the estimated effect of education from the age-adjusted earnings of each individual before performing the DF estimation. We obtain the estimated effect using a standard approach to estimating returns to education with data on twins (e.g. Ashenfelter and Krueger 1994). The results do not differ much from those estimated without adjusting lifetime earnings for education. We again find that the heritability of lifetime earnings is about 40% for women and 50% for men.
Group heritability
Group heritability measures the genetic influence on the difference between proband and population means, whereas the usual heritability estimate refers to genetic influence on individual differences in a sample. If strong group heritability is found, it implies that both the extreme earnings and the earnings of the rest of the distribution are heritable and, specifically, that the genetic contributions at the extremes and in the intermediate (normal) range are not independent of one another (Plomin and Kovas 2005). The method that we use to study group heritability is the DeFries-Fulker extremes analysis (DeFries and Fulker 1985; LaBuda et al. 1986; Bishop 2005; Plomin et al. 2014). We find that while there are some gender differences, the group heritability of lifetime earnings is overall fairly high for both genders and that the group heritability estimates are in line with the usual heritability estimates. By the argument above, these findings indicate that the genetic contributions at the extremes and in the normal range are linked. In sum, while earnings may be transferred genetically through several channels and while the specific channels may differ at different points of the earnings distribution, our findings suggest that the degree of heritability is by and large similar in the tails and in the population at large, and possibly linked to related genetic factors.
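The logic of the extremes analysis can be illustrated with its textbook form, in which group heritability is twice the difference between the MZ and DZ co-twin means after scores are expressed as deviations from the population mean and scaled by the proband deviation. The sketch below is not the specification of the online appendix, and all input numbers are hypothetical.

```python
def group_heritability(pop_mean, proband_mean_mz, cotwin_mean_mz,
                       proband_mean_dz, cotwin_mean_dz):
    """Textbook DeFries-Fulker group heritability from group means.

    Each co-twin mean is rescaled so that the population mean maps to 0
    and the proband mean of the same zygosity group maps to 1. Group
    heritability is then twice the MZ-DZ difference in rescaled means.
    """
    t_mz = (cotwin_mean_mz - pop_mean) / (proband_mean_mz - pop_mean)
    t_dz = (cotwin_mean_dz - pop_mean) / (proband_mean_dz - pop_mean)
    return 2.0 * (t_mz - t_dz)


# Hypothetical standardized log earnings: probands in the lower tail
# average -2.0; MZ co-twins regress to -1.2 and DZ co-twins to -0.7.
h2_group = group_heritability(0.0, -2.0, -1.2, -2.0, -0.7)
print(f"group heritability: {h2_group:.2f}")
```

In this example MZ co-twins regress less far toward the population mean than DZ co-twins do, and that gap is what the estimate attributes to genetic factors.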

Conclusions
We have documented that about 40% of the variance of women's lifetime earnings, as measured over a 20-year period in their prime working age, is due to genetic factors. For men the corresponding share is a bit more than half. Consistent with the prior epidemiological and behavioral genetics literature on the heritability of complex traits (Plomin 2011), the contribution of the shared environment is negligible. Controlling for the effect of education on lifetime earnings does not change these findings. The heritability of earnings at the upper and lower tails of the lifetime earnings distribution follows broadly similar patterns. The relatively high estimates of group heritability indicate that earnings at the extremes and in the rest of the distribution are related to the same genetic factors.
Our findings suggest two lessons for contemporary debates. First, we provided evidence on the genetic and environmental origins of lifetime earnings inequality, as they existed in a relatively equitable Nordic economy in 1990-2009. This piece of descriptive evidence is useful, as it demonstrates the importance of genetic variation for lifetime earnings in a country where income inequality is perceived to be moderate by international standards. It does not, of course, imply that things could not be or could not have been otherwise. Second, our findings suggest that the genetic heritability of lifetime earnings is somewhat higher for men, especially at the lower end of the earnings distribution. This result bears on the debate on the documented differences between men and women in various economic outcomes during adulthood (e.g., Goldin 2014). The most prominent explanations for them appear to be trends and influences affecting the workings of and outcomes in the labor market (e.g., occupational preferences, fertility) and the apparently long-lasting effects of childhood environment and family background (e.g. Autor et al. 2016; Chetty et al. 2016).
The findings from our auxiliary analyses have implications for what mechanisms are at work. On the one hand, a number of non-cognitive traits, cognitive skills, and other sociopsychological factors have a genetic component and may partly rationalize why genetic factors explain variation in earnings. Our auxiliary analysis suggests (but does not prove) that whatever drives the explanatory power of genetic factors, that power remains when the effect of education, which is a key determinant of people's long-term earnings, is netted out. Moreover, genetic heritability plays a quite similar role in the upper and lower tails of the earnings distribution as it does in the population at large. This suggests that, for example for men, shared family background, which includes e.g. bequests, appears not to dominate variation in the sample in general or at the upper tail in particular. Our data are, however, not large enough for us to explore whether this also holds at the very top (1% or 0.1%) of the distribution. The available Swedish evidence suggests that it does not.
We conclude by acknowledging the limitations of our analysis. Our decompositions allowed genes and environment to have additive and dominant (non-additive) effects, but we assumed that MZ twins experience environments that are similar to those of DZ twins, that there is no assortative mating, and that there is no correlation between genetic factors and the shared environment. Our heritability estimates would be biased upwards if MZ twins experience more similar environments than DZ twins do. The evidence on this appears to be somewhat mixed and context specific, and in any case, most environmental influence for many traits and outcomes appears to be non-shared (Plomin 2011). On the other hand, assortative mating increases the similarity of parents, which in turn increases the genetic similarity of DZ twins (but not that of MZ twins, because they share their entire DNA, irrespective of the similarity of their parents). This biases the heritability estimates downwards and inflates the estimates of the shared environment. It is much harder to sign the bias if there is correlation between genetic factors and the shared environment (Stenberg 2013).
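The direction of the assortative mating bias can be seen in a small worked example. For transparency it uses Falconer's formulas (h² = 2(r_MZ − r_DZ), c² = 2r_DZ − r_MZ) in place of the DF regressions used above, and the true parameter values and the DZ genetic relatedness of 0.55 under assortative mating are hypothetical.

```python
def twin_correlations(h2, c2, dz_relatedness):
    """Implied MZ and DZ outcome correlations under an additive model."""
    r_mz = h2 + c2
    r_dz = dz_relatedness * h2 + c2
    return r_mz, r_dz


def falconer(r_mz, r_dz):
    """Falconer's estimates, which assume DZ genetic relatedness of 0.5."""
    h2_hat = 2.0 * (r_mz - r_dz)
    c2_hat = 2.0 * r_dz - r_mz
    return h2_hat, c2_hat


true_h2, true_c2 = 0.5, 0.0   # hypothetical true values

# Random mating: DZ twins share half of their segregating genes on average.
no_am = falconer(*twin_correlations(true_h2, true_c2, 0.5))

# Assortative mating: DZ relatedness above 0.5 (0.55 is illustrative).
with_am = falconer(*twin_correlations(true_h2, true_c2, 0.55))

print("random mating:      h2=%.2f, c2=%.2f" % no_am)
print("assortative mating: h2=%.2f, c2=%.2f" % with_am)
```

With these inputs the heritability estimate falls from 0.50 to 0.45 and a spurious shared-environment share of 0.05 appears, which is the pattern described above.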