Women in the Top of the Income Distribution: What Can We Learn From LIS-Data?

We explore the extent to which LIS-data can be used to shed light on the presence of women in the top of the income distribution. We show developments of the share of women in top groups (P90-100 and P99-100) of the labour income distribution for 28 countries and, when possible, compare to outcomes when including capital incomes. These turn out not to matter much for the share of women in top groups with some important exceptions. Relating our findings to the existing evidence on women in the top of the income distribution based on aggregate tax data, we find that LIS-data give a relatively accurate picture of the basic findings. However, we also note that once we divide the top1 group further, samples quickly become too small to allow further study. For countries where data allows such analysis, we find that having a partner and having children are positively associated with being in top income groups for men, but negatively associated for women. However, time interactions suggest that these differences have decreased over time. Also, top income men are more likely to have partners who are not in the top of the income distribution while this is not the case for top income women. All these results are surprisingly consistent across country groups.


Introduction
In recent years the so-called top-income literature has received a lot of attention. Following the seminal work by Piketty (2001, 2003), Piketty and Saez (2003), and Atkinson (2005), a large number of studies have shown how important different aspects of the very top of the income distribution are for fully understanding both the recent increase in inequality observed in many countries and its long-run evolution. 1 In particular, this literature has stressed the importance of looking carefully at developments within the top, and also the importance of including all sources of income. For example, the developments of the top1 group are often very different from the rest of the top decile, and a key factor in explaining this difference often turns out to be the role of non-labour incomes.
One very basic aspect of top incomes has, however, received little attention in this literature, namely the gender dimension. If we observe a growing share of total incomes going to top groups, and also that top incomes are different in terms of income composition, it seems natural to also ask questions about the gender composition of this group and to what extent important dimensions of top incomes are different for men and women. What share of the top ten or top one group is made up of women? How has this changed over time? Are top income women and men similar in terms of income composition and in terms of observable characteristics and has this changed over time?
In this paper we explore the extent to which LIS-data can be used to shed light on questions like these. We proceed as follows: we first explain why the gender aspect has not received more attention in the standard top income literature and also discuss how the gender dimension in this literature differs from the extensive work on different aspects of gender inequality in related literatures. We discuss how we select countries and years in the LIS data given the limits that sample size, in particular, puts on the questions that can be asked when focusing on groups as small as the top one percent of the distribution. 2 We then present basic results on the share of women in the top ten and top one group, for labour income, and whenever possible, compare the outcomes to those for labour plus capital income (as well as the total income as defined in the LIS data), for as many countries and years as possible. When possible, we also compare these results to the few recent studies in the top income literature that use individual tax data to study the share of women in top incomes (in particular, Atkinson et al. 2018, and Boschini et al. 2017). Comparisons suggest that our LIS-based results, though not perfectly aligned, in general come very close to the shares of women in top groups observed in comprehensive income tax data, but we note also that there are some important caveats to this overall conclusion.

We then move on to using the richness of the LIS-data to study determinants of the probability that men and women respectively are found in the top of the income distribution. In particular, we study gender differences in some family variables such as children and the effect of having a partner, as well as some characteristics of that partner. We find that having children is associated with a lower likelihood of women being in the top of the income distribution, but with the opposite being the case for men (and with the gap generally increasing in the number of children). We also find that top income women are less likely to have a partner, but conditional on having a partner, it is likely that this partner is also a high-income earner. Top income men, on the other hand, are more likely to have a partner, and this partner is more likely not to be in the top of the income distribution. These associations are surprisingly similar across different country groups. Overall, the results underline the importance of understanding the combination of individual characteristics as well as family circumstances when thinking about what determines the relative presence of men and women in the top of the income distribution. This section, however, also serves as an illustration of how quickly we approach the limit of what can be meaningfully studied in the top of the distribution using LIS-data due to sample sizes becoming too small. We conclude with some remarks on what we find and suggestions for future research.

Why the Delay in Studying Women in the Top Income Literature?
Given the great interest in top incomes, one can wonder why the gender dimension has not received more attention earlier. The main reason lies in the fact that the unit of analysis in the top income literature has been determined by the availability of historical tax data, which for most countries has meant that married couples count as one unit, making the division between (married) men and women problematic. 3 While many countries have, at some point, switched from household based to individual based taxation, the treatment of married couples as one unit continues to be the case in many countries still today. When it comes to answering the main question posed in this literature ("What share of all income is earned by some top group?"), household based taxation creates a problem (albeit with clear boundaries, see Atkinson and Piketty (2007), p. 28-29, for an explanation of how to calculate these boundaries), but when it comes to studying the share of women in the top groups this is not possible without more detailed data about individual incomes.
This explains why the few recent papers that have begun to answer questions about the presence of women in the top of the distribution have focused on countries and time periods when men and women file taxes independently. Atkinson et al. (2018) study the share of women in top income groups, as well as differences in income composition, in eight countries with independent taxation for men and women. They follow the methodology of the top income literature, in terms of defining the reference total for income and population, but then look separately at men and women in the different top groups. Piketty et al. (2018) report results for the share of women in top groups in the US since the early 1960s. However, given joint tax filing for married couples, they are restricted to differences stemming from labour earnings. Similarly, Garbinti et al. (2018) report the share of women in top fractiles of labour income in France starting in 1970. Boschini et al. (2017) study gender aspects of top incomes in the case of Sweden starting in the early 1970s when Sweden switched to independent taxation of men and women. Having access to panel data they are able to also study aspects of gender differences in top income mobility, individual characteristics and family structure of top income men and women respectively. Ravaska (2018) studies similar questions in the case of Finland starting in 1995.
These papers show that there are several important gender dimensions to top incomes. A first insight is that the share of women in the top of the distribution (at least in the countries studied) has grown steadily since at least the 1970s. However, they also show that the top is still far from equal in terms of gender. The share of women in the top10 group has roughly doubled since the 1970s but only to reach around 30%, and the higher up in the distribution we move, the lower is generally the share of women. Another important insight is that income compositions typically differ across genders. In several countries, other types of income than labour income, in particular capital income, make up a larger share of women's income as compared to men's. The time trend, however, is that the income compositions of men and women have mostly been converging. Using individual panel data, Boschini et al. (2017) also show that there are some differences in mobility (women are more likely to fall out of the top) and in some aspects of family characteristics. In particular, they show that the largest difference between top income men and women is not in terms of age or education, but in terms of marital status and partner income. While as many women as men in the top1 group nowadays have a partner, this was not the case a couple of decades ago. In the 1970s, almost all Swedish top1 men were married, in contrast to less than half of the top1 women. Moreover, among those top income individuals having a partner, most of the married top income women in Sweden are married to a man who also has a high income (and virtually none are married to men with low income). For top income men the reverse is true: most married top income men have a wife who is not a top income earner.
Together these papers suggest, first, that there are indeed interesting gender related developments in the top of the income distribution and that these are related to aspects highlighted in the top income literature, such as within top group differences and income composition. Second, the findings in Boschini et al. (2017) suggest that some gender differences in top incomes become apparent only when looking at longitudinal data and individual as well as family characteristics of top earners.


What Does it Mean to Study "Top Income Women"?
The fact that there is little research on gender aspects in the top income literature does not, of course, mean that there has been a general lack of interest in gender differences in the top of the income distribution. On the contrary, some of the most well-known results in gender economics, such as gender differences in executive compensation (e.g., Bertrand et al. 2010; Smith et al. 2013; Keloharju et al. 2016) and the so-called "glass-ceiling" results (Albrecht et al. 2003; Arulampalam et al. 2006; Albrecht et al. 2015) are explicitly about gender differences in the top. Recent work by Guvenen et al. (2014) studies gender dimensions of top wage earners in the US, 1981-2014, and in an overview Marianne Bertrand (2018) summarizes the state of current knowledge in her introduction: "Despite decades of progress, women remain underrepresented in the upper part of the earnings distribution, a phenomenon often referred to as the 'glass ceiling'." It is important to note that these studies are primarily concerned with the top of the earnings distribution. In general, the focus in this literature has been on labour market outcomes, hence excluding capital incomes, which are known to be important especially in the top of the income distribution. Moreover, the population is also typically restricted to the working age population and comparisons of wage gaps are usually made conditional on working full time. Often, when the focus is on detecting potential discrimination it is also natural to control for individual characteristics and sector, etc. For many questions these restrictions are, of course, perfectly sound and even necessary, but for others we may instead want to know the actual total income (from all sources) regardless of the choices underlying the outcome (such as labour supply) and without restrictions on the population. And importantly, gender dimensions may not be the same across these different comparisons.
The bottom line is that these different research strands illustrate how studying the role of "women in top incomes" can clearly mean many different things depending on the questions asked, and how these may overlap when thinking about gender differences in total income. In this paper, as in the top income literature, we focus on women in the top of the individual income distribution. 4 As far as possible, we include income from all sources before taxes and transfers. The reference population is ideally the full adult population (18+). Since we only focus on the gender composition of different top groups, and not on the income share of these groups, the reference total for income will be of less importance but in principle we would like this to be all incomes.

The LIS Data and Its Relation to Top Income Data
Recent work by Gornick et al. (2018) compares the coverage of top incomes in LIS to fiscal data (i.e., data from tax returns) used in the World Inequality Database (WID), for the entire tax population, which depending on the tax code of a country can be either households or individuals. Their study focuses on the US, but also contrasts results for Germany and France. They note that LIS allows them to match the (total) income concept used in WID and based on this they compare income shares of different top groups, as well as the different components (basically pre-tax labour, capital and business income). Their preliminary conclusion is that LIS and WID seem to give very similar answers up to the top1 group, but beyond this LIS seems to underestimate the total income accruing to the top group. This is mainly due to missing non-labour income in the top1 group. The finding confirms what has been noted before in the overview of the top income literature by Atkinson et al. (2011) and studied in more detail by, e.g., Burkhauser et al. (2012) for the US, and in Burkhauser et al. (2016) for Australia, namely that survey data tends to underestimate the incomes in the very top. 5 Our challenges are slightly different. We want to study the development of the share of women in top groups, with the top group being defined as the top of the individual distribution of total income from all sources (labour, business income, and capital) over the full adult population. As mentioned in the introduction this has previously been done in a few papers using the methodology and data typically used in the top income literature. Our aim is to see to what extent we can use LIS data to study this question and how our results relate to findings in these papers. 6

Restrictions Used to Select Data in LIS
An obvious first consideration, well known from previous discussions about the pros and cons of survey data when studying top income shares, has to do with the sample size required to get reliable estimates and, related to this, the problem of top coding. LIS data are not top coded but the sample size varies, making this a first limitation in terms of how many waves and countries can be included. 7 To illustrate the challenge of studying women in top groups using LIS data, consider the following back-of-the-envelope calculation: For a sample size of 50,000 individuals (a relatively large sample in LIS) the top1 percent consists of 500 individuals. Given what we know from previous work, our expectations on the share of women would range from below 10% to 20-30% at most. This translates into some 50-150 women in absolute numbers. Often samples in LIS are smaller. For a 20,000 sample, the absolute number of top income women would be 20-60 women. Clearly, any further study of characteristics of these women (education, employment, age profile, etc.) quickly brings us down to numbers where it is no longer meaningful to proceed.
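The back-of-the-envelope calculation above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name and its parameters are ours, and the 10% and 30% bounds are simply the round figures used in the text.

```python
def expected_top_women(sample_size, top_fraction=0.01,
                       share_low=0.10, share_high=0.30):
    """Rough count of women expected in a top group of a survey sample.

    Returns (size of top group, lower bound on women, upper bound on women).
    The default shares (10% and 30%) are the illustrative bounds from the text.
    """
    top_n = round(sample_size * top_fraction)
    return top_n, round(top_n * share_low), round(top_n * share_high)


# A relatively large LIS sample and a more typical, smaller one.
for n in (50_000, 20_000):
    top_n, low, high = expected_top_women(n)
    print(f"sample={n}: top1 group={top_n}, women between {low} and {high}")
```

Splitting the resulting 20-60 women further, for example by education or age group, leaves cells of only a handful of observations, which is the practical limit discussed in the text.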
Another concern is the coverage of all income sources. In principle, LIS data contain income from all sources (so constructing the equivalent to total income in the top income literature is possible), but not all data are available for all waves. This is especially the case for capital incomes. These are available for most countries in LIS only starting in 2007. Before that only Italy (from 1995), Germany (from 2001), and the US (from 1979) have individual level capital income data. 8 We will contrast the share of women in top groups for the labour income distribution, the sum of labour income and capital income, and the total income distribution whenever data are available. Throughout we use the incomes from the personal-level data (the P-file).
Based on these considerations, we restrict the LIS data samples in the following way. First, we limit our study to the adult population, leaving only individuals over 18 years old in the data. Second, to be able to follow the development of the share of women in top income groups over time, we restrict our study to countries with 5 or more years in the LIS data. This implies a loss of 15 out of the 49 countries covered in LIS. Third, we impose a restriction that the sample size must either exceed 0.0005 times the country's population, or consist of more than 20,000 observations. 9 This leaves us with 223 country-years of a total of 329 in LIS. 10 Finally, for the part of our analysis concerning partner income, we restrict the sample even further, and require at least 5 partnered women in the top1 percent of the labour income distribution for a country-year. In this part of the analysis, we are able to include only 8 countries and 80 country-years.
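The second and third restrictions can be sketched as a simple filter over country-year records. The record type, field names and function below are hypothetical and are not part of LIS; the thresholds are the ones stated in the text, and applying the years restriction before the sample-size restriction is our reading of the order in which they are described.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CountryYear:
    # Hypothetical record describing one LIS wave for one country.
    country: str
    year: int
    n_obs: int        # number of adult observations in the sample
    population: int   # country population in that year


def select_country_years(records, min_years=5,
                         min_pop_fraction=0.0005, min_obs=20_000):
    """Keep country-years that satisfy the restrictions described above."""
    years_per_country = Counter(r.country for r in records)
    kept = []
    for r in records:
        # Restriction 2: the country must have at least `min_years` waves.
        if years_per_country[r.country] < min_years:
            continue
        # Restriction 3: the sample must exceed 0.0005 times the population
        # or consist of more than 20,000 observations.
        if r.n_obs > min_pop_fraction * r.population or r.n_obs > min_obs:
            kept.append(r)
    return kept
```

The additional requirement of at least 5 partnered women in the top1 group would be a further filter of the same form, applied once the top groups have been constructed.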
In order to construct the shares of women in top income groups for a particular country, we merge personal-level data files, and bottom-code negative income to zero, whether labour, total, or labour plus capital. For each country and each year, we then weight the observation by inflated population weights and obtain cut-off points for the joint income distribution of both men and women. We use these cutoff points to classify an individual as belonging to a particular percentile group of the income distribution focusing on the top10 and top1 groups.
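As a minimal sketch of the procedure just described, the following Python functions bottom-code negative incomes, compute a population-weighted cut-off on the joint distribution of men and women, and return the weighted share of women above it. The function names are ours, the inputs are plain lists rather than LIS variables, and both the cut-off definition (the smallest income at which the cumulative weight reaches the quantile) and the treatment of ties (only incomes strictly above the cut-off count as top) are assumptions made for illustration.

```python
def weighted_cutoff(incomes, weights, quantile):
    """Smallest income at which cumulative weight reaches `quantile` of the total."""
    pairs = sorted(zip(incomes, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for income, w in pairs:
        cumulative += w
        if cumulative >= quantile * total:
            return income
    return pairs[-1][0]


def share_of_women_in_top(incomes, weights, is_female, quantile):
    """Weighted share of women among individuals above the `quantile` cut-off.

    quantile=0.90 corresponds to the top10 group, 0.99 to the top1 group.
    """
    # Bottom-code negative incomes to zero, as described in the text.
    floored = [max(0.0, y) for y in incomes]
    cutoff = weighted_cutoff(floored, weights, quantile)
    top_weight = 0.0
    women_weight = 0.0
    for y, w, female in zip(floored, weights, is_female):
        if y > cutoff:  # assumption: ties at the cut-off are not in the top group
            top_weight += w
            if female:
                women_weight += w
    return women_weight / top_weight if top_weight else float("nan")
```

Calling `share_of_women_in_top(income, weight, female, 0.99)` once per country-year would give one point in a top1 series; with small samples the top group may contain very few observations, so single values should be read with caution.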
For the part where we study top income women and their partners, we rely on variables in LIS describing partnership, relation to the household head, and age. We classify an individual as having a partner if the LIS partner variable indicates him or her as having a partner (100), as living with a partner (110) or as not living with a partner (120). 11

The Development of the Share of Women in Top Income Groups: What Does LIS Data Show?
Given the minimum requirements on sample size we arrive at 28 countries for which we have relatively comparable observations of individual labour income (including self-employment income) since at least the early 1990s and in many cases since the late 1970s. For most countries we also have individual capital income starting in 2007, but only in three cases do we observe individual capital income before 2007, and only in one of these, the US, do we have both individual labour income and capital income starting in the late 1970s (the other two with individual capital income before 2007 are Germany and Italy). We start by looking at the share of women in top10 and top1 groups of the labour income distribution. Below the different country developments have been divided into five country groupings: Anglo-Saxon countries, Continental European countries, Scandinavian countries, Eastern European countries, and "rest-of-the-world" countries (this group consisting of Israel, Taiwan, Paraguay, and Mexico). The groups are similar to those in the overview of top income developments by Roine and Waldenström (2015), which in turn are based on common patterns noted already by Atkinson and Piketty (2007). 12 In addition, the grouping also makes sense from an economic gender inequality perspective. For instance, once men and women have equal rights in society, it is reasonable to, as a first step, group countries according to their extent of female labour force participation (FLFP). FLFP reflects not only how advanced a country is in terms of industrialization, but also parental leave and child care policies that enable mothers (traditionally the prime caretakers) to continue earning a wage after having children. The Scandinavian countries already had high female labour force participation in the 1970s, since the early expansion of high-quality child care enabled mothers to work while having small children.
Another group of countries where mothers worked early was in Central and Eastern Europe, where these countries' joint Communist past has left a legacy of a relatively high rate of FLFP. The Anglo-Saxon country group shares a similar institutional set-up in terms of rather limited maternal leave benefits (and limited, if any, paternal leave provisions), and is also characterized by high child care costs. 13 This institutional frame generally hampers the possibility for families with small children to be dual-earner households. Moving on to Continental Europe, most countries have had a relatively low FLFP rate in the past decades, except for France. The main difference between France and the rest of the continental group has been in France providing public day care solutions that allow parents to work full time. Finally, the four remaining countries in our sample (Israel, Mexico, Paraguay, and Taiwan) are not only geographically scattered but differ also from an economic gender inequality perspective. While Israel has a high rate of FLFP, Mexico is one of the countries in the OECD with the largest gender gap in employment. Therefore, we will refrain from making any group-specific analysis of this group. For further updated details on the status of economic gender inequality, see OECD (2017). Figure 1 shows the development of the share of women in the top10 group (left) and the top1 group (right) for the five English-speaking countries in our sample. The overall picture is clear. The share of women is far from equal to that of men, but it has at least doubled in the two top groups since the early 1980s, from low levels to around 25-30% women in P90-100 and to around 15-20% in the P99-100. Figure 2 shows the same development for nine continental European countries.
The overall trend is similar: growth of the share of women in the order of a doubling (or tripling) since the 1980s, and levels in the most recent waves around an average of 25-30% in the top10 and 15-20% in the top1 group. A noticeable, interesting difference here is that the spread is larger with some countries, like Spain, France and Greece, being at or above 30% women in the top10 group, while countries such as the Netherlands, Switzerland and Germany are around 20 or below.
13 See Adema et al. (2015) for more details.
Scandinavian countries also display similar trends and interestingly enough these countries, known to be comparatively gender equal, do not display higher shares of women in the top (see Fig. 3). If anything, the values at least for Denmark, Norway and Sweden are low compared to other countries (as also shown in Boschini et al. 2017, and discussed in Boschini and Gunnarsson 2018).
Looking at the group of Central and East European countries in our sample (shown in Fig. 4), we note that they display on average higher levels of women in top income groups. This is well in line with the legacy of former communist countries being relatively gender equal, at least when it comes to labour market participation. Figure 5 shows that the trend has been remarkably similar for countries as different as Israel, Mexico, Paraguay and Taiwan. The share of women in the top10 group has increased over time from low levels to around 30% in 2015, while the share of women in the top1 has experienced a less pronounced positive development so as to arrive at around 20% in 2015.
Overall, there seems to be a relatively common trend across countries and the orders of magnitude are also relatively similar. In general, women's share of the top of the labour income distribution has increased a lot since the 1980s. There is also quite consistently a fanning out of the share of women, in the sense that there are consistently fewer women higher up in the distribution. But there are also some interesting differences across country groups, most notably the Central and Eastern European countries having the highest shares of women both in top10 and top1 (and also relatively similar in the top10 and top1 groups). The relatively low share of women in top1 in the Nordics is also notable.
The differences across country groups are in many ways in line with the results in the gender economics literature. Research in recent decades has shown that there is a complex relation between female labour force participation and the varying gender gaps across the earnings distribution, beyond the relevance of family policies (more about this in Sect. 4.2 below). On average, the mean gender gap has been shown to be negatively correlated with the rate of FLFP, which might at first appear surprising. But as shown in Olivetti and Petrongolo (2008), the European countries with a low gender wage gap tend to have strongly selected FLFP in that women in those countries either work out of necessity to support themselves or to pursue their careers (and have the means to arrange for private child care). As also emphasized in Arulampalam et al. (2006) the countries in Europe having a generous welfare state in terms of child care provision tend to have high glass-ceilings. The suggested mechanism goes through part-time work: Since mothers tend to retain the lion's share of the responsibility for children, publicly provided child care enables them to be on the formal labour market, but only part-time; see e.g. Olivetti and Petrongolo (2017) and Johnsen and Løken (2015) for a more in-depth account of the mechanisms. Another way of capturing women's top earnings potential is to analyse the share of female managers, which varies considerably over country groups. These patterns match the extent of glass-ceilings, with Anglo-Saxon countries, and other countries with low public provision of child care, having a higher fraction or at least the same fraction of female managers as the Scandinavian countries.
In relation to these patterns, the high levels of women in top groups in Eastern Europe are interesting given these countries' history of high labour force participation and a state socialist legacy of less traditional gender roles with respect to work (which remain in the population today; see Campa and Serafinelli 2018). The share of female managers in these countries is also comparatively high. 14 Figure 6 shows the average share of women in the top10 and top1 in the labour income distribution across the five country groups. The average patterns are surprisingly similar with the Eastern European observations being the positive outliers, and with the Continental European and Nordic countries displaying the lowest shares.

Differences Between Labour Income, Labour Plus Capital Income, and Total Income
As mentioned above, an important finding in the top income literature, especially historically, is that top income shares and their development depend a lot on capital incomes. Several papers studying long-run developments show that much of the decline of top shares in the first half of the century turns out to be driven by diminishing capital incomes in the top, while the top share of the wage bill in this period did not change much. In recent decades the picture is more mixed. This, of course, raises the question to what extent the share of women in top groups varies when including capital. As also noted above, LIS data has some important limitations with respect to this. Only after 2007 do we observe individual capital income for most countries that we study. 15 This means that for all observations after this we can compare the share of women in the top1 and top10 in the distribution of total income (now always including capital income) and corresponding shares in the labour income distribution. Figure 7 plots these shares against one another. 16 The shares correspond surprisingly well to each other, and there is no clear pattern in over- or understating the share of women depending on which distribution is used. If anything, there appears to be a tendency of the top1 share of women being somewhat more sensitive to the income measure used. In 29 out of the 64 observations, the top1 share of women is larger in the labour income distribution, with the maximum difference being 0.05 percentage points. In the remaining 35 observations, the maximum understatement is 0.04 percentage points when using the labour income distribution instead of the total income distribution.
Fig. 6 Average share of women in top groups in the labour income distribution for the five country groups
Overall, it seems the top share of women in the labour income distribution serves as a good proxy for the top share of women in the total income distribution for this period and for this set of countries suggesting the income composition is not too different between men and women.
For the few countries where we have longer time series including capital incomes, however, we do see some differences when going back in time. Figure 8 illustrates this for the United States, for which the longest time series of capital income in LIS is available. It shows the share of women in top10 (left) and top1 (right) of the labour income distribution, of the total income distribution and of the distribution of the sum of capital and labour income. It suggests that the share of women in the top10 is consistently a few percentage points lower in the labour income distribution up until recent years when the difference is basically gone. The same is the case, but even more pronounced, for women in the top1 group. In line with the suggestive evidence in Edlund and Kopczuk (2009), the top1 share of women is higher in the 1980s in the US when using measures of total income (including capital) rather than labour income.
In "Appendix B" we present analogous graphs for all countries in our sample, and it appears as though the discrepancy between top1 share of women in the labour income distribution and in the total income distribution is rather limited in recent years. 17

Comparing LIS to Top Income Results on Women in Top Shares
Given the issues discussed above, it is interesting to see how the overall results compare to the previous studies done on countries where individual level income tax data is available for the full population (or in some cases larger samples). In Fig. 9 we display our series (bold lines, solid for top10, dashed for top1) over the series in Atkinson et al. (2018) and the series for Sweden in Boschini et al. (2017). The overall trends and levels are very similar but there are also some important differences especially for individual years where the LIS-based series fluctuate. The clearest discrepancy is Australia where LIS data shows a decline in the share of women in the top1 group starting around 1995 to a level below 10 percent. This pattern is very different from what is found in Atkinson et al. (2018) where the trend is positive throughout and the level in the most recent year is above 20%. Canada also shows a similar discrepancy, though not as marked.
17 Total income here is the total income concept defined in LIS. Given the difference in, for example, coverage of capital income these are not fully comparable over time.
Our overall interpretation is that even the shares for top1 in LIS can be taken as at least suggestive of how large the share of women is in the top, and also as giving a reasonably accurate picture of the long run trend. One should not, however, interpret individual year fluctuations, as these might just as well be a result of having a small sample. This is in line with what is found in the comparison between LIS and WID in Gornick et al. (2018). But studying finer details of gender differences within the top1 group is too demanding given the sample-based nature of most LIS data.

What Determines the Share of Women in Top Income Groups?
A large literature has studied various mechanisms that could explain the lack of women in the top of the distribution, as well as the cross-country and time differences in this. Several papers, however, suggest that the women in the top of the labour income distribution are relatively similar to their male peers in terms of education, occupations and sector. If anything, women tend to be somewhat younger; see Bertrand et al. (2017), Guvenen et al. (2014) and Keloharju et al. (2016). A potential explanation for the relative lack of top income women is that they have often been faced with having to choose between having a family and a career. Those women choosing the career path also tended to remain childless to a greater degree; see e.g. Goldin et al. (2006), Bertrand et al. (2010), Boschini et al. (2011), Goldin (2014). 18 Another type of explanation has to do with the impact of children for those who have them. Angelov et al. (2016), Kleven et al. (2018) and Keloharju et al. (2018) show, for different Nordic samples, that the event of the first child severely hampers women's wages and future careers compared to those of their male partners. Kleven et al. (2019) show that this extends to Germany, Austria, the UK and the US as well. In short, the nexus between having a career, a marriage and children is highly complex for women (and increasingly for men), since both partnership and fertility decisions are endogenous to succeeding in the labour market. Moreover, as argued already by e.g. Mincer and Polachek (1974) and Becker (1985), women's educational choice could also in the first place be endogenous to women's aspirations of being the prime caretaker later in life. In this paper we disregard these complexities and limit ourselves to exploring the descriptive relation between income status and educational level, marriage and fertility (in terms of parity).
Guided by some of these previous papers we will look at three things in turn. First, we estimate a linear probability model of how the likelihood of men and women being in the respective top groups is associated with variables such as age, education, having a partner, and having one or more children. Second, we look at how social norms concerning whether women should stay at home when children are young relate to the share of women in top groups. Third, we study asymmetries in how the likelihood of being in the top income group is related to having a partner, as well as, conditional on having a partner, where in the income distribution this partner is located.

Gender Asymmetries in the Probability of Being a Top Earner
To explore some key associations, and in particular the difference between men and women, we run a linear probability model to study how the likelihood of being in the top10 and top1 groups, respectively, is related to a number of socio-economic factors, controlling for country and time fixed effects. 19 Table 1 shows the results for a pooled regression for all countries and years in our four main groups (Anglo-Saxon, Continental Europe, Central and Eastern Europe, the Nordic countries, and the residual group consisting of Israel, Mexico, Paraguay and Taiwan). As one would expect, age is positively associated with the likelihood of being in the top group, but decreasingly so (age squared being negative), for both men and women. Education is also positively associated with being in the top groups, and the size of the coefficient is larger for higher education than for medium education (with low education being the base case). These effects are strongly statistically significant and larger in magnitude for men than for women. Having a partner is associated with an increased probability of appearing in the top income groups for men, while the reverse is true for women. Finally, the effects associated with children are also different for men and women. For men, having children is associated with a larger probability of being in the top, and more so for more children. For women, however, the reverse is true. This is in line with the findings in previous literature, e.g. Goldin (1997, 2006) and Boschini et al. (2011), where until recently women in practice had to choose between pursuing a career and children, while male fertility increased with economic success.
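As a reading aid, the linear probability model described above can be sketched as follows. The notation is an assumption made here for illustration and is not taken from Table 1; in particular, entering the number of children as a set of indicators is one plausible reading of the text.

```latex
\[
\Pr\!\left(\mathrm{Top}^{g}_{ict}=1 \mid X_{ict}\right)
= \alpha
+ \beta_{1}\,\mathrm{age}_{ict}
+ \beta_{2}\,\mathrm{age}_{ict}^{2}
+ \gamma_{1}\,\mathrm{edu}^{\mathrm{med}}_{ict}
+ \gamma_{2}\,\mathrm{edu}^{\mathrm{high}}_{ict}
+ \delta\,\mathrm{partner}_{ict}
+ \sum_{k \ge 1} \theta_{k}\,\mathbf{1}\!\left[\mathrm{children}_{ict}=k\right]
+ \mu_{c} + \lambda_{t}
\]
```

Here $i$ indexes individuals, $c$ countries and $t$ years, $g$ denotes the top group (top10 or top1), low education is the omitted category, and $\mu_{c}$ and $\lambda_{t}$ are country and time fixed effects. In this notation the findings reported above correspond to $\beta_{1} > 0$, $\beta_{2} < 0$ and $\gamma_{2} > \gamma_{1} > 0$ for both sexes, while $\delta$ and the $\theta_{k}$ are positive for men and negative for women.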
Pooling the data like this, of course, runs the risk of missing a lot of cross-country (or country group) variation. However, when we run the same regression for the individual country groups the effects turn out to be surprisingly similar. Figure 10 illustrates the coefficients when running the above regression for the four country groups separately. To the left are point estimates for men, to the right those for women (P90 above, P99 below). With few exceptions the estimates are very similar (full regression tables are in "Appendix C").
While the above regressions include country and time fixed effects, we have not explored the possibility that some variables may have different effects over time. To do so, we interact the indicators for number of children and education with time dummies (one for each wave of LIS data) and repeat the analysis above separately for men and women. Coefficients remain qualitatively the same, but the added interaction effects suggest that the differences between men and women are somewhat decreasing over time. For men, the 'child premium', observed earlier in both P90 and P99 groups, tends to decrease over time for all numbers of children, as the interaction terms are negative and increase in magnitude. Similarly, the 'child penalty' observed for women is also decreasing over time, as the interaction terms remain positive and increase in magnitude. The effects for education categories, shown in Fig. 11, are similar. For men in both income groups, the education premium they enjoyed compared to women seems to wane over time, especially so for men in P90. For women, the effect remains at a reasonably constant level throughout the period.
Overall, the results from the pooled regressions including time interactions are again surprisingly similar if we study groups of countries separately, with a notable exception in the Scandinavian countries, where the 'child penalty' for women was barely present to begin with (rather, the magnitude of the premium for children was smaller for women than for men). Of course, one should be careful interpreting these results, as the interaction terms do not represent the marginal effects and cannot be interpreted directly. 20 More importantly, these correlations are all purely descriptive, and careful econometric modelling, beyond the scope of this paper, is required to shed light on any potential causal mechanisms.

Table 1 Linear probability regression of the likelihood of being in a top group for the pooled sample of all observations in the main country groups

Social Norms and the Share of Women in Top Income Groups
It has also been suggested that conservative social norms prescribing motherhood rather than career ambitions for women might be holding women back; see Bertrand (2011) and Ponthieux and Meurs (2015) for excellent overviews. Using questions in the World Values Survey on gender roles, Fortin (2005) indicates that there is a close relationship at the country level between female labour force participation and the share of the population agreeing with the statement "Being a housewife is just as fulfilling as working for pay". Moreover, in a recent paper Kleven et al. (2019) suggest that social norms might be even more important than changes in social policies for the wage penalties that mothers experience on the labour market. They measure social norms by the share of the population agreeing that mothers should work full-time or part-time outside the home when they have small or school-age children, using data from the International Social Survey Program (ISSP). 21 To explore the extent to which social norms might influence the share of women in top income groups, we construct a variable similar to that used by Kleven et al. (2019) from the 2012 wave of ISSP. More precisely, our social norm variable captures the share of the respondents who agree with the statement that "Women should work full-time outside the home when there is a child under school age". The underlying assumption is that to be a high-income earning woman (or man) you need to be working full-time most years (even if you have children under school age). Figure 12 plots the correlation between this measure of social norms about working mothers with small children and the share of women in the top10 and the top1 groups of the total income distribution in 2012 (wave 9 in LIS).

Top Income Earners and Their Partners
As our regression results in Sect. 4.1 suggested, living with a partner is, on average, more likely for men than for women in our top groups. Given the increased female labour force participation and relatively lower gender wage gaps over the last decades, it becomes particularly interesting to explore how the share of top men and top women with a partner has evolved. Moreover, given the increase in dual-earner households in general, could it be the case that more women in the top have partners who also belong to the top groups in their own right? Studying partner choice and family composition in the top income groups with LIS data is challenging since the sample sizes are relatively small, and we easily end up with too few observations for any analysis to be meaningful. Not only are there few women in the top group to start with, but not all of them have partners (and this has changed over time). In order to have longer time series, we focus on the top men and top women in the labour income distribution. Figure 13 summarizes, by country group and wave, the share of top earners having a partner. ("Appendix D" contains the corresponding graphs by single countries.) First of all, we notice that the shares of top men and top women having a partner have, on average, converged over time in the top10 group of the labour income distribution. Especially in the Scandinavian countries in recent years, top10 men and women are almost as likely to have a partner. Back in the 1980s (for those countries where there is LIS data) only around half of the top women had a partner, compared to at least 90 per cent of the top men. Among top1 men and top1 women, there is a slight tendency towards convergence, in that somewhat more top1 women on average have a partner while somewhat fewer top1 men do not have a partner in recent waves. But the difference between the shares of top1 men and top1 women who have a partner is still at least 10 percentage points today.
Fig. 13 Share with partner in top10 and top1, women and men, for the respective country groups

Yet another type of difference that has received relatively less attention until recently is the possibility that the choice of partner and asymmetries in relationships may matter a lot, especially for top income women. There is, of course, a large literature on assortative mating in general (see Greenwood et al. 2017 for an overview). But of more specific concern for the top, Bertrand et al. (2017) suggest that there is a norm prescribing women to earn less than their partners, and Folke and Rickne (2019) indicate that successful career women are likely to face a divorce as a consequence of their promotions. Our LIS sample can give an indication of whether top persons' partners also are top income earners or not. 22 However, the number of countries having a sufficient number of partnered women in the top during several waves is slim. Canada, Denmark, Finland, Norway and the US turn out to be the only countries having large enough samples for such an exercise to be meaningful. Figure 14 shows where in the labour income distribution the partners of the top men and top women are, when we pool the share of partners over these five countries, allowing for the partner to be either in P0-90, P90-99 or in P99-100. ("Appendix E" contains the corresponding country-specific graphs for Canada, Denmark, Finland, Norway and the US.) What stands out is that consistently a vast majority of top men, regardless of whether they are in top1 or top10 of the labour income distribution, have a partner that is not in the top10. For top women, that is not the pattern at all. To the extent that top women have a partner, they are more likely than men to have a high earning partner. These patterns are relatively stable over time. This is consistent with the patterns found for Sweden with register data by Boschini et al. (2017).

22 In a related paper Aaberge et al. (2018a, b) use LIS data to study assortative mating in relation to "perfect matching" and random matching, respectively.
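The classification of partners into P0-90, P90-99 and P99-100 described above can be illustrated with a minimal sketch. The function name, the unweighted shares and the convention that the thresholds are passed in as precomputed percentiles of the overall labour income distribution are all assumptions made for illustration.

```python
import numpy as np

def partner_bracket_shares(partner_income, p90, p99):
    """Shares of partners located in P0-90, P90-99 and P99-100.

    partner_income : labour incomes of the partners of top earners
    p90, p99       : thresholds of the overall labour income distribution
                     (assumed to be computed elsewhere, on the full sample)

    Assumption: unweighted shares; sampling weights are ignored.
    """
    partner_income = np.asarray(partner_income, dtype=float)
    if partner_income.size == 0:
        raise ValueError("no partnered top earners in this cell")
    below_top10 = partner_income < p90
    in_top1 = partner_income >= p99
    # Everyone who is neither below P90 nor in the top1 is in P90-99.
    in_next9 = ~below_top10 & ~in_top1
    n = partner_income.size
    return {
        "P0-90": float(below_top10.sum() / n),
        "P90-99": float(in_next9.sum() / n),
        "P99-100": float(in_top1.sum() / n),
    }
```

Applying the function separately to the partners of top men and of top women, per wave, would give shares of the kind compared in the text.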

Summary and Concluding Remarks
This paper has explored what we can learn about gender differences in top incomes using LIS data. Overall, the main limitation is that the samples are rarely large enough to allow careful study of women in the top group. This is, as shown above, especially true for questions that require a further division of, say, the top1 group by characteristics. Nevertheless, we think that our findings provide some novel insights.
First, we think that our overall trends for the share of women in the top10 and top1 groups give a reasonably accurate picture of both the level and the trend of this development since the 1980s. This is confirmed by comparing our results in this paper to previous findings that study a smaller number of countries (eight) but where the studies have used data on the full population (or very large samples). Studying 28 countries, we provide series that suggest that while women's share in top groups has increased (roughly doubled) over the past three to four decades, the representation of women in the top is still far from equal. The share of women in the top10 is around 25-30% in most countries, the maximum being above 40% in Slovenia and the minimum being around 15% in Switzerland.
Second, for recent years when we can compare the shares of women in top groups across distributions of labour income and total income (including capital income), we find that the shares are not affected much (the maximum deviation being 4 percentage points, but most observations being very similar). Also, there is no clear pattern of over- or understating the share of women depending on which distribution is used. However, for the US we can analyse this starting in 1979, and here we find a marked difference in that women's share in top1 was much higher when including capital income in the income concept. This suggests that even if recent observations show small differences depending on the income concept, this is not necessarily representative of historical periods.
Third, regression results suggest that many associations between socio-demographic variables and the probability of being in the top income groups are different for men and women. Having higher education is positive for both, but more so for men. Interestingly, variables such as having a partner and having children have opposite signs for men and women. These family related variables are negatively associated with the probability of being in the income top for women while the opposite is the case for men. We also find that social norms, such as having a positive attitude to women working when children are young, are positively related to the share of top income women. Finally, when looking at income characteristics of the partners of top income earners, we find that top income men are much more likely to have a partner who is not in the top income groups, while this is not the case for women. Such asymmetries are likely to impact the ease of focusing on a top career differently between men and women.
Even though samples are small, the cases where it is feasible to study this show very clear consistent patterns to this effect across all countries in our data. Hopefully future studies on larger samples will be able to shed more light on this question.
The last point illustrates an important general gap to be filled in future studies of top income men and women. On the one hand, we need large samples (or preferably the full population) to observe sufficiently many top income individuals to study their characteristics, but we also need information on their family and household characteristics to fully understand who succeeds and who does not. Identical individuals in terms of observable individual characteristics may have very different family situations, with important consequences for their individual success. Understanding these interactions seems like an important avenue for future research.

To gauge the number of women observed in the top10 group, recall that with the same share of women in top10 and top1 it would be ten times as large, but given that the share of women in the top10 group is typically 1.5-2 times the share in top1, it is more like 15-20 times the above number
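The factor of 15-20 in the note above follows from simple arithmetic. As a worked sketch, with notation assumed here for illustration: let $N$ be the sample size, $s_{10}$ and $s_{1}$ the shares of women in the top10 and top1 groups, and $W_{10}$ and $W_{1}$ the corresponding numbers of women observed.

```latex
\[
\frac{W_{10}}{W_{1}}
= \frac{s_{10}\cdot 0.10\,N}{s_{1}\cdot 0.01\,N}
= 10\,\frac{s_{10}}{s_{1}},
\qquad
\frac{s_{10}}{s_{1}} \in [1.5,\,2]
\;\Longrightarrow\;
\frac{W_{10}}{W_{1}} \in [15,\,20].
\]
```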