Documenting occupational sorting by gender in the UK across three cohorts: does a grand convergence rely on societal movements?

We consider the extent to which temporal shifts have been responsible for an increased tendency for females to sort into traditionally male roles over time, versus childhood factors. Drawing on three cohort studies, which follow individuals born in the UK in 1958, 1970 and 2000, we compare the shift in the tendency of females in these cohorts to sort into traditionally male roles compared to males, to the combined effect of a large set of childhood variables. For all three cohorts, we find strong evidence of sorting along gendered lines, which has decreased over time, yet there is no erosion of the gender gap in the tendency to sort into occupations with the highest share of males. Within the cohort, we find little evidence that childhood variables change the tendency for females of either the average or highest ability to sort substantively differently. Our work is highly suggestive that temporal shifts are what matter in determining the differential gendered sorting patterns we have seen over the last number of decades, and also those that remain today. These temporal changes include attitudinal changes, technology advances, policy changes and economic shifts.


Background
For decades, economists have contributed to the literature that seeks to explain the gender wage gap. 1 A well-accepted conclusion is that the lack of women in high-paying, male-dominated professions is one major cause of this gap (Bayard et al. 2003; Goldin 2014; Blau and Kahn 2016). This has led to a search for the underlying causes of gender-based sorting. Explanations include differential human capital investments (Altonji and Blank 1999), discrimination (Becker 1957), a lack of flexibility to combine a career and family in male-dominated jobs (Goldin 2014; Bertrand 2018) and differences in tastes and preferences (Lordan and Pischke 2022; Cortés and Pan 2017). In this work, we build on this literature and consider the extent to which temporal shifts, rather than childhood factors, have been responsible for an increased tendency for females to sort into traditionally male roles over time. Economists have already shown that such childhood factors shape the future successes of children in other life domains. 2 Previous works have also demonstrated that gender gaps in childhood skills are key determinants of the different occupational choices and college major choices of men and women. 3 Our goal here is therefore to examine the extent to which gender predicts sorting patterns across time, and to ask whether this is moderated by childhood factors, such as early health, socio-economic status, parental investments and aspirations, characteristics of schooling and the child's own ability. 4
The importance of large temporal movements for gendered sorting is intuitive. A convincing narrative points to the fact that until the 1970s, there were many more male-dominated roles than today. Between 1970 and 2018, some stylised facts emerged. First, females sorted into many occupations that were traditionally male-dominated. Examples include law, accountancy and pharmacy.
Second, females failed to converge into other occupations, such as those requiring science, technology and engineering, where the share of males in the UK, US and across the EU still exceeds 80%. 5 Third, the revolution has been asymmetric, with males failing to sort into traditionally female occupations, such as social work, nursing and primary school teaching. These stylised facts are also visible in Fig. 1, which plots the occupations of three cohorts born in 1958, 1970 and 2000 in the UK, respectively. 6 There is a downward trend in the share of males in law over time, but no real change in engineering. The proportion of males in nursing is flat over the three periods, with the share of males planning to go into teaching decreasing to even lower levels for the most recent cohort. This highlights that the asymmetric gender revolution remains in place.
Gendered sorting is intuitively influenced by temporal movements. For example, human capital investments, both in type and quantity, are affected by social norms. Systemic changes in attitudes over time can cause females to invest in different career paths if preferences are shaped by these norms. For example, the tolerance of discrimination has changed radically over the last five decades. 7 A systemic movement which causes males and females to share family responsibilities more equally removes constraints for females, allowing them to have wider career choices. The influence of tastes and preferences on gendered sorting is also now being explored (Filer 1986; Lordan and Pischke 2022; Cortés and Pan 2017).
Fig. 1 note: For the NCDS (1958) and BCS (1970), the share of men in an occupation is calculated from the occupation the cohort member held at age 33 and 34 years old, respectively. For the MCS (2000), the occupation is the one cohort members (at age 11) aspired to hold when they turn 30 years old.
In addition, across their lifecycle, males and females interact with these social norms, updating their preferences, beliefs and decisions accordingly. For example, one cause that has been identified for women having lower pay and less access to senior leadership roles is their tendency to negotiate less than men (Biasi and Sarsons 2021; Dreber et al. 2022). However, there is also evidence that women may choose not to negotiate because they are aware of the social expectations towards them and fear societal backlash (Kray and Gelfand 2009), and that they face differential treatment when they do negotiate (Bowles et al. 2007), causing them to negotiate less in the future. It is easy to imagine these dynamics playing out across a range of behaviours of both males and females, conditioning or resigning individuals to behave more in line with social norms in their future interactions.
6 These cohort data are used in this paper and are subsequently described. For the 1958 and 1970 cohorts, occupations are measured based on data collected when they were in their early 30s. For the cohort born in 2000, occupations are measured based on aspirations reported at age 12.
7 Becker (1985), Katz and Murphy (1992) and Goldin (2006) have all suggested that the effects of gender discrimination are now much less relevant than other factors in explaining occupational segregation, as compared to previous time periods.
While these papers suggest that tastes and preferences have a role to play in occupational sorting, these tastes and preferences may themselves be socially constructed. This fits with the idea that individual decisions are influenced by the opinions of others, which ultimately shape identity (Akerlof and Kranton 2000). Temporal changes in sorting patterns can also arise because the tasks within an occupation have changed over time, making them less physically demanding and deemed more suitable for women (Black and Spitz-Oener 2010; Yamaguchi 2018; Alesina et al. 2013). In addition, temporal changes in sorting patterns can arise because of technology shifts. For example, Goldin and Katz (2002) demonstrate the role of the contraceptive pill, which allowed women to delay fertility, in women's career choices. Major policy changes have also been shown to affect sorting by gender, such as the introduction of credentials through the use of degrees and licences in the US in the 1970s and 1980s, which gave women objective signals with which to demonstrate their suitability for jobs that previously had low shares of females.
At a more micro-level, gendered sorting has the potential to be influenced by childhood variables, given that experiences in that period vary by gender. So, why do experiences vary? First, some people may have preferences for a particular gender, or indeed have a preference for engaging children in different activities depending on whether they are a boy or a girl. 8 Second, people may hold certain beliefs that boys and girls have different production functions and, as a result, decide to engage children in different gendered activities. 9 This concept also corresponds to a theory of biased choices (see, for instance, Berger et al. 1966; Oxoby 2002), whereby parents or children themselves make decisions that are shaped and biased by their hierarchical social characteristics. Third, there may be differential monetary and/or opportunity costs of engaging with boys over girls in specific activities. These explanations are not mutually exclusive but, together, can largely capture the underlying causes of why boys and girls are exposed to different experiences that ultimately may shape their futures. Examples of differential treatment by gender abound in the literature. 10 Differential treatment has the potential to impact cognitive development, including soft skills and motor skills.
We are interested in the extent to which these experiences, which vary within cohorts, change gendered sorting patterns, as compared to systemic changes, which occur across cohorts. Drawing on three British cohort studies, which follow children born in Britain in 1958, 1970 and 2000, we compare the generational shift in the tendency of females to sort into traditionally male roles to the combined influence of a large set of childhood variables. We consider a set of childhood variables exhaustive enough to cover those that can reasonably be expected to be correlated with both gender and sorting patterns. These childhood variables capture cognitive, soft and motor skills, alongside socio-economic variables, health status, parental influences and peer influences. We consider several proxies that capture the differential aspects of traditionally male jobs. For those individuals born in 1958 and 1970, these proxies are based on their occupations in their early 30s. For the individuals born in 2000, the proxies are based on their aspirations for the future (when they turn 30 years old). These proxies range from the share of males in an occupation to variables which capture an occupation's content. We acknowledge that we do not capture the universe of childhood variables, so we cannot rule out every childhood factor that could possibly determine occupational sorting. However, our analysis does capture a large array of childhood variables, allowing us to consider many more measures of early life skills and preferences as inputs into the gendered occupational sorting process than are commonly considered in this literature.
Several stylised facts emerge from our analyses. First, for all three cohorts, we find strong evidence of sorting along gendered lines, regardless of the childhood variables that we include in our regressions. Second, the tendency to sort along gendered lines has decreased substantively over time. That is, the gender gap has narrowed across birth cohorts. Conversely, we find little evidence that childhood variables can considerably change the tendency for an average female to sort substantially differently within a cohort. Third, the same conclusions emerge even if we focus only on individuals with the highest childhood cognitive ability. These are the individuals who we expect to be able to move against gender stereotypes and subsequently sort into the top jobs. We view our work as underlining the importance of temporal shifts, over and above the role of within-cohort childhood variables that are observable in our samples, in determining the changes in gendered sorting patterns over the last number of decades, and also those that remain today. However, our work cannot pinpoint what exact temporal shift did the heavy lifting.
In all likelihood, it is a variety of structural shifts at the level of society that contributed to the trends that we observe today. This includes attitudinal, economic, policy and technological changes; as well as how quickly individuals interact with these shifts so that tipping points are reached. Other potential forces at play are the changing nature of tasks in many occupations, lower tolerance levels towards discrimination in the workplace, differential growth patterns across industries and the ability for women to delay their fertility.

Analytical framework and empirical design
We are interested in the extent to which a female's tendency to pursue traditionally male roles is influenced by childhood factors versus unobservable underlying factors that reflect societal shifts. The outcome variables considered are proxies for aspects of work where we expect the sexes will bifurcate in sorting tendencies. These proxies cover income, work hours, flexibility, job content and job competitiveness alongside the share of males in an occupation. Drawing on individual-level data for cohorts of individuals born in the UK in 1958, 1970 and 2000, we relate each proxy in turn to a female dummy variable, and sequentially add groups of childhood variables, which we may expect to be correlated with both gender and the proxy. We take a holistic approach to specify these childhood factors and examine demographic and socio-economic variables from early childhood, alongside measures of cognitive and non-cognitive ability, childhood health, parental inputs and external influences.
We are interested in the extent to which the coefficient of the female dummy is attenuated with the addition of these childhood factors. If the coefficient is attenuated substantively, we argue that this reveals that malleable factors at the individual level, many of which can be readily influenced by parents, schools and policymakers, play a large role in determining gendered sorting across generations for an average female. However, if the female coefficient remains relatively stable within each birth cohort, it is highly suggestive that such childhood factors may matter little in explaining gendered sorting. 11 Rather, a relatively stable female coefficient within a given birth cohort, combined with a declining female coefficient across cohorts, would point in the direction of the importance of temporal movements in changing gendered sorting patterns.
Specifically, we estimate:

$$Y^{j}_{i} = \alpha + \delta F_{i} + \beta' X^{child}_{i} + \varepsilon_{i} \qquad (1)$$

where $Y^{j}_{i}$ is a proxy for a component of the job, j, for individual, i, in adulthood, $\alpha$ is a constant and $\varepsilon_{i}$ is an error term.
$F_{i}$ is equal to 1 if the individual is female and 0 otherwise. $X^{child}_{i}$ is a vector of childhood control variables, which will be subsequently discussed. We run Eq. (1) separately for each cohort and obtain the estimates of δ, which indicate the extent to which being female influences occupational sorting across cohorts. We note that we care most about how δ changes within as well as across cohorts when we sequentially add childhood variables. In this exercise, we do not seek to put structural interpretations on the coefficients of the control variables, β. However, if the coefficient on $F_{i}$ is attenuated by a particular set of $X^{child}_{i}$, we view this as suggestive evidence that something that is correlated with this same $X^{child}_{i}$ is driving gender sorting within the cohort. In that case, childhood variables, in the general sense, matter in determining sorting.
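The sequential estimation of Eq. (1) can be illustrated with a minimal sketch. It assumes ordinary least squares with an HC1 heteroscedasticity-robust covariance (the text states only that robust standard errors are used), and it runs on synthetic data; the function names, block labels and all numbers below are hypothetical and are not taken from the cohort studies.

```python
import numpy as np

def ols_robust(y, X):
    """OLS coefficients with HC1 robust standard errors (HC1 is an assumption)."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    return beta, np.sqrt(np.diag(cov))

def sequential_delta(y, female, blocks):
    """Estimate delta with no controls, then after cumulatively adding each block."""
    X = np.column_stack([np.ones(len(y)), female])
    beta, se = ols_robust(y, X)
    results = [("female only", beta[1], se[1])]
    for label, block in blocks:
        X = np.column_stack([X, block])
        beta, se = ols_robust(y, X)
        results.append((f"+ {label}", beta[1], se[1]))
    return results

# Synthetic illustration only: the true gap is -0.2 and controls barely matter.
rng = np.random.default_rng(0)
n = 2000
female = rng.integers(0, 2, n).astype(float)
fam = rng.normal(size=(n, 3))
cog = rng.normal(size=(n, 2)) + 0.1 * female[:, None]
share_male = 0.6 - 0.2 * female + 0.02 * cog[:, 0] + rng.normal(scale=0.1, size=n)

results = sequential_delta(share_male, female, [("fam", fam), ("cog", cog)])
for label, delta, se in results:
    print(f"{label:12s} delta = {delta:+.3f} (robust se {se:.3f})")
```

In this synthetic example δ stays close to its unconditional value as blocks are added, which is the pattern the text describes as pointing away from childhood variables as the driver of sorting.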
In addition, we check whether the same general patterns identified by Eq. (1) hold for the most intelligent children in the three cohorts. After all, these are the individuals we would expect to be most likely to reach the most prestigious jobs in society, where we may care more about having a better representation of women. It also helps abate concerns that any effects found are owed to unobserved individual differences. 12 To consider this, we follow the psychometric literature and use exploratory factor analysis to reduce the dimensionality of our proxies for childhood intelligence in each cohort study into one variable (Gorsuch 1983; Thompson 2004). 13 Using this factor, we repeat the analysis documented above only on individuals who are in the top decile of its distribution. We note that the female share in this decile is 51.9%, 49.4% and 49.12% for children born in 1958, 1970 and 2000, respectively.
Note, however, that one issue with estimating Eq. (1) is the risk of overfitting, given that our estimation includes a large number of childhood variables. To mitigate this, we also estimate Eq. (1) using LASSO regression. 14 For our purposes, LASSO is useful as a check on whether the coefficient, δ, on the female dummy remains nonzero after the shrinkage process, which would indicate that, of all the variables included in the regression, it is among those most strongly associated with the outcome. We report only δ from the LASSO regressions in the main text and document the full results in Appendix D.
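The LASSO check can be sketched as follows. This is a minimal coordinate-descent implementation on synthetic data, written only to show what it means for δ to survive shrinkage; the penalty value, the number of controls and the variable names are hypothetical, and nothing here reflects the software or tuning actually used for the results in Appendix D.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, returning zero when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by coordinate descent.

    Assumes the columns of X are standardised and y is centred.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            partial = y - X @ b + X[:, j] * b[j]  # residual excluding column j
            rho = X[:, j] @ partial / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b

# Synthetic illustration: column 0 is the female dummy, the rest are noise controls.
rng = np.random.default_rng(1)
n, n_controls = 1000, 20
female = rng.integers(0, 2, n).astype(float)
controls = rng.normal(size=(n, n_controls))
X = np.column_stack([female, controls])
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = -0.5 * X[:, 0] + rng.normal(size=n)
y = y - y.mean()

coefs = lasso_cd(X, y, lam=0.1)
print("delta after shrinkage:", round(float(coefs[0]), 3))
print("nonzero controls:", int(np.count_nonzero(coefs[1:])))
```

Here the female coefficient is shrunk but remains nonzero while most irrelevant controls are set to zero, which is the sense in which the check is informative.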
The first outcome we consider is the share of males in an individual's chosen occupation. This allows us to ask directly whether occupational segregation by gender has changed substantively for the three cohorts, and how these changes relate to the childhood variables we usually think of as determining a person's future that are commonly collected in surveys. We complement this with regressions that model the probability that a job with a share of males 80% or more is chosen, to allow us to quantify how this has changed over the three generations. We also consider an outcome that is equal to 1 if a female has opted out of the labour force and zero otherwise. This is our only outcome that is generated at the individual level, rather than the occupation level.
The next proxy we consider is average occupational income. For many years, economists have contributed to the literature that seeks to explain just why there is a gender wage gap. It is now clear that the lack of women in high-paying, male-dominated professions contributes significantly to this gap. 15 Therefore, considering the average income of an individual's chosen occupation as an outcome in Eq. (1) allows us to examine directly whether females have been choosing jobs with a significantly higher average income over time and how this is mediated by childhood variables.
We also consider the average hours in an individual's occupation as an outcome. Women who find it hard to juggle family and children, or indeed find it hard to imagine juggling them in the future, may 'opt out' of occupations that make this more difficult. This suggests a constrained choice. Therefore, we re-estimate Eq. (1) with average hours as the dependent variable. δ is then indicative of how important it is for females, as compared to males, to be in occupations with lower average hours. This fits with work that suggests that females 'opt elsewhere,' choosing occupations that allow them to accommodate family responsibilities (Polachek 1981; Belkin 2003; Stone 2007), or choose to work fewer hours to balance family responsibilities (Antecol 2010). 16 We complement the average hours proxy with another variable which captures nonlinear returns to hours worked. This follows Goldin (2014), who presents evidence for full-time college graduate workers in 95 high-paying occupations. Goldin's metric for the flexibility of an occupation is the elasticity of individual earnings with respect to hours worked: high elasticities imply a penalty for workers seeking short hours and indicate a lack of flexibility. Goldin (2014) demonstrates that less flexible occupations have a larger pay gap, and it is also argued that less flexible work pushes females with children out of the labour market (Leber Herr and Wolfram 2012).
12 This concern may arise if females working in jobs with the highest shares of males have above average unobserved individual ability (see Ludsteck (2014) for evidence in this regard for Germany).
13 A clear structure of one latent factor emerges in the first rotation (see Appendix A.2).
14 LASSO is a shrinkage and variable selection method for linear regression models whose goal is to obtain the subset of predictors that minimizes prediction error for a quantitative response variable. The LASSO does this by imposing a constraint on the model parameters that causes regression coefficients for some variables to shrink toward zero. Variables with nonzero regression coefficients are most strongly associated with the response variable. LASSO is useful given that many of the variables we sequentially add are highly correlated, so disentangling their true coefficient size is difficult given potential issues with multicollinearity.
15 See, for example, Blau (1977), Bielby and Baron (1984), Macpherson and Hirsch (1995), Carrington and Troske (1998), Bayard et al. (2003), and Blau and Kahn (2016).
Our remaining proxies capture occupational content. This complements a recent emergence of explanations for occupational segregation which suggest that males and females have different tastes when it comes to the content of the work that they do. The psychologist Susan Pinker (2008) has pushed the idea that differences in the preferences of women and men are the main driver of gendered labour market choices. Pinker (2008) highlights that women may not like the nature of male-dominated jobs, preferring 'people' content over making 'things' (what we refer to as 'brawn' hereafter). Lordan and Pischke (2022) provide quantitative evidence from three countries and a discrete choice experiment which backs up this claim. Overall, their work suggests that females are more extrinsically motivated opting for jobs that are high in 'people' and 'brains' content, like medicine and law, over jobs that are relatively high in 'brawn' content, like engineering. By contrast, males care less about the job content. Cortés and Pan (2017) investigate the predictive power of a variety of occupational indices in regressions that model the rate of females in an individual's occupation. They show that social contribution and physical skill dominate. Broadly, these findings lend support to the 'people' versus 'brawn' divide.
Together, these results raise the question of whether, on average, women prefer jobs with a societal contribution. These differences in tastes by gender can be innate, evolutionary or socialised. 17 Evidence that females differentially select into work of different content, but with a pattern that changes over time, points to a temporal role in the formation of preferences. That is, what is going on in society influences sorting patterns. To check this, we also estimate Eq. (1) with measures of job content as outcomes. We create three proxies for job content, based on an approach introduced by Lordan and Pischke (2022) and drawing on ONET activities and job content data. Overall, these proxies represent 'people,' 'brains' and 'brawn' content. 18
Competitiveness is another important job characteristic. Experimental evidence has highlighted that females are more averse to competition than males (Croson and Gneezy 2009). 19 Therefore, we also aim to check the extent to which gender explains an individual's decision to enter and stay in an occupation that has a highly competitive environment where the stakes are high. Our last proxy is thus a measure of competitiveness at the occupational level from the ONET database. This proxy builds on Cortés and Pan (2016, 2017), who use the same measure of competitiveness as we do here.
16 We note that a number of other studies have also considered whether 'opt out' of the labour market occurs conditional on having children and do not find any differences by education level (Boushey 2005; Goldin 2006; Vere 2007; Cohany and Sok 2007; Fortin 2008; Percheski 2008). For us, females do not necessarily opt out. Rather, they may 'opt elsewhere', into occupations with lower average hours and enhanced flexibility, to allow them to better manage their current or expected family commitments. This is supported by Xie and Kimberlee (2004), who highlight that marriage and children move women from the male-dominated fields of science and engineering towards other types of work.
17 Evidence is also provided by Grove et al. (2011), who examine the pay gap of MBA graduates and find that female MBAs have a wage penalty owed to choosing occupations that contribute to society and have high ethical standards. Su et al. (2009) also emphasise sex differences in occupational preferences in an overview of the psychology literature on this topic.

Data
We draw on the National Child Development Study (NCDS), a continuing study that follows the lives of 17,000 people living in Great Britain who were born in the week of March 3, 1958. We use data from the survey at birth, and ages 7, 11 and 16. We measure the NCDS child's occupation variables at age 33. We also draw on the 1970 British Cohort Study (BCS70). The BCS70 began by including more than 17,000 children born between April 5 and 11 in 1970. We use the data from the survey at birth, and ages 5, 10 and 16. We measure the BCS child's occupation variables at age 34. 20 To examine gender-based occupational sorting for a cohort entering the workforce soon, we draw on the Millennium Cohort Study (MCS). This group is now 17/18 years old and is about to decide what to do after their schooling finishes. The MCS follows the lives of around 19,000 children born in the UK in 2000 and 2001. We use the data from the survey at 9 months old (2001) and at ages 3 (2003/4), 5 (2005/6), 8 (2008/9) and 12 (2012/13). We utilise the aspired occupation, reported by the cohort member in the 2012/13 sweep.

Outcome variables
For the NCDS and the BCS, the information on occupation is measured by four-digit Standard Occupational Classification 2000 (SOC2000) codes at ages 33 and 34, respectively.
For the MCS, the cohort members were asked in the 2012/13 sweep "by the time you are 30, which of the following would you be most likely to achieve?" followed by a list of choices which follow the 4-digit UK Standard Occupational Classification 2010 (SOC2010). We convert SOC2010 to SOC2000 occupation coding using a crosswalk provided by Lordan (2019). We acknowledge that aspirations reflect an attitude or a perception of occupational opportunity, and not a labour market outcome as used in the NCDS and the BCS data. However, we view this cohort as particularly special as they are just about to enter the labour market at the time of writing, and so are the most interesting with respect to learning if trends continue. We also have some faith that this proxy is meaningful. For instance, if we match aspirations exactly to occupations in the NCDS data, we observe a correlation of 0.30, while the correlation for the share of males in an occupation is approximately 0.60. Additionally, it has also been demonstrated elsewhere that aspirations measures are strongly indicative of actual labour market outcomes (Genicot and Ray 2017; La Ferrara 2019; Lekfuangfu and Odermatt 2022). 21
Our occupation averages are calculated based on the 1993-2012 Quarterly Labour Force Survey data (QLFS), where we exploit the four-digit SOC00 occupational codes. 22 Therefore, the averages associated with each occupation are the same for the three cohorts, ensuring that δ captures the change in sorting towards or away from particular occupation types rather than composition effects. We calculate averages at the 4-digit occupation level (SOC00) of the log of gross income, average hours and share of males and, subsequently, match these directly to the SOC00 codes in the NCDS, BCS and MCS.
18 Occupations with relatively high 'people' content involve engaging with customers, clients or co-workers routinely (for example, nurses, physicians, social workers and teachers). Occupations with relatively high 'brains' content are economists, financial managers, aerospace engineers and CEOs. Finally, occupations that are relatively high on 'brawn' include explosives workers, mechanical engineers and surveyors.
19 Some of these works include Dohmen and Falk (2011), Niederle and Vesterlund (2008), Gneezy and Rustichini (2004) and Gneezy et al. (2003).
20 There are two waves of the BCS that surveyed cohort members when they were in their 30s. We use the 2004 wave (aged 34) as the main source of occupation. We supplement missing occupation information in the 2004 wave with occupation reported in the 2000 wave (when cohort members were 30 years old).
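The construction of pooled occupation averages and their attachment to cohort members can be sketched as below. The record layout, field names and four-digit codes are hypothetical placeholders rather than actual QLFS variables or SOC00 categories, and dropping unmatched cohort members is an assumption made for the illustration.

```python
from collections import defaultdict
from statistics import mean

def occupation_averages(lfs_rows):
    """Average each characteristic within a four-digit occupation code."""
    grouped = defaultdict(list)
    for row in lfs_rows:
        grouped[row["soc00"]].append(row)
    return {
        code: {
            "log_income": mean(r["log_income"] for r in rows),
            "hours": mean(r["hours"] for r in rows),
            "share_male": mean(r["male"] for r in rows),
        }
        for code, rows in grouped.items()
    }

def attach_averages(cohort_rows, averages):
    """Attach the pooled averages to each cohort member by occupation code.

    Members whose code has no pooled average are dropped (an assumption).
    """
    matched = []
    for person in cohort_rows:
        occ = averages.get(person["soc00"])
        if occ is not None:
            matched.append({**person, **occ})
    return matched

# Hypothetical records; codes are placeholders, not real SOC00 categories.
lfs = [
    {"soc00": "1111", "log_income": 10.0, "hours": 40, "male": 1},
    {"soc00": "1111", "log_income": 10.5, "hours": 44, "male": 0},
    {"soc00": "2222", "log_income": 9.5, "hours": 30, "male": 0},
]
cohort = [
    {"id": 1, "cohort": "NCDS", "female": 1, "soc00": "1111"},
    {"id": 2, "cohort": "MCS", "female": 0, "soc00": "2222"},
    {"id": 3, "cohort": "BCS", "female": 1, "soc00": "9999"},
]

averages = occupation_averages(lfs)
matched = attach_averages(cohort, averages)
print(matched)
```

Because the same lookup table is used for every cohort, a given occupation carries identical characteristics regardless of birth year, which is what allows δ to reflect sorting rather than composition.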
We also create a variable to proxy the wage-hours elasticity used by Goldin (2014). We create this variable by running a regression of the log of wages on log hours, occupation fixed effects, the interaction between log hours and the occupation fixed effects and several other controls using the 1993-2012 QLFS and consistent British SOC00 codes. The proxy is then the coefficients on the interaction between occupation and log hours.
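A minimal sketch of the wage-hours elasticity proxy follows. It relies on the fact that, absent other controls, a regression with occupation fixed effects and occupation-by-log-hours interactions yields the same slopes as separate regressions within each occupation. The additional controls used in the paper are omitted, and the data, occupation labels and true elasticities are synthetic assumptions.

```python
import numpy as np

def elasticity_by_occupation(log_wage, log_hours, occ):
    """Slope of log wage on log hours, estimated separately by occupation."""
    out = {}
    for code in np.unique(occ):
        mask = occ == code
        slope, _intercept = np.polyfit(log_hours[mask], log_wage[mask], 1)
        out[str(code)] = float(slope)
    return out

# Synthetic illustration: occupation "A" is inflexible, "B" is flexible.
rng = np.random.default_rng(2)
n = 500
true_elasticity = {"A": 1.4, "B": 0.8}
occ = np.repeat(np.array(["A", "B"]), n)
log_hours = np.log(rng.uniform(20, 60, size=2 * n))
slopes = np.array([true_elasticity[c] for c in occ])
log_wage = 5.0 + slopes * log_hours + rng.normal(scale=0.05, size=2 * n)

elasticities = elasticity_by_occupation(log_wage, log_hours, occ)
print(elasticities)
```

An elasticity above one, as for the hypothetical occupation "A", corresponds to the convex returns to hours that the text interprets as a lack of flexibility.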
In addition, we consider a variable that is assigned equal to 1 if an individual is dis-employed and zero otherwise. For the MCS, this is based on the child's response (asked at age 11/12) to a question on whether they expect to have a 'good' job by age 30. We interpret this as a proxy which is likely to be highly and positively correlated with future labour market attachment. We assign a variable equal to 1 if a good job is expected and zero otherwise.
Our analysis also utilises three variables which capture what a job is about. These variables are created following the approach described by Lordan and Pischke (2022). Specifically, we retrieve the items from ONET (version 5) that are related to the activities and content of an individual's work. These 79 items report, on a scale from 1 to 7, the level at which an occupation has a particular characteristic. Next, three latent factors, 'people,' 'brains,' and 'brawn' (PBB), are calculated using this data. We then match the latent factors (in the US Standard Occupation Codes (SOC) 2000) to the QLFS using the British SOC00 codes and, finally, match these three factors for each occupation to the NCDS, BCS and MCS data.
21 Occupation aspirations are also available for the BCS, but only in broad occupation groups (12 categories). Therefore, we report only the statistics derived from the NCDS.
22 The QLFS is the main survey of individual economic activity in Britain and provides the official measure of the national unemployment rate. The QLFS uses British SOC90 codes from 1993 through 2000 and SOC00 from 2001. We first assign to each SOC90 code a SOC00 value based on a crosswalk created from the British Household Panel Survey (BHPS). This is possible because in the BHPS after the year 2000 every individual is assigned a SOC90 and SOC00 code simultaneously. This information allows for a consistent coding system in the QLFS based on SOC00.
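Footnote 22 describes assigning a SOC00 value to each SOC90 code using BHPS records that carry both codes. One way this could be done is sketched below, assuming that each SOC90 code is mapped to the SOC00 code it is most often paired with; the paper does not state the assignment rule, so the modal rule, the function name and the codes are all hypothetical.

```python
from collections import Counter, defaultdict

def modal_crosswalk(pairs):
    """Map each SOC90 code to its most frequent SOC00 partner (assumed rule)."""
    counts = defaultdict(Counter)
    for soc90, soc00 in pairs:
        counts[soc90][soc00] += 1
    return {soc90: c.most_common(1)[0][0] for soc90, c in counts.items()}

# Hypothetical dual-coded records: (SOC90, SOC00) for the same job.
dual_coded = [
    ("100", "1111"),
    ("100", "1111"),
    ("100", "1112"),
    ("200", "2222"),
]

crosswalk = modal_crosswalk(dual_coded)
print(crosswalk)
```

A modal rule is simple but discards information when one SOC90 code splits across several SOC00 codes; a weighted assignment would be an alternative under the same data.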
Next, we also use the ONET database for our measure of occupation competitiveness. Specifically, incumbents are asked: "To what extent does this job require the worker to compete or to be aware of competitive pressures?" with response options of 'not at all competitive', 'slightly competitive', 'moderately competitive', 'highly competitive' and 'extremely competitive'. We standardise this variable to have a mean of zero and a standard deviation of 1 and match it to the NCDS, BCS and MCS in the same manner described for the PBB factors. We note that summary statistics for all outcome variables are provided in Table 1.
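The standardisation step can be sketched as follows. The 1 to 5 scoring of the response options, the averaging of incumbent responses within an occupation and the use of the population standard deviation are assumptions made for illustration; the occupation labels and responses are hypothetical.

```python
from statistics import mean, pstdev

# Assumed numeric coding of the five response options.
RESPONSE_SCORES = {
    "not at all competitive": 1,
    "slightly competitive": 2,
    "moderately competitive": 3,
    "highly competitive": 4,
    "extremely competitive": 5,
}

def standardise(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical incumbent responses by occupation.
responses = {
    "occ_a": ["slightly competitive", "moderately competitive"],
    "occ_b": ["highly competitive", "extremely competitive"],
    "occ_c": ["not at all competitive", "slightly competitive"],
}
raw = {occ: mean(RESPONSE_SCORES[r] for r in rs) for occ, rs in responses.items()}
z_scores = dict(zip(raw, standardise(list(raw.values()))))
print(z_scores)
```

After this step a coefficient on the competitiveness outcome can be read in standard deviation units, in line with the other standardised measures.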

NCDS, BCS and MCS control variables
We aim to consider a holistic set of controls that captures as many as possible of the childhood variables that are simultaneously correlated with gender and the outcomes we consider. Across all three surveys, we make efforts for these controls to be measured at similar ages and to have relatively consistent definitions. Fuller details of all controls, along with the relevant means and standard deviations, can be found in Appendix A. All covariates described below are standardised to have a mean of 0 and a standard deviation of 1. We run all our specifications using robust standard errors to allow for heteroscedasticity of an unknown form. The regression samples contain the same individuals across all specifications. Our estimation approach to Eq. (1) is sequential. First, we estimate Eq. (1) with the female dummy variable and no other controls. Subsequently, we add the following variables:

Demographics and socio-economic variables ($fam_{i,child}$)
Childhood demographics and the socio-economic status variables are elicited as close to birth as possible (for the exact timing of the variables, see Appendix A). These variables are mother's age, father's age, social status, a set of marital status dummies indicating whether the child's parents were together or not in each wave, a dummy variable indicating whether the child's mother stayed in school beyond the minimum required age, household income, household tenure, a dummy indicating whether or not the child's mother worked, a dummy variable indicating low birthweight, a dummy variable indicating whether the cohort child was the first-born, a dummy indicating whether the cohort child was breastfed, region of residence and a dummy variable indicating if the cohort child is White and zero otherwise.

Cognitive ability scores (verbal and math/science) (cog i,child )
We draw on all available measures of the child's cognitive ability in the three surveys. 23 In the NCDS, at age 7 the child completed the Southgate Reading Test, Reading Comprehension Test and the General Ability Test. At ages 11 and 16, they completed the 35 Item Reading Comprehension Test. These five proxies of verbal ability are included in our regressions. The composite verbal ability score in the BCS is derived from tests conducted at ages 5 and 10. Specifically, at age 5, the 50-item Reading Test score, which is modified from the Schonell Reading Test, was administered. At age 10, the 21-item Word Similarity subscale of the British Ability Scale was administered. We also draw on the 67-item Shortened Edinburgh reading test, which assesses vocabulary, syntax, sequencing, comprehension and retention at age 10.
For the MCS, the composite verbal ability score is derived from instruments administered at ages 3, 5, 7 and 11. The British Ability Scale (BAS) was administered at ages 3 and 5, the Pattern Similarity and the Pattern Construction subscale of the BAS at ages 5 and 7, the 900-item Word Reading of the BAS at age 7 and the Verbal Similarity subscale of the BAS at age 11.
The Math-and-Science proxies for the NCDS include three tests measured at ages 7, 11 and 16, as well as teacher ratings for math and science. Specifically, these tests are the Problem Arithmetic Test score at age 7, a 40-item Arithmetic/Mathematics Test at age 11 and a 31-item Mathematics Test at age 16. We also include the teacher's ratings which reflect their perception of the child's ability in maths and science. These rating scores equal 3, 2, 1 and 0 if the teacher thinks the child's ability is equivalent to A-level, high-graded GCSE, low-graded GCSE and below GCSE, respectively. 24 For the BCS, we draw on tests administered at ages 10 and 16. 25 These are the Friendly Maths Test and the Recall of Digit subscale of the BAS administered at age 10, and the actual raw GCSE scores for Maths and four science subjects (science, chemistry, physics and biology) reported at age 16. Finally, for the MCS we draw on tests administered at ages 7 and 11. The NFER Number Skills test was taken at age 7. We also include the teacher's evaluation of the MCS member's ability at age 11 in maths, science and technology. These scores equal 5, 4, 3, 2 and 1, respectively, when the teacher evaluated the MCS member to be: well-below average, below average, average, above average and well-above average. Summary statistics can be found in Tables A.2 through A.3 in Appendix A.

Motor skills (gross motor and fine motor) (gross i,child , fine i,child )
We consider instruments related to both gross motor and fine motor skills, captured across childhood, as our proxies for motor skills. Specifically, gross motor skills are those which require whole body movement and involve the core stabilising muscles of the body to perform everyday functions, such as standing, walking, running and sitting upright. They also include eye-hand coordination skills such as ball skills. Fine motor skills are smaller movements. They include fastening clothing, cleaning teeth, using cutlery, drawing, writing and colouring, as well as cutting and pasting.
For the NCDS, we draw on teacher-assessed measures of gross motor skills at ages 7 and 11 that follow definitions by Sigurdsson et al. (2002). 26 This definition is consistent in the BCS with measurements taken at age 10. In the MCS, gross motor skills are estimated using a subset of the Denver Developmental screening test (Frankenburg and Dodds 1967) and are assessed at age 1 by the child's parent.
We have two measures of fine motor skills in the NCDS. These are based on the teacher assessment of the Human Figure Drawing test (age 7) and the Copying-Design test (ages 7 and 11). The same measures of fine motor ability are available in the BCS (age 10). For the MCS, we again draw on relevant sub-components of the Denver Developmental Screening test, assessed at age 1 by the primary carer. Summary statistics can be found in Tables A.2 and A.3 in Appendix A.
24 Students completing A-levels stay in school until roughly 18 years and generally aim for third level education. A certificate of secondary education (CSE), O-levels or a general certificate of secondary education (GCSE) represent a low-level secondary school qualification that is usually achieved when the student is aged 15.
25 These are the Friendly Maths Test and the Recall of Digit subscale of the BAS administered at age 10, and the actual raw GCSE scores for Maths and four science subjects (science, chemistry, physics and biology) reported at age 16.
26 Because of the absence of direct and positive measures of gross motor skills in the NCDS and BCS, Sigurdsson et al. (2002) exploit the five measures of gross motor impairment (rated by class teacher) and calculate for the average score of gross motor deficiency. Smaller scores indicate more positive gross motor development. In the MCS, we can use Denver Developmental Scale to directly measure gross motor skills at early ages.

Non-cognitive skills (externalising and internalising behaviour) (behav i,child )
We construct two separate measures of non-cognitive skills in childhood which proxy externalising behaviour and internalising behaviour, respectively. For all three surveys, we choose assessments provided by teachers over parents. In the NCDS, the behaviour scores are calculated from relevant items taken from the Bristol Social Adjustment Guide (BSAG) at ages 7 and 11. Essentially, we separate the questions into proxies that represent internalising and externalising behaviour. For the BCS, we utilise Rutter's Behavioural Scale at age 10 (Rutter 1967) in the same way. Finally, measures of internalising and externalising behaviour in the MCS are constructed from the Strengths and Difficulties Questionnaire (SDQ) at age 7. Summary statistics can be found in Tables A.2 and A.3 in Appendix A.

Health conditions (childhood physical health issues, psychological health issues) (phy i,child , psy i,child )
We follow Goodman et al. (2011) when constructing the measures of psychological and physical health in childhood and classify physical health issues into (a) major physical health and (b) minor physical health issues. For the NCDS, medical assessments were conducted at ages 7 and 16. At age 11, the information came from the parent's report.
In the BCS, the medical assessment was administered at ages 5, 10 and 16. For the MCS, we use parental reports on the child's medical conditions at ages 3, 5, 7 and 11. The psychological health measures in the NCDS are calculated from medical examinations capturing emotional maladjustment at ages 7 and 16, as well as parental reports of the child's mental health support visits for ages 11 and 16. In the BCS, we follow similar classifications to the NCDS, drawing on data from parent reports of mental health support visits at ages 10 and 16. At age 16, the child also went through several medical assessments, capturing aspects of emotional maladjustment. Finally, for the MCS, psychological maladjustment is captured by teachers' and parents' reports of mental illness at ages 11/12, and reports of adolescent mental health services utilisation at school. Summary statistics can be found in Tables A.2 and A.3 in Appendix A.

Parental investments (invest i,child )
We add measures of parental inputs, which mainly capture time inputs. For the NCDS, we draw on parent-reported frequencies at age 7 of how often the mother and father read to their child, teacher-rated levels of parental interest in their child's education at age 7, and parental reports of engaging in various activities with their child at ages 7 and 11 (see Appendix A). For the BCS, we use the information on the parent-assessed frequency with which the mother and father read to their child measured at age 5, teacher-rated levels of mother and father interest in their child's education at age 10 years and parental reports of engaging in various activities with their child at ages 5 and 10. For the MCS, we draw on the frequency of reading, and parent-assessed frequency reports at ages 3 and 5 of whether the parents: (i) read to the child, (ii) tell stories to the child, (iii) paint with the child, and (iv) play music with the child. We also draw on teacher-rated levels of parental interest in the child's education when aged 11 years, and a set of variables on the frequency of parents visiting various places with the child at ages 3 and 5. Summary statistics can be found in Tables A.2 and A.3 in Appendix A.

Parental aspirations for the child (aspire i,child )
We also consider measures of parental aspirations. For the NCDS, we draw on parent reports at age 16 that are equal to 1 if a parent wishes their child to leave school at age 15 years, and 0 otherwise. We consider additional measures of aspirations, measured at age 16, which correspond to how long the parent wishes their NCDS child to stay in education (16 years, 18 years, beyond 18 years or uncertain).
In the BCS, we draw on parent reports measured at age 16. We create a dummy variable that is equal to 1 if a parent advises their child to leave full-time education immediately after age 16, and 0 otherwise. We also consider additional measures of aspirations, measured at age 16, which correspond to how long the parent wishes the BCS child to stay in education (leaves at 16, finishes A-levels, goes to university, uncertain).
In the MCS, we draw on parent reports when the child is aged 11 on their perception of whether their child will attend university. Specifically, parents are asked to estimate the likelihood they think the child will attend university: very likely, fairly likely, not very likely and not at all. The responses are then added to the regression as a set of dummy variables.

External influences (external i,child )
Finally, we try to capture aspects of the child's external environment. Given the data available, the variables we consider mainly capture the school environment. For the NCDS, we include a set of dummy variables that capture whether the child's teacher perceives they would benefit from further education (measured when the NCDS child is 16). We also include a separate set of dummy variables which capture the teacher's expectation of the highest level of education the child is likely to attain (university, lower college, advanced course, certificate, other further education, part-time professional qualification, other part-time education, and no other qualification). Similarly, for the BCS we include a set of dummy variables which capture the BCS child's teacher report on whether further education will benefit the BCS child. A further set of indicator variables are constructed from the teacher's expectation of whether the child would attend further education after age 16. For the MCS, we also add similar measures. These measures indicate the teacher's perception of whether the MCS child would (a) stay in full-time education after age 16, and (b) attend university. These variables were measured when the MCS child was 11 years old.
For the NCDS and BCS, we also capture some characteristics of the child's classmates. For the NCDS and the BCS, these variables are measured at age 16 and relate to the school-wide: (i) share of fathers from non-manual occupations, (ii) share of students staying on at school last year, (iii) share of girls obtaining at least two pass grades of GCSE or equivalent, (iv) share of 15-year old girls studying GCSE or equivalent only, (v) share of boys obtaining 2 passes of GCSE or equivalent, and (vi) share of 15-year old boys studying GCSE or equivalent. Summary statistics can be found in Tables A.2 and A.3 in Appendix A.

Inclusion criteria
We begin with the total observations from the first sweep of each cohort survey (NCDS = 18,558, BCS = 18,752, MCS = 19,518). We drop observations with missing values on gender (NCDS = 4, BCS = 326, MCS = 700). We drop observations with missing values on realised occupation around age 33/34 years old (the NCDS and the BCS). 27 For the MCS, we drop observations with missing values on aspired occupation in the fifth sweep (age 11). The exclusions up to this stage reduce the NCDS, the BCS and the MCS samples to 11,469, 10,234 and 11,200 observations, respectively. We then match the occupation around age 33/34 to the associated occupation averages, generated from the QLFS and ONET. We are left with the samples of 9722, 8973 and 11,200 observations for NCDS, BCS and MCS, respectively. (See Appendix Table B.1).
An issue when working with cohort data, with this many variables and waves, is that there are many missing values. To address this problem, we apply mean imputation with missing indicator variables (the so-called Missing Indicator method) to the control variables described in (i) through (vii) above. When a variable is missing, we replace it with the average value from the non-missing sample. In the finalised sample, the share of females is 49.5%, 48.4% and 51.3% in the NCDS, the BCS and the MCS, respectively. (See Table B.1 in the Appendix.)
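The Missing Indicator method described above can be sketched as follows. This is a minimal illustration on a hypothetical control variable, not the code used for the analysis; the function name impute_with_indicator and the use of None to mark a missing value are our own assumptions.

```python
from statistics import fmean


def impute_with_indicator(values):
    """Mean-impute missing entries (None) and build a 0/1 missing indicator.

    Both returned columns would enter the regression together, so that
    imputed observations can have their own intercept shift.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    mean = fmean(observed)
    imputed = [mean if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return imputed, indicator


# Hypothetical standardised childhood test score with two missing responses
scores = [0.5, None, -1.0, 2.0, None]
imputed, missing = impute_with_indicator(scores)
```

The approach keeps every individual in the regression sample, which is what allows the samples to contain the same individuals across all specifications.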

Gender sorting in labour markets across birth cohorts
Table 2 details the coefficient on the female dummy with its associated standard error, alongside the adjusted R² for each model. As we move through the rows from (i) to (ix) in each panel, we are estimating richer variants of Eq. (1). It is striking that while there are significant differences in the female coefficient across all three cohorts, the coefficient is not attenuated when childhood variables are added. For example, from panel A (i), a female born in 1958 chooses an occupation where the share of males is 45% lower, on average, as compared to their male peers. This compares to 41% for females born in 1970, and 34% for females born in 2000. This suggests that over time UK females have been more often choosing occupations with a higher share of males; however, the gap between the aspirations of females and males in the most recent cohort is still marked.
27 There are two survey waves on life outcomes around age 30s for the BCS cohort (ages 30 and 34). In this analysis, we prioritise the responses given at the 34-year survey (2004). To maintain the size of the BCS sample, we supplement any missing values of our variables at age 34 with the information given at age 30 to the same questions. This strategy increases the final BCS sample from 6870 to 8973 observations. Running the models only with the 6870 observations does not change the results.
The tables show the estimated coefficient (beta), the standard error associated with the female dummy in each specification, and the adjusted R-square for each specification.

(I) is the regression with only female dummy. (II) is (I) with the family variables. (III) is (II) with childhood cognitive skills. (IV) is (III) with childhood motor skills. (V) is (IV) with childhood non-cognitive skills. (VI) is (V) with childhood physical and psychological health conditions. (VII) is (VI) with parental investment variables. (VIII) is (VII) with variables indicating parental aspirations for children. Finally, (IX) is (VIII) with external influences from school peer and class teacher around age 16
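The sequential structure of specifications (I) to (IX) can be sketched as below. This is a hedged illustration on simulated data, not the estimation code behind Table 2: the data-generating process, the two control blocks named fam and cog, and the choice of the HC1 variant of heteroscedasticity-robust standard errors are all assumptions made for the example.

```python
import numpy as np


def ols_robust(y, X):
    """OLS with heteroscedasticity-robust (HC1) standard errors and adjusted R^2."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n / (n - k)
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    return beta, np.sqrt(np.diag(cov)), adj_r2


# Simulated data (illustrative only): outcome is the male share of an occupation
rng = np.random.default_rng(0)
n = 500
female = rng.integers(0, 2, n).astype(float)
blocks = {"fam": rng.normal(size=(n, 2)), "cog": rng.normal(size=(n, 2))}
y = 0.6 - 0.4 * female + 0.1 * blocks["cog"][:, 0] + rng.normal(scale=0.2, size=n)

# Start with a constant and the female dummy, then add one block at a time
X = np.column_stack([np.ones(n), female])
results = {"female only": ols_robust(y, X)}
for name, block in blocks.items():
    X = np.column_stack([X, block])
    results[f"+ {name}"] = ols_robust(y, X)

for label, (beta, se, adj_r2) in results.items():
    print(f"{label:12s} female = {beta[1]: .3f} (se {se[1]:.3f}), adj. R2 = {adj_r2:.3f}")
```

Comparing the female coefficient and the adjusted R² down the rows is the exercise the tables report: a coefficient that barely moves as blocks are added indicates that the added childhood variables do not account for the gender gap.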
Turning to Table 2 panel B, there is no difference in the probability that females born in 1958 or 1970 will choose a job that has a share of males of 80% or higher as compared to their male peers. The coefficient on female for those born in 2000 is slightly reduced, implying that females in this cohort are 46% less likely to choose occupations with the highest share of males as compared to comparable males. This emphasises that females still shy away from occupations with the highest shares of males. 28 Across all three cohorts, the female coefficient is not markedly attenuated with the addition of childhood variables, and the adjusted R² remains flat. The result found here is consistent with the literature that shows that females are less likely to finish STEM majors (for instance, see Arcidiacono et al. 2016; Ost 2010). To explicitly check this result, we run an additional set of estimations in which the dependent variable is an indicator variable equal to 1 if the occupation is a STEM occupation and zero otherwise. Similarly, the female coefficients do not show much within-cohort variation after the inclusion of childhood variables. Instead, the female coefficient reduces its (negative) magnitude from the 1958 NCDS to the 1970 BCS, and eventually becomes positive for the 2000 MCS (see Appendix Table D.1 for details).
Next, Table 2 panel C focuses on the probability a female is dis-employed (or for MCS children aspires to have a job that is not 'good'). For the NCDS, being female is associated with a 21% increase in the probability of being dis-employed as compared to male peers. This compares to 9% for the BCS. The female coefficient for the MCS is not significantly different from zero, emphasising that for the most recent cohort there is no difference in aspirations between males and females in their tendency to expect a 'good' job by age 30. The addition of the childhood variables neither attenuates the female coefficient significantly for any of the cohorts, nor changes the R² notably.
Table 3 panel A highlights that females in the BCS cohort have occupations with a smaller wage gap, as compared to females in the NCDS. However, the aspirations of the MCS cohort, if fulfilled, would cause a greater gender pay gap than for those born in 1970 and 1958. Specifically, MCS girls are aspiring to do jobs that are paid 31% less than those of males. The childhood variables do not attenuate the female coefficient significantly for any of the cohorts, and the R² is flat. For the BCS cohort, adding the childhood variables actually increases the female coefficient, with the addition of cognitive skills having the greatest impact.
Table 3 panel B highlights that the propensity for females to sort into jobs with lower average hours, as compared to male peers, has declined significantly and substantively over the three cohorts. However, the difference between female aspirations and their male peers is still substantive for those born in 2000. The addition of the childhood variables gives modest attenuation across the three cohorts, but these additions do increase the adjusted R² for the 1958 cohort (from 11% in the most basic model to 14% for the fullest model). For the other two cohorts, the R² is relatively flat. Table 3 panel C documents the results for the flexibility regressions.
Over time, the gap between males and females in terms of sorting into flexible jobs has narrowed. Notably, the coefficient for the aspirations of the 2000 cohort is roughly one third of that for NCDS children. For all three cohorts, the addition of the childhood variables does little to attenuate the female coefficient and the R² is flat. This suggests that changing patterns over time determined the movement away from flexibility, over and above childhood factors.
See notes to Table 2. For the MCS, probability of being employed equals 1 if the child expects to have their own children and also expects to have a good job by age 30, and 0 otherwise.
Next, we turn to examine gender sorting in terms of job content. Table 4 documents the estimates from models in which each of a series of job content measures ('people', 'brains', 'brawn') is used as the dependent variable. A few stylised facts emerge. First, over time females have moved more towards jobs that are high in brains as compared to their male peers, with the MCS girls being substantively more likely than the MCS boys to choose jobs with high brain content. Females across all three cohorts choose jobs with higher people content and lower brawn content, as compared to male peers. However, for the MCS cohort the coefficients are about half the size. This suggests that females in this cohort are still choosing job content along gender lines, but it is not as marked as it was for older cohorts. Finally, across all three cohorts, there is a gender gap in the propensity to choose jobs that are highly competitive, with males choosing work with higher competitive content. Markedly, there is no attenuation in this trend over time, with the MCS girls being substantively less likely to aspire to work in jobs that are competitive as compared to MCS males.
Second, while there are a couple of exceptions (that is, 'brains' for NCDS and 'brawn' for BCS), in general, the addition of the childhood variables does not attenuate the female coefficient in the job content regressions. Third, across the three cohorts, there are substantive changes to the R² when we add the childhood variables for the 'brains' and 'brawn' regressions (in both cases the addition of the cognitive proxies is the most important). By contrast, the R² is static despite these additions for the people and competitiveness content regressions.

Gender sorting among high cognitive ability individuals
Tables 5, 6 and 7 document estimates from Eq. (1) for children who are in the top 25% of the cognitive distribution. There has always been less of a difference between high skilled females and their male peers in their tendency to sort into jobs with high shares of males. While the gradient of the coefficient decreases across the three cohorts, it is flatter as compared to the average female regressions. Notably, females with high cognitive ability born in 2000 are still aspiring to enter occupations with 27% lower male shares, as compared to males. Panel B highlights that high ability females have become less likely than their male peers to sort/aspire into jobs with the highest shares of males. Given that these regressions pertain to the highest skilled females only, the trends of females sorting into science, technology and engineering are not improving across the three cohorts.
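Restricting the sample to the top 25% of the cognitive distribution can be sketched as follows. This is only an illustration: the composite scores are hypothetical, and the exact rule used to identify high-ability children is set out in Appendix A, so the simple upper-quartile cut-off below is an assumption.

```python
from statistics import quantiles

# Hypothetical composite cognitive scores for 100 cohort members
scores = [float(s) for s in range(1, 101)]

# quantiles(..., n=4) returns the three quartile cut points; the last is the 75th percentile
cutoff = quantiles(scores, n=4)[2]

# Keep only children whose score lies above the upper-quartile cut-off
high_ability = [s for s in scores if s > cutoff]
```

Eq. (1) would then be re-estimated on this sub-sample only, with the same sequence of control blocks as in the full-sample regressions.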
Consistent with the results for the average female, the gender gap in occupational hours (Table 6, panel B), flexibility (Table 6, panel C) and the potential to be disemployed ( [...] in earlier cohorts, but are less extrinsically motivated as compared to their current male peers. Turning to Table 7, stylised facts consistent with Table 4 emerge. First, the gap between males and females in terms of job brains content has grown over time, with females choosing jobs with higher brain content as compared to their peers. Second, females across all cohorts choose jobs with higher people content, but lower brawn and competitiveness content. Notably, for the MCS cohort the difference between males and females in brawn content is lower than for the NCDS and BCS females, but the competitiveness gap is larger. Across Tables 5, 6 and 7, childhood variables do little to attenuate the coefficient of the female dummy. However, the addition of childhood variables does independently explain significant proportions of the variation for a number of outcomes (the log of average occupational income, brains content, and brawn content for all three cohorts).
See notes to Table 2 and also Appendix A for full details of how the children with highest cognitive ability are determined.

LASSO estimations and gender effect
[...] LASSO results). First note that in no case is the coefficient on female shrunk to zero, leading to the conclusion that gender has been, and is now, a key factor in determining how individuals sort. The narrative from the OLS models remains. Over time, females have sorted more regularly into traditionally male-dominated jobs, have decreased their propensity of being dis-employed as compared to male peers, and the gender gap has narrowed in terms of flexibility and hours. However, for both average and high ability children born in the year 2000 there is a larger gender gap in the propensity to pursue jobs with a male share of over 80%, in average occupational income and in occupational competitiveness. Over time, all females have preferred jobs with higher brains and brawn content, as compared to their male peers. The average female also has been choosing jobs with less people content over time, as compared to their male peers. However, for females with high ability the preference of the 2000 cohort is for jobs that are higher in people content as compared to those born in 1970.

Measurement errors in childhood variables
The childhood variables we consider do not explain a significant proportion of the variation in any of our outcomes over and above what is explained by the gender dummy. So, we conclude that the variables we usually think about as being important during childhood do not explain differences in gendered sorting, but gender itself is still a notable and independent determining factor. An obvious alternative explanation is that the childhood variables we consider are simply measured with error. An easy way to explore this is to look at other outcomes that we think of as important in adulthood, and to see how our childhood variables relate to them. We are specifically interested in gauging their impact on the R². Appendix Table D.5 documents the estimates from this exercise. We document for a number of adult outcomes in the NCDS and BCS (we do not yet observe adult outcomes for MCS children) the coefficient on the female dummy and its associated standard error, along with the R² when we estimate regressions with the female dummy only and our fullest specification. We note that for most of these outcomes adding the full set of the childhood variables explains a significant amount of the variation in our outcome. Additionally, for many of our adult outcomes, the female coefficient does change substantively with the addition of the childhood variables. See, for example, the regressions that relate to general health status, attitudes towards racial issues, smoking behaviour in adulthood and the probability of attending university. Overall, we are therefore confident that the variables do measure something meaningful about childhood, but that these variables are not important determinants of occupational sorting nor correlated with gender. We acknowledge that there may be other childhood variables not included in our dataset that may determine occupational sorting.

Gender differences in childhood variables across the birth cohorts
Another potential explanation for why we do not observe substantial roles of the childhood variables included in our estimation is that there may be large gender gaps in these measures, and that the gaps may also evolve over time. For instance, Baker and Milligan (2016) show that boys in the MCS receive fewer parental inputs and have lower cognitive test scores. To explicitly check this, we run supplementary linear regressions with each childhood variable as the dependent variable and the female dummy as the explanatory variable, with no other control variables. Appendix Table D.6 shows the estimation results from the regressions described above. Each cell in the table reports the estimated coefficient of the female dummy for each individual regression and for each cohort (NCDS, BCS and MCS, respectively).
In panel A "family environment at childhood", the majority of childhood variables show significant gender gaps. In panels B to G, we do observe that gender gaps exist across many of our childhood variables. However, it is worth noting that some gaps are in favour of males (negative values), while some gaps are in favour of females (positive values) with no particular patterns. For instance, some measures of cognitive skills show positive female coefficients, but others are negative. This is not different from previous literature, which agrees that boys generally do better at numeracy tasks while girls excel in literacy skills (see Cobb-Clark and Moschion (2017) for an extensive discussion). For parental beliefs about the child's future, we observe that for the NCDS, parents are biased towards boys. This is also the case for teacher's belief. 29 Nonetheless, this pattern of favouring boys then reverses when we check with the later cohorts (BCS and MCS). Instead, both parents and teachers become more optimistic towards girls' academic and career future. Over the cohorts, we also cannot pinpoint clear trends in the evolution of the female coefficient. Based on this finding, we would argue that the issue arising from existing gender gaps in childhood measures may not be as concerning.
Since gender gaps in teachers' beliefs do exist, one may want to know how influential a teacher's belief is on a child's perception of their future, and whether such influence may evolve over the decades. Ideally, we would like to run a simple regression with actual occupational achievement as the dependent variable and the teacher's occupational expectation as the explanatory variable. This set of variables exists only for the NCDS. Therefore, to be able to repeat the analysis across three birth cohorts, we have to use the educational variables as the alternative. In detail, we run a simple regression of educational attainment (an indicator for attaining college or above) on the teacher's educational expectation, including the female dummy and its interaction with the teacher's expectation. For the MCS, since we do not yet observe the actual educational outcome, the alternative dependent variable is the child's educational aspirations (whether they aspire to attain a college education or above). The result in Appendix Table D.7 shows that the teacher's educational expectation can explain a large share of the variation in the child's educational goal. But the size of the correlation reduces over the decades (0.693, 0.391, 0.203 for NCDS, BCS and MCS, respectively). For the earlier cohorts (NCDS and BCS), there is, in fact, no gender difference in the correlation of the teacher's expectation and the child's academic goal (that is, the interacted terms are insignificant). We observe around 8% gender difference only in the MCS.
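For concreteness, the regression described above can be written as follows. The notation is our own shorthand and is not taken from Appendix Table D.7: E_i denotes the indicator for attaining (or, for the MCS, aspiring to) college education or above, T_i the teacher's educational expectation, and Female_i the female dummy.

```latex
\[
E_i = \alpha + \beta\, T_i + \gamma\, \mathrm{Female}_i
      + \delta\, \left( T_i \times \mathrm{Female}_i \right) + \varepsilon_i
\]
```

Under this notation, \beta captures the association between the teacher's expectation and the child's outcome for males, and \delta captures how that association differs for females. An insignificant \delta corresponds to the statement that there is no gender difference in the correlation for the NCDS and BCS.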

Differential growth patterns by industry
In the UK, the services sector experienced significant growth between 1958 and 1970, and the landscape of the goods producing sector was shaped by both offshoring and automation, which contracted and changed the nature of work in terms of physicality (Autor and Dorn 2013; Autor et al. 2015; Lordan and Neumark 2018). We are interested in learning whether our findings are driven solely by changes in industries that have experienced high growth levels, or whether they hold consistently across diverse industries. We follow Olivetti and Petrongolo (2016) and run separate sub-sample analyses for the NCDS and BCS cohorts by goods producing sector and services sector. 30 The estimates are documented in Appendix Table D.8. Within sector, in general the childhood variables do not attenuate the female coefficient significantly. The few exceptions are the share of males and brains regressions for the BCS goods sector and the NCDS competitiveness regression for the services sector. We note that while there are differences in the female coefficient across the two sectors within the cohort data, both sectors mainly display the same gendered sorting patterns. However, over time, the change in the female coefficient for the goods sector is far greater, suggesting that trends in this industry away from physical strength may have greatly reduced aspects of gendered sorting.
30 Due to sample size limitation, we cannot produce estimates by one-digit industry codes. We are unable to include the MCS cohort as the MCS child's aspirations do not include industry. The sectors in the NCDS and BCS data are consistently coded according to the UK SIC-1992. Based on Olivetti and Petrongolo (2016), agriculture, hunting, forestry and fishing; mining, quarrying and utilities; manufacturing and construction are grouped as the goods producing sector. Wholesale and retail trade; hotels and restaurants; transport and storage; financial intermediation; business activities and real estate; public administration and defence; education; health and social work; and other services are classified together as the services sector.

Shifting occupational preferences
It is also interesting to consider whether it is changes in the preferences of males or of females that drive the conclusions. Appendix Table D.9 documents the average and standard deviation of our outcomes for each cohort, alongside the overall change that occurred between 1958 and 2000. Table D.9 highlights that, for the most recent cohort, it is males' increased propensity to pursue high income and competitive work that leads us to conclude that females' sorting in these domains is less comparable to that of their male peers than in the previous two cohorts. Across the British birth cohorts, females have sorted into occupations that have higher income and are more competitive, but not to the same extent as males. Moreover, males have also more regularly pursued jobs with high people content over time. Meanwhile, for all three cohorts, females more regularly choose jobs that are high in people content.
By contrast, females have continued to pursue jobs with high brawn content over time. Yet, males also regularly choose jobs that are high in brawn. Across cohorts, males have sorted into occupations with similar average weekly work hours (approximately 46 h), whereas females have increasingly sorted into occupations with higher average hours over time. British women in the recent cohorts have chosen occupations with less job flexibility, whilst across cohorts, men have continued to be in highly inflexible jobs. Overall, the comparison between the changes across cohorts and within cohorts suggests that the shifts in gender sorting patterns across cohorts are more marked for females than for males, with the exception of changes in aspirations for high income and competitive work by males in the most recent cohort.

Conclusions
Childhood factors, which encompass not only early skills (cognitive, non-cognitive, health), but also family environment, are known in the literature for their important role in shaping later achievements in life, including occupational choices. Moreover, the literature also documents gender gaps across several of these childhood factors (e.g. Buchmann and DiPrete 2008;Bertrand and Pan 2013;Cobb-Clark and Moschion 2017;Schleicher 2019). In this study, we begin by asking to what extent these important determinants can alter the role of gender in male and female sorting into different jobs within cohorts. Overall, we find that childhood factors, as measured by the variables that we capture in our data, do not noticeably change gendered sorting patterns, which are proxied by the coefficient on the female dummy, as we had anticipated. In contrast, by comparing the size of the coefficient in the same specification across birth cohorts, we observe that the female coefficient declines over the decades. We view this as highly suggestive evidence that it is temporal changes that matter the most in shaping these patterns.
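The within-cohort comparison summarised above can be illustrated with a minimal sketch: estimate the female coefficient with and without childhood controls and compare the two. The data below are synthetic and purely illustrative (they are not the NCDS, BCS or MCS data), the function and variable names are hypothetical, and plain OLS via NumPy stands in for whatever estimator and control set the actual analysis uses.

```python
import numpy as np

def female_coefficient(outcome, female, controls=None):
    """Return the OLS coefficient on the female dummy.

    An intercept is always included; `controls` is an optional
    (n, k) array of childhood variables added to the regression.
    """
    columns = [np.ones(len(outcome)), np.asarray(female, dtype=float)]
    if controls is not None:
        columns.append(np.asarray(controls, dtype=float))
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return beta[1]  # position 1 is the female dummy

# Synthetic, illustrative data only (assumption: controls are
# generated independently of gender, so little attenuation is expected).
rng = np.random.default_rng(0)
n = 5000
female = rng.integers(0, 2, size=n)
childhood = rng.normal(size=(n, 3))  # three hypothetical childhood controls
outcome = (
    0.6
    - 0.25 * female                               # assumed "true" gender gap
    + childhood @ np.array([0.05, 0.02, -0.03])   # assumed control effects
    + rng.normal(scale=0.2, size=n)
)

raw = female_coefficient(outcome, female)
adjusted = female_coefficient(outcome, female, childhood)
attenuation = raw - adjusted  # near zero when controls do not mediate the gap
```

In this sketch, a small `attenuation` mirrors the within-cohort finding that adding childhood variables leaves the female coefficient largely unchanged; repeating the same comparison separately by cohort would give the across-cohort contrast.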
We are aware that our childhood variables do not adequately capture every childhood factor that could potentially mediate gendered sorting. However, we would expect that some of the relevant omitted factors would be correlated with the variables that we do include, so their signal should be picked up in our regressions. We acknowledge, however, that we do not capture a universe of childhood variables that would allow us to rule out every childhood factor that can determine occupational sorting. We also note that we cannot pinpoint what exact temporal shift shaped sorting and would speculate that it is a multitude of structural shifts, such as the changing nature of tasks in many occupations, lower tolerance levels towards discrimination in the workplace, policy change, differential growth patterns across industries and the ability for women to delay their fertility.
Our analysis has also revealed several interesting stylised facts regarding gendered sorting over time. First, for all three cohorts, we find strong evidence of sorting along gendered lines, but this tendency has decreased substantively over time. That is, the gender gap has narrowed. Our analysis also reveals persistent gender gaps in the tendency to sort into occupations with the highest shares of males that have not changed over time. These jobs are often the golden pathway to C-suite positions and positions of power, and encompass science, technology and engineering posts as well as front office trading roles and politics. It may be tempting to conclude that the flatness in the gender gap in the tendency to sort into occupations with the highest share of males, particularly for children with the highest academic ability, reflects innate preferences. However, we note that over time both genders have significantly changed their tendency to sort into occupations that are high on people, brains and competitiveness content. 31 Some of this will be determined by labour markets (i.e. it is unsurprising that both genders sort towards jobs that are high in people content, given the growth in services and jobs that require interpersonal skills). Yet, we also view these changes as highly suggestive that preferences are socialised, rather than representing innate differences by gender.
While all eyes are normally on the tendency for females to change preferences, our analysis reveals that the preferences of males are contributing to stubborn gender gaps in traditionally male-dominated positions. Notably, males in the most recent cohort are aspiring to work in occupations with significantly higher levels of competitiveness and larger incomes than both previous cohorts and their current female peers. Therefore, even though the females born in 2000 have nearly closed the gender gap in terms of the hours and flexibility they are demanding, the type of work they are aspiring to sort into suggests that the gender pay gap may prevail unless the rewards given to different occupations change, or indeed preferences change for even younger cohorts.
31 Females have also changed their tendency to sort into jobs with high brawn content.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.