1 Introduction

Gender segregation across industrial sectors persists as a prevailing characteristic of the United Kingdom (UK)’s labour market (Government Equalities Office 2019; Irvine 2022), with the Covid-19 pandemic exacerbating this trend (Open Society Foundations 2020; Johnston 2021). Between 2005 and 2020, the share of women in total employment in the UK exceeded 70% in sectors such as education, health and households as employers,Footnote 1 while it was below 30% in sectors like agriculture, mining and quarrying, manufacturing, construction and transport (Table 1). To support equal treatment of workers in the workplace and improve gender diversity across industries, the UK government has implemented several reforms in the past decade, including the Equality Act 2010 (EA2010), which sets out several measures prohibiting, among others, gender discrimination in employment and pay.Footnote 2 Despite these initiatives leading to a more balanced participation rate (Office for National Statistics 2022), the overrepresentation of women in certain industrial sectors is responsible for the sorting of women across occupations and disparities in both employment opportunities and wages (Olsen et al. 2018; Razzu and Singleton 2018).Footnote 3

Table 1 Share of women, by sector and year

This paper uses the UK Labour Force Survey (LFS) quarterly data to investigate gender sectoral segregation in the UK between 2005 and 2020 by addressing two questions: (1) how gender sectoral segregation relates to the type of employment contracts (i.e. part-time, permanent, remote work, number of weekly working hours) and hourly wages; and (2) to what extent gender wage differentials differ in terms of observable and unobservable factors in female- and male-dominated sectors.

The first question is addressed through propensity score matching (PSM) by estimating the average differences in labour market outcomes between workers in female- and male-dominated sectors with similar observed socio-demographic and working characteristics.

To answer the second question, we first use the threefold Kitagawa (1955)-Blinder (1973)-Oaxaca (1973) (KBO) decomposition to explore differences in wages due to human capital and productivity, or unexplained factors, or their simultaneous effect. We then estimate a Mincer wage regression to explore the contribution of human capital and other observable skills. However, unobservable factors, e.g. behavioural traits such as self-esteem, ambition, competitiveness and willingness to make risky career choices (Gneezy et al. 2003; Gneezy and Rustichini 2004; Booth 2009; Bertrand 2011; Saccardo et al. 2018), may also contribute to driving wage differentials, so we retrieve predicted and residual wages from the Mincer regression. We also conduct a counterfactual analysis to study how predicted and residual wages would differ if workers (women in male- and female-dominated sectors and men in female-dominated sectors) had the same characteristics as men in male-dominated sectors.

Our main findings can be summarised in three points. First, gender-based sectoral segregation matters in the disparity of contractual opportunities, even controlling for occupational composition. Workers in female-dominated sectors are more likely to be segregated into atypical contracts (part-time), to work fewer hours and less from home and to earn less than their counterparts in male-dominated sectors. The penalty for men working in those sectors is even larger than for women. Second, human capital and background characteristics play a minor role in explaining gender wage differentials. Instead, most of the difference is driven by persistent “discriminatory constraints” (Altonji and Blank 1999), such as barriers in the labour market for women due to the effects of discrimination (i.e. unequal pay for equally qualified workers) and unobserved differences in productivity and tastes. Third, wage differentials in female- and male-dominated sectors would shrink if workers had the same potential and residual wages as men in male-dominated sectors. However, women in female-dominated sectors would always earn less than men in high-paid jobs due to the negative selection in the labour market.

While most of the literature focuses on the role of gender segregation in explaining the gender wage gap by looking at occupational and job dimensions (Blackburn et al. 1993; Watts 1992, 1995, 1998; Petrongolo 2004; Cortés and Pan 2018; Folke and Rickne 2022; Scarborough et al. 2021),Footnote 4 our work is closely related to the scant literature on the role of gender segregation across sectors (Moir and Smith 1979; Kreimer 2004; Campos-Soria and Ropero-García 2016; Kamerāde and Richardson 2018; Scarborough et al. 2021). These papers highlight how gender division of labour is still embedded in sectors (Carvalho et al. 2019), which is considered a structural factor shaping the differential effects on labour markets caused by economic recessions (Rubery 2010; Rubery and Rafferty 2013; Kamerāde and Richardson 2018) and the business cycle (Hoynes et al. 2012; Périvier 2014; Doepke and Tertilt 2016; Razzu et al. 2020; Piłatowska and Witkowska 2022). For instance, Olivetti and Petrongolo (2014) emphasise the interplay between gender trends and the evolution of the industry structure. Therefore, an understanding of the sectoral composition of the workforce is necessary to assess the trajectory of male and female employment and wages (Moir and Smith 1979).Footnote 5

Our contribution focuses on three main points. First, an innovative feature of this study is that it goes beyond the standard segregation index to identify female- and male-dominated sectors by further classifying sectors into high- and low-segregated. This index allows us to measure the degree of imbalance and the intensity of differences in gender distribution of a sector over time. Second, our analysis is not limited to the KBO results that show which effect drives wage differentials. Instead, we disentangle the effect of differences in average characteristics of male and female employees and the effect of selection into different sectors of women and men on the gender wage differentials. Third, since unobserved confounders may enter the decision to choose a specific (female or male) sector, we address this potential selection bias by calculating the predicted and residual wages from the Mincer regression. This approach is similar to the method used in the literature on migration to calculate individual potential earnings (Parey et al. 2017) and capture the part of earnings that is uncorrelated to observed skills (Gould and Moav 2016; Borjas et al. 2019).Footnote 6

Finally, we extend the findings of the 1980s literature on the issue of “comparable worth”Footnote 7 (Treiman and Hartmann 1981; Maahs et al. 1985; Bielby and Baron 1986; Aaron and Lougy 1987). This literature found that the disproportion of women in female-dominated occupations is associated with lower pay in that occupation, on average, for all employees, men and women alike (Treiman and Hartmann 1981; Killingsworth 1987). However, the negative effect on the wage of being in such jobs is more significant for men than women (Roos 1981), even after controlling for relevant worker and job characteristics, including industry effects (Johnson and Solon 1984). We find that these negative results (i.e. worse job characteristics and wages for both men and women in female-dominated environments) are confirmed even when looking at industrial sectors. However, we find a more pronounced wage differential among women than men in female-dominated sectors at the top of the wage distribution.

The rest of the paper is structured as follows. Section 2 discusses the measures of gender sectoral dominance and segregation. Section 3 describes the data and reports some descriptive analysis. Section 4 presents the empirical strategy. Section 5 reports the estimated results. Section 6 concludes.

2 Conceptual framework

2.1 Gender sectoral segregation index

Gender sectoral segregation arises when a disproportionate share of men or women exists in a sector of the economy, independent of the nature of the job allocation (Watts 1998). In this section, we introduce the notion and measures of gender segregation in industrial sectors.

A sector is female-dominated (fml-dom) if the share of all employed women who work in that sector is higher than the corresponding share of all employed men; it is male-dominated (ml-dom) otherwise. In formulae, the classification criterion for gender sectoral dominance is as follows:

$$\begin{aligned} \text {Sectoral Dominance} = {\left\{ \begin{array}{ll} \text {Female} &{} \text { if } \frac{W_{jt}}{W_{t}} > \frac{M_{jt}}{M_{t}}\\ \text {Male }&{} \text { otherwise } \end{array}\right. } \end{aligned}$$
(1)

where \(W_{jt}\) and \(M_{jt}\) are, respectively, the total number of women and men employed in sector j (SIC 1-digit) at time t; \(W_{t}\) and \(M_{t}\) are, respectively, the total number of female and male workers at time t.

The classification criterion defined in (1) uses the “majority voting” rule—i.e. the group with the largest number of members (either male or female) represents the sector.Footnote 8
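As an illustration, criterion (1) can be sketched in a few lines of Python. The sector labels and head counts below are made up for the example and are not LFS figures; the function name `sectoral_dominance` is likewise hypothetical.

```python
def sectoral_dominance(women, men):
    """Classify each sector as 'fml-dom' or 'ml-dom' following Eq. (1).

    women, men: dicts mapping a sector label to the number of women/men
    employed in that sector in a given period (hypothetical counts).
    """
    total_w = sum(women.values())  # W_t
    total_m = sum(men.values())    # M_t
    labels = {}
    for sector in women:
        share_w = women[sector] / total_w  # W_jt / W_t
        share_m = men[sector] / total_m    # M_jt / M_t
        # Ties fall into the "otherwise" branch of Eq. (1).
        labels[sector] = "fml-dom" if share_w > share_m else "ml-dom"
    return labels


# Made-up employment counts for three sectors in one period.
women = {"education": 700, "construction": 100, "retail": 200}
men = {"education": 300, "construction": 500, "retail": 200}
labels = sectoral_dominance(women, men)
```

Note that the comparison is between shares of each gender's total employment, not between raw head counts within the sector.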

Based on classification criterion (1), we define the Sectoral Segregation Index (SSI\(^s_t\)) as a measure of the degree of disproportion in the distributions of men and women in female- and male-dominated sectors at each time period. The index is based on the well-known Index of Dissimilarity,Footnote 9 which is used in labour (Watts 1998) and education economics (Zoloth 1976; James and Taeuber 1985) to study group composition and quantify the segregation among two groups (Cortese et al. 1976).

SSI\(^s_t\) is calculated for the two gender-dominated sectors (SSI\(^{{\text {fml-dom}}}\) and SSI\(^{{\text {ml-dom}}}\)) as follows:

$$\begin{aligned} {\text {SSI}}^s_{t} = \frac{1}{2}\sum _{j\in J_s} \bigg |\frac{W_{jt}}{W_{t}} - \frac{M_{jt}}{M_{t}} \bigg | \,\,\, \text { for all } t \text { and } s = \{{\text {ml-dom}}, {\text {fml-dom}}\} \end{aligned}$$
(2)

where \(J_s\) is the set of sectors in a male-dominated or female-dominated group. The index informs on time-varying group imbalance within gender-dominated sectors and ranges between 0 and 1. Large values of SSI\(^{{\text {fml-dom}}}\) (or SSI\(^{{\text {ml-dom}}}\)) flag large gender imbalance towards women (or men) and indicate the proportion of women (or men) that would have to either leave or enter each sector to avoid gender sectoral segregation. The value of the index remains unchanged when transferring workers between sectors within each gender group.Footnote 10
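A minimal Python sketch of Eq. (2) follows. It assumes the set \(J_s\) is supplied as a list of sector labels; the counts and the function name `ssi` are hypothetical and serve only to show the arithmetic.

```python
def ssi(women, men, sectors):
    """Eq. (2): half the sum of |W_jt/W_t - M_jt/M_t| over the sectors in J_s.

    women, men: dicts of employment counts by sector for one period.
    sectors: the labels belonging to J_s (one gender-dominated group).
    """
    total_w = sum(women.values())  # W_t, over all sectors
    total_m = sum(men.values())    # M_t, over all sectors
    return 0.5 * sum(abs(women[j] / total_w - men[j] / total_m) for j in sectors)


# Made-up counts; totals are taken over all sectors, the sum only over J_s.
women = {"education": 700, "construction": 100, "retail": 200}
men = {"education": 300, "construction": 500, "retail": 200}
ssi_fml_dom = ssi(women, men, ["education"])
ssi_ml_dom = ssi(women, men, ["construction", "retail"])
```

Repeating the calculation for each period gives the time series of the index for each group.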

Table 2 List of high- and low-segregated sectors

Because the difference in the share of female and male employees can be extremely low in some sectors or high in others, we define an additional index that allows us to distinguish between high- and low-segregated sectors. Specifically, we rank sectors from the least to the most segregated according to their average segregation, based on SSI\(^s_{t}\), and classify them as low- or high-segregated sectors accordingly.

Our analysis follows the UK Standard Industrial Classification (SIC) at one digit, which is used for business establishments by type of economic activity. We consider the 19 sectors listed in Table 2 as female- or male-dominated sectors and by the degree of segregation (high and low).Footnote 11

Figure 1 shows the evolution of SSI\(^s_t\) over 2005–2020 in the UK economy (upper graph), in low-segregated (bottom left graph) and high-segregated sectors (bottom right graph). The index follows a downward trajectory in male-dominated sectors (from 0.182 in 2005 to 0.158 in 2020), meaning that sectoral segregation has decreased. Instead, the segregation remains stable at around 0.17 over time in female-dominated sectors. In low-segregated sectors, gender segregation is larger in female-dominated than male-dominated sectors (with mean values of 0.021 and 0.011, respectively). Instead, in high-segregated sectors, gender segregation is higher in male-dominated sectors. However, it decreased over time (from 0.35 in 2005 to 0.30 in 2020) and dropped below the level of segregation in female-dominated sectors after 2015.

3 Data and descriptive statistics

3.1 Data source and characteristics of the sample

Fig. 1 Evolution of SSI index over fiscal year

Our analysis is based on the Labour Force Survey (LFS) quarterly data from the UK Office for National Statistics (ONS). The LFS is the most extensive household study in the UK, providing a comprehensive source of data on workers and the labour market. The analysis spans the fiscal years from 2005 to 2020. This period covers widespread enforcement of equality legislation, the 2007–2008 financial and economic crisis, and the recent changes caused by the Covid-19 outbreak.Footnote 12

We focus on working-age (16–64) employees, i.e. people who are in employment and paid a wage by an employer for their work.Footnote 13 Our final sample consists of 334,055 female workers and 307,245 male workers. The dataset includes variables on a wide range of (i) demographic characteristics (gender, age, country of birth, nationality, ethnicity, religion); (ii) socio-economic factors (presence of dependent children, marital status, education, experience, full/part-time job, remote work, public sector, training opportunities, sectors and occupations); (iii) geographical information on residence and working region. Information on wages in the LFS is the self-reported gross weekly pay for the reference week.Footnote 14 The number of hours is the total usual hours worked in the main job per week, including usual hours of paid overtime when applicable.

Table 3 reports the summary statistics of the main variables by gender. Regarding demographic characteristics, there is a substantial prevalence of UK natives in both male and female samples (87%), followed by non-European Economic Area (non-EEA) immigrants and EEA citizens. The average age is similar for men and women (around 40 years). On average, women in the sample are as educated as men (13 years of education) and have slightly less experience (22.99 years of experience against 23.46). More than half of the women and men are either married or cohabiting. In addition, 46% of women and 42% of men have dependent children under 18 years.

Table 3 Summary statistics for employed workers

Segregation variables show that 69% of women are employed in female-dominated sectors compared to 35% of men. Around 26% of both women and men work in low-segregated sectors, while the rest work in highly segregated sectors.

Focusing on the outcomes of interest, both female and male employees have the same share of permanent jobs (around 94%). Women’s wages (in logarithm) are, on average, lower than men’s (2.41 against 2.59). A small share of employees work from home (5% of the women against 8% of the men). On average, women work 31 hours per week and 42% work part-time, while men work 40 hours per week and 10% work part-time. A more detailed investigation of the reasons for working part-time in Table 4 highlights that the share of women who did not want a full-time job is much larger than the share who could not find one (78% against 10%). In contrast, 39% of men did not want a full-time job and 26% could not find one. For women, family duties and domestic commitments are among the main reasons for not wanting a full-time job (48% and 28%, respectively). Instead, men do not want a full-time job mainly because they are financially secure and work because they want to (24%) or for other reasons (34%).

Table 4 Reasons for part-time work

3.2 Shift-share decomposition of employment

This section provides a descriptive picture of gender trends in employment in male- and female-dominated sectors. We adopt a revised version of Olivetti and Petrongolo’s (2016) shift-share decomposition.Footnote 15

In our analysis, the growth of female employment share is decomposed into (i) the change in the total employment share of the sector (between component) and (ii) the change in gender composition within the sector (within component):

$$\begin{aligned} \begin{aligned} \Delta e_{st}^g= \underbrace{\sum _{j=1}^{J_s} \alpha _{jt}^g\Delta e_{jt}}_{\textrm{Between}\text {-}\textrm{sector}} + \underbrace{\sum _{j=1}^{J_s} \alpha _{jt} \Delta e_{jt}^g}_{\textrm{Within}\text {-}\textrm{sector}} \,\, \text { for } s=\{{\text {fml-dom}}, {\text {ml-dom}}\}, g=\{{\text {fml, ml}}\} \end{aligned} \end{aligned}$$
(3)

where \(\Delta e_{st}^g = \frac{E_{st}^g}{E_{st}} -\frac{E_{t_0}^g}{E_{t_0}}\) is the difference in the share of female/male employment between the base period \(t_0\) and the current period t; \(\Delta e_{jt}= \frac{E_{jt}}{E_{t}} -\frac{E_{jt_0}}{E_{t_0}}\) is the difference in the share of total employment in sector j between \(t_0\) and t; \( \Delta e_{jt}^g= \frac{E_{jt}^g}{E_{jt}} - \frac{E_{jt_0}^g}{E_{jt_0}}\) is the difference in the share of female/male employment in sector j; \(\alpha _{jt}^g = \frac{\left( e_{jt_0}^g+e_{jt}^g\right) }{2}\) and \(\alpha _{jt} = \frac{\left( e_{jt_0}+e_{jt}\right) }{2}\) are decomposition weights (i.e. the average share of female employment in sector j and the average share of sector j, respectively). The reference year is the first available year in the dataset (\(t_0=2005\)); s stands for sectors classified as female-/male-dominated according to Eq. (1).
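The mechanics of Eq. (3) can be sketched in Python as follows. The shares are invented for the example, and `shift_share` is a hypothetical helper. With the averaged weights defined above, the two components add up exactly to the change in \(\sum_j e_{jt} e_{jt}^g\), which the last line computes directly as a check.

```python
def shift_share(e0, e1, g0, g1):
    """Between and within components of Eq. (3) for one group of sectors.

    e0, e1: share of each sector in total employment at t0 and t.
    g0, g1: share of gender g within each sector at t0 and t.
    """
    between = 0.0
    within = 0.0
    for j in e0:
        alpha_g = (g0[j] + g1[j]) / 2  # average gender share in sector j
        alpha = (e0[j] + e1[j]) / 2    # average employment share of sector j
        between += alpha_g * (e1[j] - e0[j])  # change in sector size
        within += alpha * (g1[j] - g0[j])     # change in gender composition
    return between, within


# Made-up shares for two sectors of one gender-dominated group.
e0 = {"education": 0.10, "health": 0.12}
e1 = {"education": 0.11, "health": 0.14}
g0 = {"education": 0.70, "health": 0.75}
g1 = {"education": 0.72, "health": 0.78}

between, within = shift_share(e0, e1, g0, g1)
total_change = sum(e1[j] * g1[j] - e0[j] * g0[j] for j in e0)
```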

Figure 2 displays the shift-share decomposition of female and male employment (left and right graphs, respectively). The graphs show the difference in employment with respect to the base year (2005) and its decomposition into between and within components (respectively, dashed and dotted lines) in female- and male-dominated sectors (crosses and circles).

Fig. 2 Shift-share decomposition, by gender. Note: The graph shows the difference in employment in the comparison year with respect to the base year (i.e. the fiscal year 2005). The overall change in employment is shown in solid line and its decomposition into the ‘between’ and ‘within’ components, respectively, with dashed and dotted lines. The cross marks the components for female-dominated sectors and the circle the components for male sectors. The ‘between’ component (BTW) captures the change due to changes in the sectoral structure of the economy; the ‘within’ component (WTHN) reflects changes in female composition within sectors

The 2007–2008 financial and economic crisis hit male employment hard. In particular, male employment share (between component) and male composition (within component) in male-dominated sectors suddenly decreased. Conversely, the crisis stimulated female employment, not only in the total employment share (both between components of female- and male-dominated sectors) but also in the composition of women in female-dominated sectors (within component). The Covid-19 outbreak arrested overall female employment in both male- and female-dominated sectors (between components). As expected, this led to a reduction in the composition of women in female-dominated sectors (within component) but to a sharp increase in the composition of women in male-dominated sectors.Footnote 16

Overall, female composition (within component) increased gradually in male-dominated sectors after 2010. It seems that EA2010 stimulated female employment from the demand side. This may suggest that a higher proportion of women were employed within each male-dominated sector at the expense of decreasing male employment.

4 Empirical strategy

4.1 Estimating gender sectoral segregation on employment contracts and wages

In this section, we outline the methodology employed to compare the average difference in labour market outcomes (permanent jobs, part-time jobs, working hours, remote work and hourly wages) in male- and female-dominated sectors among workers with similar observable skills and socio-demographic characteristics. We employ the propensity score matching (PSM) method for this purpose. This approach constructs a counterfactual group by matching workers in female-dominated sectors with those working in male-dominated sectors based on the same propensity score (i.e. the estimated probability of being employed in a female-dominated sector conditional on the observed characteristics). The estimand of interest is the Average Treatment effect on the Treated (ATT) which is computed as the average of the difference between the observed outcomes (\(Y_{1}\)) and the imputed outcomes (\(Y_{0}\)) for each worker in female-dominated sectors, \({\mathbb {E}}\{Y_{1i}-Y_{0i}|\text {Working in fml-dom sector}\}\). The imputed outcomes for each worker in female-dominated sectors are calculated by using the average observed outcomes of similar individuals working in male-dominated sectors. The underlying assumption is that those who choose to work in female- and male-dominated sectors only differ in the endowment of their observed skills and human capital accumulation.
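To make the ATT computation concrete, the sketch below matches each treated worker to the single control with the closest propensity score and averages the outcome differences. One-to-one nearest-neighbour matching with replacement is an assumption made for illustration; the text does not state which matching algorithm is used, and the scores and log wages are invented.

```python
def att_nearest_neighbour(treated, controls):
    """ATT from one-to-one nearest-neighbour matching with replacement.

    treated, controls: lists of (propensity_score, outcome) pairs for
    workers in female- and male-dominated sectors, respectively.
    """
    diffs = []
    for score, outcome in treated:
        # The closest control on the propensity score supplies the imputed Y0.
        _, y0 = min(controls, key=lambda c: abs(c[0] - score))
        diffs.append(outcome - y0)
    return sum(diffs) / len(diffs)


# Made-up (propensity score, log hourly wage) pairs.
treated = [(0.80, 2.30), (0.60, 2.40), (0.40, 2.50)]
controls = [(0.75, 2.50), (0.55, 2.55), (0.35, 2.60), (0.10, 2.90)]
att = att_nearest_neighbour(treated, controls)
```

In this toy sample the control with score 0.10 is never used, because no treated worker has a comparable score.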

In Sect. 5.1, we compare the average conditional outcomes between gender sectoral dominance for the full sample (male and female workers together), male sample and female sample. We estimate the propensity scores via a Probit regression. For the selection of the covariates to calculate the propensity scores, we consider all factors associated with working in a female-dominated sector (including interactions and squares), and then we use an automatic selection procedure, i.e. the Least Absolute Shrinkage and Selection Operator (LASSO). Table A3 in the Appendix reports the selected covariates from the penalised regressions. Then, we conduct a sensitivity analysis to test whether the balancing property of the covariates holds before and after matching. The covariates are balanced if the standardised bias after matching is within \(\pm 5\%\) (Rosenbaum and Rubin 1985). If this condition is satisfied, the matching method successfully builds a meaningful comparison group.
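The balance check can be illustrated with one common form of the standardised bias: the difference in covariate means divided by the square root of the average of the two group variances, expressed as a percentage. The exact variant used in the estimation is not stated in the text, so this form is an assumption, and the ages below are invented.

```python
from statistics import mean, variance


def standardised_bias(x_treated, x_control):
    """Standardised bias (%) of one covariate between two groups."""
    # Square root of the average of the two sample variances.
    pooled_sd = ((variance(x_treated) + variance(x_control)) / 2) ** 0.5
    return 100 * (mean(x_treated) - mean(x_control)) / pooled_sd


# Made-up ages of matched workers in the two types of sectors.
age_fml_dom = [38, 41, 40, 43, 39]
age_ml_dom = [39, 40, 41, 42, 40]

bias = standardised_bias(age_fml_dom, age_ml_dom)
balanced = abs(bias) <= 5  # the 5% threshold mentioned in the text
```

In this toy sample the covariate would fail the check, which would call for revising the propensity score specification or the matching procedure.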

4.2 Estimating wages in gender-specific dominated sectors

We now focus on the gendered differences in hourly wages in male- and female-dominated sectors based on observable and unobservable characteristics. For this purpose, we first perform the counterfactual Kitagawa–Blinder–Oaxaca (KBO) decomposition (Kitagawa 1955; Blinder 1973; Oaxaca 1973), which is an established method in the literature on discrimination (e.g. Mueller and Plug 2006; Blau and Kahn 2017) to study the difference in wages between women and men. We then run a Mincer wage regression to explore the role of the average characteristics of male and female employees in female- and male-dominated sectors and to obtain the predicted and residual wages.

4.2.1 Decomposing the gender wage differentials

In this analysis, we use the threefold version of the KBO decomposition (Jann 2008), which decomposes the average difference of hourly wages (in logarithm) between men and women working in female- and male-dominated sectors into three components as followsFootnote 17:

$$\begin{aligned} \underbrace{{\mathbb {E}}(y^s_{ml}) -{\mathbb {E}} (y^s_{fml})}_{\textrm{overall}\ \textrm{difference}}&= \underbrace{\left[ {\mathbb {E}}({\textbf{X}}^s_{ml}) -{\mathbb {E}}({\textbf{X}}^s_{fml})\right] ' \varvec{\beta }^s_{fml}}_{\textrm{endowment} \ \textrm{effect}} \nonumber \\&\quad + \underbrace{{\mathbb {E}}\left( {\textbf{X}}^s_{fml}\right) ' (\varvec{\beta }^s_{ml} -\varvec{\beta }^s_{fml})}_{\textrm{coefficients}\ \textrm{effect}} \nonumber \\&\quad + \underbrace{\left[ {\mathbb {E}}({\textbf{X}}^s_{ml})-{\mathbb {E}} ({\textbf{X}}^s_{fml})\right] '\left( \varvec{\beta }^s_{ml} -\varvec{\beta }^s_{fml}\right) }_{\textrm{interaction} \ \textrm{effect}} \end{aligned}$$
(4)

where \({\textbf{X}}\) is a vector containing the covariates, such as socio-demographic variables, human capital variables and work-related variables; and \(\varvec{\beta }\) is a vector of slope parameters and the intercept; fml stands for women and ml for men, and \(s=\{{\text {fml-dom}}, {\text {ml-dom}}\}\).

The first component explains observable group differences in the predictors, such as background and human capital characteristics of workers (endowment effect). This effect quantifies the expected change in women’s wages if they had men’s characteristics. A negative endowment effect shows that female workers possess better predictors than their male counterparts.

The second term explains differences in the coefficients, including the intercept, that arise from discrimination—i.e. unequal pay for equally qualified workers (Blau and Kahn 2017)—or cannot be explained by differences in the observed factors (coefficient effect). Specifically, the coefficient effect measures the expected change in the average wage of women if they had the coefficients of men. The intercept included in the effect captures the contribution of unobservable characteristics (Cotton 1988)—e.g. behavioural traits, such as self-esteem, ambition, competitiveness and the willingness to take risky career choices (Gneezy et al. 2003; Gneezy and Rustichini 2004; Bertrand 2011; Saccardo et al. 2018). A negative intercept term is interpreted as “ongoing discriminatory constraints”, such as barriers in the labour market for the minority group due to the effects of discrimination and unobserved differences in productivity and tastes (Altonji and Blank 1999). When the overall coefficient effect is positive, women would have higher average wages if paid like men.

The third component explains the coexistence of differences in the endowments and coefficients between the two groups (interaction effect). If the interaction effect is positive, women have a “double disadvantage” because they have smaller coefficients than men when they have worse predictors; if it is negative, differences in coefficients and covariate levels offset each other (Jann 2018).
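The three terms of Eq. (4) can be computed directly from covariate means and estimated coefficients, as in the Python sketch below. The numbers are hypothetical (a constant, years of education and years of experience) and `kbo_threefold` is an illustrative helper, not the routine used for the estimates. The final line recomputes the overall gap so that the sum of the three effects can be checked against it.

```python
def kbo_threefold(x_ml, x_fml, b_ml, b_fml):
    """Threefold decomposition of Eq. (4) from covariate means and coefficients.

    Each argument is a list aligned on the same covariates; the first entry
    is the constant (mean 1), so the intercept enters the coefficients effect.
    """
    endowment = sum((xm - xf) * bf for xm, xf, bf in zip(x_ml, x_fml, b_fml))
    coefficients = sum(xf * (bm - bf) for xf, bm, bf in zip(x_fml, b_ml, b_fml))
    interaction = sum(
        (xm - xf) * (bm - bf) for xm, xf, bm, bf in zip(x_ml, x_fml, b_ml, b_fml)
    )
    return endowment, coefficients, interaction


# Hypothetical means (constant, education, experience) and coefficients.
x_ml = [1.0, 13.0, 23.5]
x_fml = [1.0, 13.2, 23.0]
b_ml = [1.20, 0.070, 0.020]
b_fml = [1.10, 0.065, 0.018]

endowment, coefficients, interaction = kbo_threefold(x_ml, x_fml, b_ml, b_fml)
gap = sum(x * b for x, b in zip(x_ml, b_ml)) - sum(x * b for x, b in zip(x_fml, b_fml))
```

Because OLS with an intercept fits the sample mean exactly, the gap in mean predicted wages coincides with the gap in mean observed log wages.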

After assessing which effect drives the wage differences, in the following paragraphs we investigate how each observable factor and the unobservable characteristics contribute to explaining the differences in wages between women and men in female- and male-dominated sectors.

4.2.2 The role of human capital

We use the Mincer regression to analyse the association of workers’ human capital and observable skills with wage differentials between sectors and genders. The estimating equation is as follows:

$$\begin{aligned} \textbf{y} = \textbf{X}\varvec{\beta } + \delta _t +\varvec{\epsilon } \end{aligned}$$
(5)

where \(\textbf{y} \) is hourly wages in logarithm; \(\textbf{X}\) is \(N\times k\) matrix of control variables (i.e. socio-demographic, human capital and work-related variables); and \(\delta _t\) are the time fixed effects. Equation (5) is estimated using OLS.

The set of controls includes three groups of variables as follows. Socio-demographic variables include age and its square, nationality, ethnicity, religion, being in a stable relationship, having dependent children and the interaction of the last two. Human capital variables are education (low, intermediate and higher education), experience and its square, years in education and its square and training offered by the current employer.Footnote 18 Work-related variables include a dummy for female-dominated sectors, a dummy for low gender sector segregation, a dummy for working in the public sector and the type of occupation. Working region dummies are included.Footnote 19

Mincer regression is based on observable characteristics so as to “hold constant” individual factors that affect wages. However, this specification may not capture the selection of workers into sectors based on some relevant unobservable characteristics (e.g. self-esteem, ambition, competitiveness, risk aversion) that may influence wage differentials. In the next section, we address the possible bias that arises from omitting these unobservable factors.

4.2.3 Predicted and residual wages

We use the estimates of Mincer regression from Sect. 4.2.2 to measure how the selection on observable and unobservable characteristics shapes the difference in wages for men and women in female- and male-dominated sectors. First, we calculate the estimated returns to construct predicted wages, which measure the individual wage potential based on observable factors (Parey et al. 2017). Second, we follow Borjas et al. (2019) to shed light on the role of unobservable characteristics in the selection process by calculating residual wages, which capture the part of wages uncorrelated with workers’ skills.

We consider four sub-groups from our sample of workers: men in male-dominated sectors (ml, ml-dom), women in male-dominated sectors (fml, ml-dom), men in female-dominated sectors (ml, fml-dom) and women in female-dominated sectors (fml, fml-dom). In addition, we conduct a counterfactual exercise to examine the trajectory of wage potentials and residuals for each sub-group if workers had the same estimated coefficients of men working in male-dominated sectors. In formulae,

$$\begin{aligned}&\hat{\textbf{y}}^c_{g, {\text {gdom}}}= {\textbf{X}}_{g, {\text {gdom}}} \hat{\varvec{\beta }}_{{\text {ml}}, {\text {ml-dom}}}\end{aligned}$$
(6)
$$\begin{aligned}&\hat{\textbf{u}}^c_{g, {\text {gdom}}}= \textbf{y}_{g, {\text {gdom}}} -\hat{\textbf{y}}^c_{g, {\text {gdom}}} \end{aligned}$$
(7)

where \(g=\{{\text {ml, fml}}\}\) is the gender of the worker, and \({\text {gdom}}=\{{\text {ml-dom, fml-dom}}\}\) is the gender dominance of the sector. This analysis allows us to compare how predicted and residual wages would differ if workers (women in female- and male-dominated sectors and men in female-dominated sectors) had the same estimated coefficients as the most advantaged group, i.e. men in male-dominated sectors.
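The counterfactual exercise can be sketched as follows, assuming the Mincer coefficients have already been estimated for each sub-group. All values are hypothetical: two women in female-dominated sectors, with a constant, years of education and years of experience as covariates. The helper names are illustrative.

```python
def predict(x_rows, beta):
    """Predicted log wages X * beta for a list of covariate rows."""
    return [sum(x * b for x, b in zip(row, beta)) for row in x_rows]


def residuals(y, y_hat):
    """Residual log wages: observed minus predicted."""
    return [yi - yh for yi, yh in zip(y, y_hat)]


# Hypothetical sub-group (fml, fml-dom): constant, education, experience.
x_sub = [[1.0, 12.0, 10.0], [1.0, 16.0, 20.0]]
y_sub = [2.20, 2.70]  # observed log hourly wages

beta_own = [1.10, 0.065, 0.018]       # hypothetical own-group estimates
beta_ml_mldom = [1.20, 0.070, 0.020]  # hypothetical estimates for (ml, ml-dom)

y_hat = predict(x_sub, beta_own)          # actual predicted wages
y_hat_c = predict(x_sub, beta_ml_mldom)   # counterfactual predicted wages
u_hat_c = residuals(y_sub, y_hat_c)       # counterfactual residual wages
```

The workers keep their own covariates; only the coefficients are swapped for those of men in male-dominated sectors.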

Predicted and residual wages are sorted and used to construct the cumulative distribution functions (CDFs) by gender in female- and male-dominated sectors. Then, we compare the CDFs of men and women between and within gender sectoral dominance. The equality of the distributions of the (actual and counterfactual) predicted and residual wages among the four sub-groups is tested by using the nonparametric Kolmogorov–Smirnov (K–S) test.
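The statistic behind the two-sample K–S test is the largest vertical distance between the two empirical CDFs. The sketch below computes that distance only; the associated p-value is not computed here, and the wage values are invented for the example.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # Empirical CDFs are step functions, so it is enough to compare them
    # at every observed value in the pooled sample.
    for x in sorted(set(a) | set(b)):
        f_a = sum(1 for v in a if v <= x) / len(a)
        f_b = sum(1 for v in b if v <= x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d


# Made-up predicted log wages for two sub-groups.
wages_fml = [2.0, 2.2, 2.4, 2.6]
wages_ml = [2.3, 2.5, 2.7, 2.9]
d = ks_statistic(wages_fml, wages_ml)
```

A value of zero means the two empirical distributions coincide, and a value of one means they do not overlap at all.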

5 Estimation results

5.1 Estimation results for the PSM on contracts and wages

In this section, we present the main results of the PSM by looking at three different samples (i.e. all workers, men and women). Working in a female-dominated sector is the treatment variable. Table 5 reports the ATT for each labour market outcome of interest, i.e. having a permanent job, part-time work, number of hours (in logarithm), remote work and wage (in logarithm).Footnote 20

Table 5 Propensity score matching

The first result of the analysis is that contractual features usually associated with female workers are more common in female-dominated sectors, even among men. That is, compared to their peers in male-dominated sectors, workers in female-dominated sectors have, on average, fewer permanent positions (0.947 against 0.956), work more part-time (0.35 against 0.267), work fewer hours (3.404 against 3.475) and work less from home (0.037 against 0.090). This result remains valid even when looking at male and female workers separately. Specifically, if men and women in a female-dominated sector were hired in a male-dominated sector, they would have more permanent positions (0.9 p.p. and 0.7 p.p., respectively), would work more hours (7.7 and 6.3 p.p.), less part-time (8.4 and 7.5 p.p.) and more from home (4.3 and 5.9 p.p.). All ATT estimates are significantly different from zero at a 1% significance level.

The second main result is that there is a higher penalty for men than women working in female-dominated sectors, given the larger magnitudes of their ATT estimates (all significant at 1% level). This result is also confirmed when looking at wage differentials between female- and male-dominated sectors. Men in female-dominated sectors earn, on average, 15.4 p.p. less than their male peers in male-dominated sectors. Instead, women in female-dominated sectors earn, on average, 12.6 p.p. less than their female counterparts in male-dominated sectors. Overall, any worker in female-dominated sectors would be paid 13.6 p.p. more if employed in male-dominated sectors. These results are consistent with the findings of the “comparable worth” literature, that is, jobs dominated by women pay, on average, less to all employees (Treiman and Hartmann 1981; Killingsworth 1987), and the effect on wages in such jobs is more negative for men than women (Roos 1981; Johnson and Solon 1984).

The sensitivity analysis (Fig. B1 in the Appendix) confirms that the balancing property is satisfied for all samples since all covariates are well balanced. Therefore, the matching method effectively built a valid control group.
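One common way to check covariate balance after matching is the standardised mean difference between treated and matched control units. The sketch below is an assumption about how such a check could look, not a description of how Fig. B1 was produced; the ages are hypothetical, and the 0.1 threshold is a widely used rule of thumb rather than a value taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def standardised_difference(treated_values, control_values):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated_values) + variance(control_values)) / 2)
    return (mean(treated_values) - mean(control_values)) / pooled_sd


# Hypothetical ages of treated workers and of their matched controls.
age_treated = [34, 41, 29, 50, 38]
age_matched_controls = [35, 40, 30, 49, 39]

smd = standardised_difference(age_treated, age_matched_controls)
# Rule of thumb (an assumption here): |SMD| below 0.1 indicates good balance.
print(f"Standardised mean difference for age: {smd:.3f}")
print("balanced" if abs(smd) < 0.1 else "imbalanced")
```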

Overall, our analysis suggests that gender sectoral segregation is a relevant factor in explaining observed differences in employment contracts (i.e. part-time, permanent, remote work, number of weekly working hours) and wage differentials.

5.2 Estimation results for wages

5.2.1 Results for the KBO

The evolution of the three components of the KBO decomposition and their sum over time is shown in Fig. 3. Women are contrasted with men within the same gender-dominated sector. The dashed line represents the coefficient effect, the long-dashed line the endowment effect and the dotted line the interaction effect. The solid line is the sum of the three effects and reveals their overall contribution (for the contribution of each characteristic, see Tables A5 and A6 in the Appendix).

Fig. 3

KBO decomposition, by gender sectoral dominance. Estimation note: Both models for women and men are estimated using the Mincerian regression equation (with OLS). The degree of gender segregation is not included because it is highly correlated with the grouping variable of gender sectoral dominance. The shaded areas are the 95% confidence intervals

The first result from the decomposition is that the difference in wages between men and women in both types of sectors is explained only to a small extent by differences in human capital and productivity (endowment effect). The dynamics of the endowment effect show that the gap in terms of observable characteristics has narrowed over time (the effect is close to zero), reflecting women’s increased human capital levels relative to men’s (as also observed by, for example, Goldin 2014; Blau and Kahn 2017). While men and women employed in female-dominated sectors have become, on average, more similar in terms of human capital over time, the endowment effect is positive between 2010 and 2018 in male-dominated sectors, meaning that women have worse observed characteristics than men in those years.

The most relevant result is instead that the difference is mostly due to “ongoing discriminatory constraints” in the labour market towards women (coefficient effect), stemming from substantial unexplained constraints in labour market returns. (The intercept is negative; see Tables A5 and A6 in the Appendix.) The coefficient effect is positive in both gender-dominated sectors, suggesting that women should be paid more than men to prevent any sort of discrimination for reasons other than human capital characteristics and productivity.Footnote 21

The interaction effect explains little of the gender wage differential in both female- and male-dominated sectors, although we observe a “double disadvantage” for women (positive interaction effect) in male-dominated sectors only before 2010.
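The three components discussed above can be illustrated with a small numerical sketch of a threefold decomposition. The form used here (endowment and coefficient effects evaluated from the women's viewpoint) is the standard textbook version and is an assumption about the paper's convention; the group means and coefficients are hypothetical and are not taken from Tables A5, A6 or 6.

```python
def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical regressor means: constant, years of education, potential experience.
mean_x_men = [1.0, 13.0, 20.0]
mean_x_women = [1.0, 13.5, 18.0]
# Hypothetical Mincer coefficients estimated separately by gender.
beta_men = [1.20, 0.080, 0.015]
beta_women = [1.10, 0.075, 0.012]

diff_x = [m - w for m, w in zip(mean_x_men, mean_x_women)]
diff_beta = [m - w for m, w in zip(beta_men, beta_women)]

endowment = dot(diff_x, beta_women)        # differences in characteristics
coefficient = dot(mean_x_women, diff_beta) # differences in returns (incl. intercept)
interaction = dot(diff_x, diff_beta)       # both differ simultaneously

# Gap in mean predicted log wages, men minus women.
raw_gap = dot(mean_x_men, beta_men) - dot(mean_x_women, beta_women)

print(f"endowment={endowment:.4f} coefficient={coefficient:.4f} "
      f"interaction={interaction:.4f} sum={endowment + coefficient + interaction:.4f}")
print(f"raw gap={raw_gap:.4f}")
```

By construction the three components add up to the raw gap, which is what the solid line in Fig. 3 represents.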

5.2.2 Results based on human capital factors

As the KBO showed that human capital characteristics play a minor role in explaining wage differentials, the analysis in this section shows the contribution of each observable factor to wages. Table 6 reports the estimated coefficients of the Mincer wage regression by gender for all sectors (Columns 1–2) and gender-dominated sectors (Columns 3–6).

Table 6 Mincerian regression results, years 2005–2020

The estimates of the Mincer regression for segregation variables confirm the main result of the PSM. Specifically, working in female-dominated sectors is significantly negatively correlated with hourly wages for both men and women (\(-\) 0.163 and \(-\) 0.158, respectively). In addition, working in sectors with low gender sectoral segregation is significantly positively associated with higher wages for male workers in the full sample (0.027) but negatively correlated with wages for women in both female- and male-dominated sectors (\(-\) 0.012 and \(-\) 0.102, respectively). The interaction term between female-dominated sectors and low gender segregation is positive and significant for women only.

Focusing on human capital characteristics, workers with higher educational attainment earn, as expected, more than those with low education. More years of education are positively associated with wages but with a diminishing effect (the square is negative). Our calculations show that the optimal number of years in education that maximises wages is approximately 15.7 years for men as opposed to 19.5 years for women in the full sample.Footnote 22 Therefore, women are expected to stay in education for more years than men, who need only a degree to earn their optimal wage. We obtained a similar number of years of education in female-dominated sectors (18 for women and 15.9 years for men), while the difference is less pronounced in male-dominated sectors (16.5 for women and 15.6 for men). Potential working experience has significant diminishing returns, and receiving training is significantly associated with an increase in the hourly wage, especially in male-dominated sectors.
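The wage-maximising number of years of education follows from the quadratic specification: if log wages depend on education through b1·educ + b2·educ² with b2 < 0, the maximum is reached at −b1 / (2·b2). The sketch below illustrates this calculation, assuming this is how the figures above were obtained; the coefficients are hypothetical values chosen only for illustration and are not the estimates in Table 6.

```python
def wage_maximising_years(beta_linear, beta_squared):
    """Turning point of b1*educ + b2*educ**2, valid only for a concave profile."""
    if beta_squared >= 0:
        raise ValueError("The squared term must be negative for a maximum to exist.")
    return -beta_linear / (2 * beta_squared)


# Hypothetical coefficients on years of education and its square.
example_linear = 0.157
example_squared = -0.005

optimal_years = wage_maximising_years(example_linear, example_squared)
print(f"Wage-maximising years of education: {optimal_years:.1f}")
```

A smaller squared term in absolute value (flatter diminishing returns) pushes the turning point to the right, which is one way a higher optimum for women than for men can arise.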

Fig. 4

CDFs of predicted wages, by gender and sectoral dominance. Note: The solid line is for men working in male-dominated sectors, the short-dashed line is for women employed in male-dominated sectors, the long-dashed line is for men in female-dominated sectors, and the dash-dot line is for women in female-dominated sectors. Left: Predicted wages are calculated after estimating the coefficients of the Mincerian wage regression, reported in Table 6. Right: Predicted wages are calculated using the estimated coefficients from the Mincerian regression of men working in male-dominated sectors. Predicted wages in the counterfactual exercise are a precise measure of individual earnings potential (Gould and Moav, 2016; Borjas et al., 2019)

Fig. 5

CDFs of residual wages, by gender and sectoral dominance. Note: The solid line is for men working in male-dominated sectors, the short-dashed line is for women employed in male-dominated sectors, the long-dashed line is for men in female-dominated sectors, and the dash-dot line is for women in female-dominated sectors. Left: Residual wages are calculated after estimating the coefficients of the Mincerian wage regression, reported in Table 6. Right: Residual wages are calculated using the estimated coefficients from the Mincerian regression of men working in male-dominated sectors. Residuals from a Mincerian regression calculated in this way capture the part of earnings that is uncorrelated with observed skills (Parey et al., 2017)

For socio-demographic and job characteristics, being a non-UK native is significantly associated with lower wages. However, the reduction is, on average, larger in absolute terms for EEA than for non-EEA workers, except in female-dominated sectors. The presence of dependent children penalises women’s wages but not men’s, independently of the sector. Further, working in the public rather than in the private sector is associated with higher wages for women than men. However, the coefficients are non-significant in male-dominated sectors. This suggests that the private sector pays more in male-dominated sectors while the public sector offers better remuneration in female-dominated sectors.

5.2.3 Results based on predicted and residual wages

This section discusses empirical evidence on the differences in the selection of workers in male- and female-dominated sectors in terms of observable (predicted wages) and unobservable (residual wages) characteristics. Figures 4 and 5, respectively, display the CDFs of potential and residual wages for the four subgroups: men in male-dominated sectors (ml, ml-dom), women in male-dominated sectors (fml, ml-dom), men in female-dominated sectors (ml, fml-dom) and women in female-dominated sectors (fml, fml-dom). The graphs on the left show actual values, calculated using the estimated coefficients for each subgroup from Table 6. The graphs on the right display counterfactual values calculated with the estimated coefficients of men working in male-dominated sectors.
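The difference between the actual and counterfactual values can be sketched as follows: actual predicted wages use each subgroup's own coefficients, counterfactual ones use the coefficients of men in male-dominated sectors, and residual wages are the observed log wage minus the corresponding prediction. All coefficients, regressors (constant, years of education, potential experience) and wages below are hypothetical and far simpler than the specification in Table 6.

```python
def predict(x, beta):
    """Predicted log wage as the inner product of regressors and coefficients."""
    return sum(xi * bi for xi, bi in zip(x, beta))


REFERENCE = ("men", "male-dominated")  # coefficients used in the counterfactual

# Hypothetical subgroup coefficients: constant, education, experience.
betas = {
    ("men", "male-dominated"): [1.20, 0.080, 0.015],
    ("women", "male-dominated"): [1.15, 0.078, 0.013],
    ("men", "female-dominated"): [1.05, 0.076, 0.012],
    ("women", "female-dominated"): [1.00, 0.074, 0.011],
}

# Hypothetical workers with observed log hourly wages.
workers = [
    {"group": ("men", "male-dominated"), "x": [1.0, 13.0, 20.0], "log_wage": 2.60},
    {"group": ("women", "female-dominated"), "x": [1.0, 14.0, 15.0], "log_wage": 2.25},
]

results = []
for worker in workers:
    actual = predict(worker["x"], betas[worker["group"]])
    counterfactual = predict(worker["x"], betas[REFERENCE])
    results.append({
        "group": worker["group"],
        "predicted_actual": actual,
        "predicted_counterfactual": counterfactual,
        "residual_actual": worker["log_wage"] - actual,
        "residual_counterfactual": worker["log_wage"] - counterfactual,
    })

for row in results:
    print(row)
```

For the reference group the actual and counterfactual values coincide by construction, so its CDF is the same in the left and right panels.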

The key result from the left graph in Fig. 4 is that there is a penalty in potential wages associated with working in a female-dominated sector. In fact, women in female-dominated sectors always have lower predicted wages than all other workers (their CDF always lies to the left). For low levels of potential wages, men employed in female-dominated sectors earn much less than women in male-dominated sectors.

If workers had the potential wages of men in male-dominated sectors, wage differentials of men and women across female- and male-dominated sectors would be smaller (Fig. 4 on the right). However, women in female-dominated sectors would always be paid less than all other workers. Men in female-dominated sectors would still be penalised compared to workers in male-dominated sectors, but only in low-paid jobs. Moving towards the top of the distribution, however, the gap in terms of potential counterfactual wages between women and men increases, meaning that women would always earn less than men. These findings contrast with those of Roos (1981) and Johnson and Solon (1984), who always find a more pronounced wage differential for men than women in female-dominated environments.

When we look at the residual wages in Fig. 5, the results highlight that differences in wages in high-paid jobs cannot be attributed to acquired skills or accumulated human capital only. Specifically, the CDF of women in female-dominated sectors (left graph) lies to the right of the other curves for low residual wages (positive selection at the bottom of the distribution) and to their left for high values (negative selection at the top). This means that these women earn more in low-paid jobs but much less in high-paid jobs than the other workers for reasons other than their skills and human capital.

In the counterfactual exercise (right graph), all curves would shift to the left of the CDF of male workers in male-dominated sectors, meaning that all workers would be negatively selected with respect to the former. At the bottom of the distribution, both women and men in female-dominated sectors would be penalised in terms of residual wages due to unobserved characteristics (the two CDFs overlap and lie to the left of the other two). However, as we move up the distribution, the CDFs diverge, and women in female-dominated sectors are more negatively selected (lying further to the left) than their male counterparts and other subgroups.

The non-parametric K–S test always rejects the null hypothesis of equality of distributions among the four sub-groups (see Table A7 in the Appendix), confirming that the distributions of (actual and counterfactual) predicted and residual wages of men and women across sectors differ.
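The two-sample K–S statistic is the largest vertical distance between two empirical CDFs. The dependency-free sketch below computes only this statistic on hypothetical predicted log wages; it does not compute the p-value, for which a library routine such as `scipy.stats.ks_2samp` would normally be used, and it is not the code behind Table A7.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The gap can only change at observed values, so evaluate both ECDFs there.
    for point in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, point) / len(a)
        cdf_b = bisect_right(b, point) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical predicted log wages for two of the four subgroups.
men_male_dominated = [2.1, 2.4, 2.6, 2.9, 3.2]
women_female_dominated = [1.8, 2.0, 2.3, 2.5, 2.7]

d_stat = ks_statistic(men_male_dominated, women_female_dominated)
print(f"K-S statistic: {d_stat:.2f}")
```

A statistic of zero means the two empirical CDFs coincide, while a value of one means the samples do not overlap at all; the test rejects equality of distributions when the statistic is large relative to the sample sizes.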

6 Conclusion

This paper studied how gender sectoral segregation relates to employment contracts (i.e. permanent jobs, part-time jobs, working hours, remote work) and hourly wages in the UK between 2005 and 2020. We further analysed the extent to which wages differ in female- and male-dominated sectors by looking at both observable and unobservable characteristics. Our empirical analysis suggested that the persistent imbalance in the shares of men or women in some sectors contributes to explaining the differences in employment contracts and wages.

We first found that female-dominated sectors reflect contractual characteristics typical of women. In other words, working in female-dominated sectors is associated with a greater reliance on part-time contracts, fewer hours and less working from home, even controlling for the occupational composition. In addition, female-dominated sectors seem to pay, in general, less for any worker regardless of their gender. The penalty for men working in female-dominated sectors is even larger than for women.

Second, women working in female- and male-dominated sectors are paid less not because of differences in human capital and productivity but rather because of the existence of persistent “discriminatory constraints”, such as barriers in the labour market for women due to the effects of discrimination and unobserved differences in productivity and tastes. This means that women have observable attributes similar to men regarding accumulated human capital, and without these discriminatory barriers, wage differentials between women and men within male- and female-dominated sectors would be lower.

Third, female-dominated sectors are not as rewarding as male-dominated sectors in terms of predicted and residual wages. While women in female-dominated sectors are always worse off than all other workers, men in female-dominated sectors are disadvantaged in low-paid jobs only. In addition, actual and counterfactual results for residual wages have documented the negative selection of women in female-dominated sectors with respect to all other workers, especially at the top of the wage distribution. The use of predicted and residual wages allows us to address the issue of selection based on unobservables that arises from the use of the PSM and Mincer regression, where the former matches workers with similar observed characteristics and the latter holds constant observed individual factors associated with wages.

This analysis has policy implications. Gender segregation in the labour market may be responsible for causing more challenges for women than their male counterparts regarding labour participation, access to jobs and career opportunities. This gap could potentially widen post-pandemic. Our findings can provide policy-makers with empirical evidence supporting appropriate reforms favouring vulnerable categories of workers (e.g. women, mothers and immigrants) and policies designed to sustain long-run economic growth, especially as the UK is facing new challenges (e.g. the pandemic and Brexit). Future avenues of research could focus on gender segregation into sectors that have seen a rise in the use of atypical work arrangements, e.g. zero-hour contracts and casual work.