1 Introduction

Girls are underrepresented in STEM subjects at all educational levels (Osborne et al. 2003; Charles and Grusky 2004). Field of education directs occupational paths (Altonji et al. 2012), and STEM education often leads to higher wages in the labor market (Kirkeøen et al. 2016). Policy makers have therefore been concerned about how to increase female participation in these subjects. The low participation rates among girls are somewhat puzzling as girls tend to perform as well as or better than boys in STEM subjects in lower grades.Footnote 1 A possible explanation is that educational choices are also guided by social mechanisms connected to gender. We use Norwegian data to study how the gender composition of the learning environment affects educational choices of boys and girls. More specifically, we investigate whether the gender composition of students’ peers in lower secondary school (grades 8–10, age 14–16) influences educational choices in upper secondary school (grades 11–13, age 16–19). We also analyze the importance of two potentially significant mechanisms: the role of school performance (measured by grades in lower secondary school) and the role of social mechanisms (measured by pupils’ stated perceptions of discrimination and competitive behavior).

Norway scores high on most gender equality indices.Footnote 2 However, it is also among the countries with the largest STEM gender gaps in education (Stoet and Geary 2018). Only one-third of all STEM students in Norwegian universities are female, and the gender gap is even larger in the workforce, where women constitute merely 23% of the working population with a tertiary STEM degree. At the same time, Norwegian girls score on average better than boys in STEM subjects in lower and upper secondary school.Footnote 3 Norway therefore provides a useful context to study the role of peers and social mechanisms for educational choices.

A small body of literature, which has investigated the effect of peers’ gender on educational choices in a variety of contexts, yields mixed results. Schneeweis and Zweimüller (2012) use data from low track primary schools in Austria and find that having a higher share of female peers in primary school increases the likelihood that girls will choose a male-dominated, vocational school for secondary education. Anelli and Peri (2019) use data from college preparatory high schools in Milan, Italy. They show that the probability of male students enrolling in predominantly male majors increases when attending a \(>90\)% male class. In contrast, Brenøe and Zölitz (2018) use data from high school math tracks in Denmark and find that a higher share of female high school peers lowers women’s probability of enrolling in STEM studies and raises men’s STEM enrollment. Similarly, Zölitz and Feld (2017) use data from the last year of a bachelor’s program in business school in Maastricht and find that female (male) students have a lower (higher) probability of choosing a male-dominated major if the female share is higher.

Our research contributes to the literature by presenting a comprehensive study of the entire population of youth completing lower secondary school in Norway in the period 2003–2008. We focus on the effect of the share of female peers for choices in upper secondary school—the first possibility students have to make educational choices in Norway. The rich, individual-level registry data enable detailed analysis of choices of tracks and subjects in upper secondary school. Furthermore, we provide evidence on mediating mechanisms behind gender peer effects on educational choices. For this purpose, we use both registry data, to measure the role of school performance, and survey data from The National Pupils’ Survey (“Elevundersøkelsen”), to measure the role of social mechanisms.

To identify the effect of peer gender composition, we exploit the idiosyncratic variation in female share across cohorts within Norwegian lower secondary schools, measured at grade 10 when students are 16 years old. In Norway, the peers at this level are determined by institutional rules and regulations. Completing lower secondary school is compulsory; every student follows the same track, and school districts cover contiguous geographical areas. The gender composition of a grade is thus the product of random variations in how many boys and girls were born in a school district in a given year. Because a large portion of Norway is sparsely populated, the number of students in a cohort is often small enough to provide substantial variation in the female share that we use for identification.

Our results show that the peer gender composition in lower secondary school affects both boys’ and girls’ educational choices in upper secondary school. A higher female share increases the probability that both girls and boys will choose STEM courses, and lowers the probability that they will choose language courses (only significantly for girls). A higher female share also increases the probability of choosing a vocational track instead of an academic track (only significantly for boys). The direction of effects is therefore similar for boys and girls: girls choose more “female typical” (that is, less STEM and more language) when they are fewer; boys choose more “male typical” (that is, more STEM and more vocational) when they are fewer. There is no significant difference in the size of the coefficients between girls and boys, which means that both sexes react equally strongly to the gender composition of their peer group.

We find that school performance provides a probable mechanism for this effect: a higher female share improves girls’ and boys’ performance in math relative to language—which is consistent with the larger probability that they later choose STEM over language. However, improved relative performance in math may not be the only channel for the effect of the female share on educational choices. We find suggestive evidence that social mechanisms may also contribute to the impact of female share. For girls, less gender discrimination when the share of female peers is high seems to be a potential mechanism behind the effect of the peer gender composition on educational choices. For boys, an important mechanism appears to be an increased willingness to compete when the female share rises.

The paper is organized as follows. The next section presents our hypotheses about potential mechanisms by which gender composition in school could affect educational choices. Section 3 gives a brief description of the Norwegian educational system. The identification strategy is explained in Sect. 4, together with a presentation of the data, variables and sample. Section 5 presents the results, while possible mechanisms are discussed in Sect. 6. Analyses of robustness are presented in Sect. 7, and conclusions are provided in Sect. 8.

2 The link between gender composition and educational choices

The gender composition of students’ peers may affect educational choices in general and STEM participation in particular through many channels. First, the female share may influence students’ learning and performance—and thereby choices—both through the quality of peers and study partners and/or through changes in the classroom learning environment. Second, the female share may influence choices through social mechanisms, e.g., peer pressure to follow gender stereotypical behavior, confidence in own abilities, and/or willingness to compete. Our study focuses on three main channels: school performance, gender stereotypes and competitiveness.

Our first hypothesis is that the gender composition affects educational choices in part through its effect on school performance. Students are more likely to pursue a subject in which they perform well (Ost 2010; Rask and Tiefenthaler 2008), and the female share has been shown to affect school performance (e.g., Hoxby 2000; Whitmore 2005; Lavy and Schlosser 2011) and dropout rates (Anil et al. 2016). Most studies find that when the female share in school increases, both girls and boys perform better in general.Footnote 4 In a structural approach, Graham et al. (2010) analyze the effects of reallocating individuals across social groups in the presence of social spillovers. They illustrate their approach by studying the effects of sex segregation in classrooms on mathematics achievement. Their exercise on US data suggests that average math achievement will be maximized when approximately two-thirds of classrooms are 40% girls and 60% boys, and the remaining one-third of classrooms 70% girls and 30% boys. We investigate the role of school performance as a mediating mechanism by estimating the effect of the female share on grades in typical STEM subjects, e.g., math and science, as well as typical non-STEM subjects, e.g., language. If performance and choices are affected in the same direction (for example, students become relatively better at math and science and also choose more STEM), we interpret this as suggestive of performance being important for choices.

Our second hypothesis is that the gender composition of students’ peers may affect educational choices through less measurable factors like gender identity (Akerlof and Kranton 2000) and gender stereotypes. Developmental psychologists identify a period of strong adherence to gender stereotypes in early puberty (gender intensification) (Hill and Lynch 1983; Galambos et al. 1990). Pressure to follow gender stereotypes might influence both students’ choice of, and performance in, subjects that historically have been branded as typical “male subjects” and “female subjects.” Gender-stereotypical behavior has also been shown to depend on the peer gender composition in school: women at single-sex colleges have a higher physical self-esteem (Kessels and Hannover 2008), disproportionately enter prestigious, male-dominated fields (Tidball 1985, 1986) and more often switch to male-dominated majors during their studies (Solnick 1995). A larger share of girls in the classroom may lessen the importance of “femininity” in teens’ interactions with fellow students. In turn, this can foster the construction of a self-concept of their abilities and interests, leading them to conform less strongly to “gender appropriate” roles.

Our third hypothesis is that the peer gender composition affects educational choices partly through competitiveness. Many studies have found that women, on average, prefer less competition and perform worse in competitive settings (Niederle and Vesterlund 2007; Cai et al. 2018). Competitiveness could therefore influence the choice of study tracks if some subjects are viewed as more competitive than others. Fischer (2017) finds evidence that women who are enrolled in a class with higher ability peers are less likely to graduate with a STEM degree, while men’s STEM persistence is unaffected. Buser et al. (2014) show that students in upper secondary school do consider some study tracks to be more competitive than others and that differences in girls’ and boys’ competition preferences can explain 20% of the gender differences in track selection. The peer gender composition might also affect competitive behavior. Gneezy et al. (2003) and Niederle and Vesterlund (2010) demonstrate that, in an experimental setting, when women compete against men, they perform worse than the men even if they perform similarly in non-competitive environments. When women compete against other women, however, their performance might improve. Booth and Nolen (2012) report evidence that girls who attend single-sex schools make competitive choices similar to those of boys.

For evidence on gender-stereotypical behavior and competitiveness as mediating mechanisms, we rely on survey evidence from The National Pupils’ Survey, linked to the registry data. This annual, nationwide survey elicits student opinions about learning and well-being at school. A question about whether the student has felt discriminated against because of his or her gender may detect perceived unequal treatment by fellow students or the teacher. Because this discrimination is based on gender, it may be related to gender stereotypes. If perceived gender discrimination is negatively correlated with the share of female peers, we interpret this as suggestive evidence that the environment acts differently toward girls when they are fewer. If adhering to social norms about gender stereotypical behavior is related to educational choices, the intensity of gender discrimination may be a mediating mechanism by which peer gender composition affects choices. The Pupils’ Survey also has questions about preferences for competition. If the peer gender composition affects the competitiveness of girls and boys and, in addition, educational choices are influenced by perceptions about how competitive different tracks are, competitiveness may be a mechanism by which the female share influences educational choices.

3 Norwegian educational system

Compulsory education in Norway consists of 7 years of primary and 3 years of lower secondary education. Since 1997, when the starting age was lowered from seven to six and the number of years of compulsory education was increased from nine to ten, compulsory school normally runs from age six to sixteen. In our analyses, the last two cohorts (2007 and 2008) are part of the new regime, while the rest are part of the old regime. Both groups typically end compulsory schooling at age sixteen. Local municipalities govern and run compulsory schools, while the national government is responsible for upper secondary education. Regarding school-starting age, parents can apply to municipalities to start school early or delay school start by 1 year on pedagogical and psychological grounds. However, Norway enforces strict compulsory school enrollment rules based on year of birth, so changes to school start age are not common. Grade retention is even rarer in Norway (Strøm 2004; Bedard and Dhuey 2006), so almost all children start and finish compulsory school at the same age.

Enrollment in compulsory school is based on place of residence (school district), and there is no ability tracking. Classes are set at the beginning of primary school (1st grade) and again at the start of lower secondary school (8th grade), with the aim of balancing socioeconomic composition across classes for the entering age cohort. Regulations forbid permanently setting or changing class composition based on ability, gender or ethnicity (Opplæringsloven §8.1). It is very uncommon to change the composition of classes after they are set in eighth grade unless the number of classes changes (Falch et al. 2017).

According to Norwegian lawFootnote 5 a student can under special circumstances be transferred to another school if the learning environment for the other students would benefit, e.g., because of misbehavior by the student being moved. However, other measures are considered before a student is relocated. Consequently, inter-school mobility of compulsory school students in Norway is low. The share of private lower secondary schools in Norway is also very low, approximately 4%, during our study period. Private schools are heavily subsidized and must follow national curriculum guidelines. Exams and methods of grading in lower secondary school are uniform across the country.

After graduating from compulsory lower secondary school, students can leave the school system or enroll in upper secondary school. More than 95% of youths in each cohort choose the latter alternative. Students must choose between an academic-oriented track (three available programs), which generally leads to admission to university or college, and a vocational track (nine available programs), which provides occupational certification. The academic track normally takes 3 years, while the vocational track usually consists of 2 years of classroom-based learning, followed by a 2-year apprenticeship. All youth have a legal right to attend at least 3 years of upper secondary school after having completed compulsory school.

4 Empirical strategy and data

4.1 Empirical model

The main estimated model is as follows:

$$\begin{aligned} Y_{isc}^{g}=\beta \,\text {Girls}_{ics}^{-i}+X_{ic}a_{1}+S_{sc}a_{2}+\gamma _{s}+\eta _{c}+\varepsilon _{isc} \end{aligned} \qquad (1)$$

where \({Y}_{isc}^{g}\) is a measure of the educational choice at level g for student i from lower secondary school s in cohort c. \(\text {Girls}_{ics}^{-i}\) is the share of females in student i’s observed school cohort c, excluding student i. This variable takes a value between 0 and 1. \(X_{ic}\) is a vector of individual characteristics. \(S_{sc}\) is a vector of lower secondary school characteristics limited to the number of students in cohort c. In addition, we control for school- and cohort-fixed effects by \(\gamma _{s}\) and \(\eta _{c}\). The primary coefficient of interest is \(\beta \), which measures the impact of the female share in school s and cohort c on the educational choice.
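To make the definition of the treatment variable concrete, the following is a minimal sketch of how a leave-one-out female share such as \(\text {Girls}_{ics}^{-i}\) could be computed from student records. The record layout (keys `school`, `cohort`, `female`), the function name and the toy data are assumptions made for illustration; they are not the registry variables or the code used in the paper.

```python
from collections import defaultdict


def leave_one_out_female_share(students):
    """Return, for each student, the share of girls among the OTHER students
    in the same school and cohort (None if the student has no peers).

    `students` is a list of dicts with hypothetical keys
    'school', 'cohort' and 'female' (bool).
    """
    girls = defaultdict(int)  # number of girls per (school, cohort)
    size = defaultdict(int)   # number of students per (school, cohort)
    for s in students:
        key = (s["school"], s["cohort"])
        girls[key] += int(s["female"])
        size[key] += 1

    shares = []
    for s in students:
        key = (s["school"], s["cohort"])
        n_peers = size[key] - 1  # exclude student i from the denominator
        if n_peers == 0:
            shares.append(None)
        else:
            # exclude student i from the numerator as well
            shares.append((girls[key] - int(s["female"])) / n_peers)
    return shares


# Toy cohort (hypothetical): two girls and three boys in one school-year cell.
toy = [
    {"school": "A", "cohort": 2003, "female": True},
    {"school": "A", "cohort": 2003, "female": True},
    {"school": "A", "cohort": 2003, "female": False},
    {"school": "A", "cohort": 2003, "female": False},
    {"school": "A", "cohort": 2003, "female": False},
]
shares = leave_one_out_female_share(toy)
```

In this toy cell each girl has one girl among her four peers, while each boy has two, so the measure differs slightly by the student's own gender even within the same cohort. This is why the student's own observation is excluded.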

In our analyses, the peer measure is from the last year of lower secondary school (10th grade), which is also the last year of compulsory education. Data on students and grades are only available for the last year because final exams are only registered at this point. The peers we observe in this year generally reflect the same group during all 3 years of lower secondary school. Following Lavy and Schlosser (2011), we measure female share at the grade level to reduce concerns of the endogeneity of composition at the class level since parents and school authorities may have some discretion in placing students in different classes within a grade. Due to the strict rules on gender balance in classes in Norway, there will be a high correlation between the gender balance in the class and gender balance at the grade level.

To interpret the results as causal, the key identifying assumption is that the variation in the female share from year to year within school is uncorrelated with other observed or unobserved factors which might affect students’ choices and outcomes. We expect this assumption to hold ex ante. The sex of a child is by nature random, and Norwegian youth must, with few exceptions, attend the school where they live.Footnote 6 The female share in a given school in a given year should therefore be exogenous to other factors that affect educational choices.

Although we believe the gender composition in schools to be exogenous, there is potential for manipulating the female share. Parents may advance or postpone school entrance for their child, and students may change schools. However, changes to school-starting age or student mobility are not in themselves a problem for our identification. They are only threats if they are related to the female share in a student’s school district/cohort. We try to address both concerns.

To address the problem of endogenous school-starting age, we employ an approach similar to Black et al. (2013), instrumenting the share of female peers in student i’s observed school cohort, \(\text {Girls}_{ics}^{-i}\), with the female share in student i’s birth year at school, \(\text {Girls}_{ibs}^{-i}\), i.e., with the female share that would have occurred if every student started on time. The first-stage relationship between these measures is very strong and highly significant because the share of students starting school in the expected year is high, and grade retention is very rare. Results from the first-stage regression are included in Table 6 in “Appendix.” To address the problem of endogenous movement between schools, we carry out several robustness checks in Sect. 7.
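The logic of the instrument can be sketched with a stripped-down, just-identified IV estimator. The sketch below is an illustration only: it works on cohort-level toy numbers, removes school-fixed effects by within-school demeaning, and omits the cohort-fixed effects, individual controls and standard errors of the actual specification. All names and data are hypothetical.

```python
from collections import defaultdict


def demean_within(values, groups):
    """Subtract the group mean from each value (within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def iv_within_school(y, x, z, school):
    """Just-identified IV with one endogenous regressor x and one instrument z,
    after removing school-fixed effects. Returns (first_stage, beta_iv)."""
    y_t = demean_within(y, school)
    x_t = demean_within(x, school)
    z_t = demean_within(z, school)
    zx = sum(a * b for a, b in zip(z_t, x_t))
    zz = sum(a * a for a in z_t)
    zy = sum(a * b for a, b in zip(z_t, y_t))
    first_stage = zx / zz  # effect of the birth-year share on the observed share
    beta_iv = zy / zx      # reduced form divided by first stage
    return first_stage, beta_iv


# Hypothetical data: two schools, three cohorts each.
school = ["A", "A", "A", "B", "B", "B"]
z = [0.40, 0.50, 0.60, 0.45, 0.55, 0.50]  # female share by birth year (instrument)
x = [0.42, 0.50, 0.58, 0.46, 0.54, 0.50]  # observed female share in school cohort
alpha = {"A": 0.1, "B": 0.3}              # school-specific levels (fixed effects)
y = [alpha[s] + 0.5 * xi for s, xi in zip(school, x)]  # outcome with true effect 0.5

first_stage, beta_iv = iv_within_school(y, x, z, school)
```

Because the toy outcome is constructed without noise, the estimator recovers the assumed effect of 0.5 exactly. In real data the estimate would also reflect sampling error, and inference would require appropriately clustered standard errors.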

Table 1 in Sect. 4.2 presents balance tests (t tests) for both the observed female share and the instrument (predicted female share) and shows that the female share is almost entirely uncorrelated with either individual, family or neighborhood characteristics. This supports the identifying assumption that the female share in a given school year is exogenous to other factors that affect educational choices.

The exclusion restriction for our instrument is that the share of female peers in student i’s birth year does not affect the educational choices of girls and boys directly, but only through the school cohort female share. This is not a testable assumption. Still, as our outcomes are choices in upper secondary school (where admission is not affected by gender), it is unlikely that the gender share in a student’s birth cohort has a direct effect.Footnote 7

Table 1 Balance test

School-fixed effects are included in all regressions to control for time-invariant, unobserved school characteristics, but it is possible that there are time trends in unobservables that we do not capture in the main specification, e.g., if changes in school quality are in some way related to changes in the female share over time. We believe this is a minor problem and show in Sect. 7 that most of our main estimates are robust to school-specific time trends in unobservables. In addition, we discuss other potential time trends and conduct robustness checks for the sensitivity of our results to such time trends. An additional concern is that we overstate the significance of our results because we test multiple outcomes (see, e.g., Ding and Lehrer 2011). Our main results are in Tables 2, 3 and 4, and we have six tests in Table 2, six tests in Table 3 and four tests in Table 4. In Sect. 7, we therefore apply a Holm–Bonferroni test (Holm 1979) for Tables 2, 3 and 4 separately and report the adjusted p values in Table 14.
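As an illustration of the multiple-testing correction mentioned above, the following sketch computes Holm–Bonferroni adjusted p values for one family of tests. The function name and the example p values are hypothetical and are not taken from Table 14.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order.

    The k-th smallest p value (k = 0, 1, ...) is multiplied by (m - k),
    capped at 1, and the adjusted values are forced to be non-decreasing
    in the order of the sorted raw p values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Hypothetical family of four tests (e.g., one table treated as one family).
raw = [0.01, 0.04, 0.03, 0.005]
adj = holm_adjust(raw)
```

Treating each table as its own family, as the text describes, means the multiplier starts at the number of tests in that table (six, six and four, respectively) rather than at the total number of tests across tables.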

4.2 Data, variables and sample

Estimations of variants of Eq. (1) are based on a comprehensive set of individual registry data collected and maintained by Statistics Norway. The starting point is individual-level data from the Norwegian National Education Database (NUDB), which contains detailed longitudinal information on all students in lower secondary school (compulsory school), upper secondary school and higher education. NUDB data are linked via unique personal identifiers to other registries containing demographic information. The data include both public and private lower secondary schools, with private schools accounting for only 4% of students.

From this database, we use data on all students who completed lower secondary school from 2003 to 2008. For all cohorts of students, we link data on individual characteristics (fixed and time-varying), family and school-related characteristics, and information on peers (share of females in lower secondary school).

The key dependent variables are measures of educational choices. We start by examining the choice of vocational versus academic track in upper secondary school. Conditional on choosing a vocational track, we include a dummy variable measuring whether the course is male dominated (70% or more male students). For the academic track, we include two binary choice variables indicating choice of elective courses: whether the student chooses a STEM elective course (mathematics, physics or IT) during any year in upper secondary school and whether the student chooses a foreign language elective course.Footnote 8 In addition, we construct two variables to measure “pure” STEM or language track decisions: “STEM-only,” taking the value of 1 if the student only chooses STEM elective courses (and no language courses) and 0 otherwise, and “language-only,” taking the value of 1 if the student only chooses language elective courses (and no STEM courses) and 0 otherwise.

In the section on mechanisms, we also include students’ grades as measures of performance. We include five measures from lower secondary school: math, science, Norwegian and English, as well as grade point average (GPA). In upper secondary school, we include grades from first-year math, which is an obligatory course. Grades range on a scale from 1 to 6, where 6 is the best score. In the estimations, we standardize the grade variables (z scores).

We use data from the annual National Pupils’ Survey to examine the potential impact of two important social mechanisms, discrimination and competitive behavior, on the classroom environment.Footnote 9 All estimations control for school-fixed effects and students’ background characteristics, as in previous models. However, the National Pupils’ Survey covers only the last two cohorts in the sample (2007 and 2008), which implies that identification through within-school variations is relatively less powerful in these samples.Footnote 10

Control variables include the individual student’s number of siblings, rank as the oldest child, immigrant status (non-western or western immigrant), and parents’ education and annual income. Parents’ education is measured using four dummy variables: 1) compulsory school; 2) upper secondary school; 3) college or university education; 4) unknown education level. Parents’ annual income is measured as the average annual total income from the period when the child was 7–16 years old. Control variables at the school level include the number of students in each cohort, in addition to the school-fixed effects. Information on teachers is unfortunately lacking in the data.

We also construct a set of neighborhood (defined by “grunnkrets,” the lowest-level regional unit in NorwayFootnote 11) characteristics to use in the balance test of the treatment variable. These are: the share of immigrants, the share with a university degree, the share working (measured by having labor income above a minimum threshold) and the share receiving welfare benefits.

Table 5 in the “Appendix” shows the mean values of the background characteristics and dependent variables of the sample. Girls and boys have, on average, almost identical individual characteristics and family backgrounds, and live in similar neighborhoods. The similarity in background characteristics reflects the randomness in whether families have a boy or girl and supports an underlying assumption of the model: the female share in a grade at any given school is random. Girls score significantly better than boys in all measured subjects. Boys choose vocational studies more often than an academic track. Those boys who do choose an academic track enroll in math and physics more frequently, despite poorer performance in these subjects.

Figure 1 shows the variation in the main variable of interest, the share of female peers in the student’s school cohort. The large majority of students are in a school cohort in which the share of girls ranges from 40 to 60%.

Table 1 shows the results from a balance test of our treatment variable, both for the observed and the predicted female share (Table 8 in “Appendix” shows the balance test for the 2007/2008 subsample), in which we regress the students’ family background variables, school characteristics (number of students in the cohort) and neighborhood characteristics on the share of female peers, \(\text {Girls}_{ibs}^{-i}\). This balance test shows that the female share is not significantly correlated with any of the students’ predetermined characteristics, with the exception of father’s years of education for boys and the immigrant share in the neighborhood for both girls and boys. These correlations may be statistical noise, or they may reflect the slight differences between families with more boys compared to more girls in the population observed in Table 5. When we estimate the model, we include controls for these characteristics in addition to school- and cohort-fixed effects to ensure that our results are not driven by relationships between these background characteristics and educational choices.Footnote 12

Fig. 1 Share of females in a school cohort. Notes: Kernel density estimate of the distribution of the key explanatory variable (the share of female peers estimated at the individual level)

5 Results

The main results are shown in Table 2. Each cell is a point estimate of the effect of the share of female peers on the outcome variable indicated in the left column: the choice of vocational track in upper secondary school, the choice of a male-dominated track among vocational track students, and the choice of STEM and foreign language courses among academic track students. As the descriptive statistics in Table 5 show, boys more often choose a vocational track in upper secondary school, and STEM courses if they choose an academic track, making these the most typical “male” choices. In the following discussion of the size of effects, we concentrate on the IV estimates, but include the OLS estimates for comparison.

Table 2 Effect of the female share on educational choices. OLS and IV estimates

For girls, having more female peers in lower secondary school has no significant effect on choosing a vocational track in upper secondary school, and the coefficient is not significantly different from the effect for boys. Among those who choose an academic track in upper secondary school, we find a positive effect of a higher share of female peers in lower secondary school on the likelihood of choosing STEM courses in upper secondary school. We find a negative effect of the female share in lower secondary school on choosing a foreign language course in upper secondary school. The negative effect on language is of the same size as the positive effect on STEM. This indicates that all of the positive effect on STEM comes from a negative effect on language and not from other courses like social science, economics, history or philosophy.

In the last two rows, we shed light on the relative difference between STEM and language: the probability of choosing “STEM-only” (and no foreign language) elective courses and the probability of choosing “language-only” (and no STEM) elective courses. The results show that the likelihood of choosing “STEM-only” courses increases with the female share, while the likelihood of choosing “language-only” courses is reduced. These effects are similar to the overall effect of choosing STEM or language for girls, implying that all of the movement from language to STEM courses is a total shift in choices away from language to STEM. The coefficient for “STEM-only” suggests that a ten percentage point increase in the female share results in a 0.51 percentage point increase in the probability of choosing STEM courses relative to language courses, or an approximately 1.6% change evaluated at the mean.Footnote 13
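The conversion from a coefficient to the effect sizes quoted above can be sketched as follows. The coefficient of 0.051 and the outcome mean are not read from Table 2 or Table 5; they are back-of-the-envelope values implied by the two figures reported in the text (0.51 percentage points and approximately 1.6%), and the function names are hypothetical.

```python
def pp_change(beta, delta_share):
    """Change in the outcome probability, in percentage points, for a change
    of `delta_share` (on the 0-1 scale) in the female share."""
    return beta * delta_share * 100


def relative_change(change_pp, outcome_mean):
    """Percent change relative to the outcome mean (mean on the 0-1 scale)."""
    return change_pp / outcome_mean


def implied_mean(change_pp, relative_pct):
    """Outcome mean (0-1 scale) implied by a percentage-point change and the
    corresponding percent change evaluated at the mean."""
    return change_pp / relative_pct


# Assumption: a coefficient of about 0.051 is implied by "0.51 percentage
# points per ten percentage point increase" in the text.
beta_stem_only = 0.051
effect_pp = pp_change(beta_stem_only, 0.10)

# Outcome mean implied by the reported 0.51 pp and 1.6% (not a table value).
mean_stem_only = implied_mean(0.51, 1.6)
effect_pct = relative_change(effect_pp, mean_stem_only)
```

Under these assumptions the implied mean is roughly 0.32, that is, about one-third of the girls in the academic track would be choosing “STEM-only” electives. The exact figure should be taken from the descriptive statistics rather than from this sketch.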

For boys, having more female peers in lower secondary school increases the probability of choosing a vocational track in upper secondary school. The coefficient suggests that a ten percentage point increase in the female share increases the probability of choosing a vocational track by 0.45 percentage points, or approximately 0.8% evaluated at the mean. The pattern with respect to choices in the academic track is quite similar to the results for girls. There is a positive, but not significant, effect of a higher share of female peers on choosing STEM, and a negative and significant effect on choosing foreign language, resulting in a positive relative effect of choosing STEM over language. The effect is larger and significant for the variable measuring the relative choice of STEM over foreign language. The coefficient for “STEM-only” suggests that a ten percentage point increase in the female share results in a 0.86 percentage point increase in the probability of choosing STEM relative to language studies, or an approximately 2% change evaluated at the mean. Because the probability of entering a vocational track is significantly affected by the female share, the results for course choices within the academic track must be interpreted with caution, as boys attending the academic track may be a selected group.

The direction of effects is similar for boys and girls: girls choose more “female typical” when they are fewer; boys choose more “male typical” when they are fewer. There is no significant difference in the size of the coefficients between girls and boys, which means that both sexes react equally strongly to the female share in their peer group.

Similar studies in the literature report findings that differ partially from our own. Schneeweis and Zweimüller (2012) is the study most similar to ours because they look at transitions from compulsory to secondary school. They find that increasing the share of girls by one standard deviation (11 percentage points) decreases the probability for girls of choosing a female school type in ninth grade by approximately 11%. Anelli and Peri (2019) find no significant effects on choice of college major for girls either in all-female or all-male classes in high school. Male students increase their probability of enrolling in predominantly male majors by approximately 13 percentage points when attending an all-male class. Brenøe and Zölitz (2018) find that a ten percentage point increase in the proportion of female high school peers lowers women’s probability of enrolling in STEM studies in college by approximately 7%. For men, a similar change in the gender composition raises STEM enrollment by approximately 2%. At the university level, Zölitz and Feld (2017) find that having a ten percentage point higher share of female peers in a given section reduces women’s probability of choosing a male-dominated major by approximately 8%, and increases their probability of majoring in female-dominated subjects by approximately 2%.

The studies above focus on different settings and somewhat selective samples, which may be one of the reasons for the diverging results. Schneeweis and Zweimüller (2012) study students in low track schools. Approximately 60% of the students choose a low track school and belong to the lower part of the ability distribution. Anelli and Peri (2019), on the other hand, study a positively selected group in college preparatory high schools. Brenøe and Zölitz (2018) also study a positively selected group, students in the math track in high school, as do Zölitz and Feld (2017), who study students in the last year of their bachelor’s program. Taken at face value, therefore, it seems that increasing the share of female peers increases STEM participation for low-ability female students (Schneeweis and Zweimüller 2012) but decreases STEM participation for high-ability female students (Brenøe and Zölitz 2018; Zölitz and Feld 2017). Our study group (all students who completed lower secondary school) is perhaps most similar to that of Schneeweis and Zweimüller (2012), which may explain why our results are most similar to theirs. The fact that our study includes more high-ability students than theirs may account for our smaller point estimates: if high-ability students react in the opposite manner from low-ability students, the mean effect would be closer to zero.

6 Possible mechanisms of gender peer effects

6.1 Performance

As described in Sect. 2, one of the primary ways the peer gender composition may influence educational choices is through an effect on performance. Much of the previous literature has also focused on the effect of female peers on performance, finding positive effects for both girls and boys of having more female peers. We also estimate the effect of the female share on school performance in our data, to determine whether the effects on performance are in the same direction as the effects on choices. We estimate the effect on final grades in math, science, Norwegian, English and the full GPA in lower secondary school (the year we measure peer gender composition) and also include math grades from the first year of upper secondary school for those who choose an academic track.

The results are reported in Table 3. For grades in lower secondary school, we find that math grades among girls increase with a higher share of female peers. This result is consistent with previous studies. The effect on the overall GPA, however, is negative, which might be explained by negative effects on Norwegian and English grades, where the latter is not significant. Having more female peers seems to shift the effort from language (“feminine”) to math (“masculine”).Footnote 14

Table 3 Effect of the female peers on performance. OLS and IV estimates

The size of the effects on the math grade is relatively small. The IV coefficient for math grades in lower secondary school is 0.075, which means that a ten percentage point increase in the share of female peers will lead to an increase in the math grade of 0.75% of a standard deviation. The standard deviation for math is 1.13, implying that a ten percentage point increase in the share of female peers increases grades by 0.008 points. Measured from the average grade score (3.55), this represents an increase of 0.24%.

The female share also has a positive effect on the math grade in the first year of upper secondary school. The IV coefficient suggests that a ten percentage point increase in the share of female peers will lead to an increase in the math grade of 1.1% of a standard deviation. The standard deviation for math in the first year of upper secondary school is 1.29, implying that a ten percentage point increase in the share of female peers increases math grades by 0.014 points. Measured from the average math grade score for girls (3.31), this represents an increase of 0.42%.Footnote 15 The fact that we still find positive effects on math grades in the first year of upper secondary school, when students have entered a new school and the peer group has changed, suggests that more female peers have a persistent effect on performance, not just on the relative grade in the class.Footnote 16
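The conversions in the two preceding paragraphs, from an IV coefficient to standard deviation units, grade points and percent of the mean grade, can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `grade_effect` is hypothetical, and the upper secondary coefficient of 0.11 is inferred from the reported effect of 1.1% of a standard deviation rather than quoted from Table 3.

```python
def grade_effect(iv_coef, sd_points, mean_grade, delta_share=0.10):
    """Convert an IV coefficient (in standard deviations per unit change in
    the female share) into SD units, grade points and percent of the mean."""
    effect_sd = iv_coef * delta_share          # effect in standard deviations
    effect_points = effect_sd * sd_points      # effect in grade points
    effect_pct_of_mean = 100 * effect_points / mean_grade
    return effect_sd, effect_points, effect_pct_of_mean


# Lower secondary math: coefficient 0.075, SD 1.13, mean grade 3.55 (from the text).
lower = grade_effect(0.075, 1.13, 3.55)

# Upper secondary math: SD 1.29 and mean 3.31 are from the text; the coefficient
# 0.11 is an assumption implied by the reported 1.1% of a standard deviation.
upper = grade_effect(0.11, 1.29, 3.31)

for label, (sd, points, pct) in (("lower", lower), ("upper", upper)):
    print(f"{label}: {100 * sd:.2f}% of an SD, {points:.4f} points, {pct:.2f}% of mean")
```

Because the text rounds intermediate values (for example, 0.014 grade points), the percentages computed from unrounded values can differ from the reported ones in the second decimal.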

A higher share of female peers negatively influences boys’ grades, both in general (GPA) and specifically in lower secondary language courses (Norwegian and English). This is in contrast to much of the literature, which finds that a higher female share is beneficial for both boys and girls. However, this finding can explain why previous research on Norwegian data (Black et al. 2013) finds that a higher share of female peers negatively affects boys’ long-term outcomes.

Overall, a higher female share improves girls’ relative performance in math over language courses. As performance is an important predictor of later choices, the changes in relative grades are consistent with the effects we saw on choices in the previous section: a higher female share leads girls to choose more STEM courses and fewer language courses. Coefficient size is, however, small compared to the larger effects on choices. This may indicate that mechanisms other than performance could be involved.

6.2 Gender stereotypes and competition

In this section, we investigate the role of gender stereotypes and competition for educational choices. As described in Sect. 2, one reason students make gendered educational choices is that they want to adhere to gender stereotypes. Another potential explanation is that girls, more often than boys, shy away from subjects/educational tracks that they view as competitive. In this section, we estimate the effect of the peer gender composition on an individual’s perception of experiencing gender discrimination (which is a form of unequal treatment based on gender stereotypes) and on the individual’s willingness to compete. If the peer gender composition has an effect on these measures, they may be mediating mechanisms through which the female share affects educational choices. We estimate them in the same way as the effects of female peers on grades and choices, and they are as such valid estimates of the effect of the share of females on the average responses of boys and girls to the survey.

Table 4 shows the separate effects for girls and boys of female share on our two measures of classroom environment: gender discrimination and willingness to compete. Our results show that explanations related to gender stereotypes are a potential mechanism behind the effect of peer gender composition on educational choices. A higher share of own-gender peers is associated with less gender discrimination (“Have you been discriminated because of your gender?”) and less unjust treatment by students (“Have you been treated unjustly by your classmates?”). We therefore find that the gender composition affects how classmates treat each other, including treatment that is explicitly based on gender. Girls and boys more often feel discriminated against because of their gender when they are in the minority. This may involve different expectations from the teacher and fellow students about how a girl or a boy performs in different subjects or how they behave in class.

Table 4 Effect of the female share on discrimination and competitive behavior

In Sect. 2, we discussed literature reporting that girls’ willingness to compete increases when they compete with other girls. One might then expect that, when the female share increases, girls are more willing to compete. However, the two available measures (“I try to do better than the other students,” “I want to be best in something”) suggest no such relationship for girls; some of the coefficients even go in opposite directions. Boys, however, become more competitive when the female share increases, consistent with the theories that boys (and girls) take on more “masculine” (“feminine”) traits when they are in the minority.

6.3 The relation between performance, discrimination and competitive behavior and educational choices

In this section, we explore how performance and the two measures of classroom environment are related to the choice outcomes. We regress educational choices on performance and on answers to the survey, controlling for the same characteristics as in the previous IV regressions. The results can only be interpreted as correlations, but they indicate the direction of the correlation between classroom environment measures and choices, and to what degree these measures may be candidates for mediating mechanisms. Results are presented in Table 10 in the “Appendix.”

The top panel shows how performance is correlated with educational choices. As expected, we generally find a positive correlation between performance and the likelihood of choosing the same type of subject in upper secondary school. For example, students who do well in mathematics in lower secondary school are more likely to pursue STEM subjects in upper secondary school, and more likely to pursue STEM instead of foreign language. These results apply to both boys and girls. The results regarding competition and discrimination reveal some interesting patterns that lend support to the hypothesis that some of the effect of the peer gender composition may operate through its impact on gender discrimination and/or willingness to compete. For girls, we find a negative correlation between gender discrimination and the choice of STEM courses in upper secondary school, while there is a positive correlation between gender discrimination and the choice of language courses. This indicates that gender discrimination is indeed correlated with gender-stereotypical educational choices and may be a mediating mechanism.

For boys, we find a positive correlation between stated willingness to compete and choice of STEM courses in upper secondary school, while the correlation with choice of language courses is negative. This suggests that one of the mechanisms behind the impact of a higher female share on boys’ educational choices is that boys are more willing to compete.

These results are primarily suggestive of areas for future research. Although we find some evidence that gender composition affects the classroom environment, gender discrimination and willingness to compete, there are still many factors we do not capture in our data, e.g., questions related to self-concept, self-image and gender norms.

7 Robustness

In this section, we carry out several robustness checks to test whether our main results for grades and choices are sensitive to time trends in unobservables, to school switching and to school size.

7.1 Time trends in unobservables

There is a potential concern that trends in the female share within a school might be related to unobservable characteristics that could influence school choices. One such trend might be a trend in preferences for having boys rather than girls. Preferences for boys over girls have been shown to influence the stopping rules of families (Dahl and Moretti 2008) and rates of elective abortion, especially in some Asian countries (Sen 1992), and therefore could affect the gender balance at the aggregate level. In Norway, Lillehagen and Lyngstad (2014) find differences in sex preferences in the immigrant and native populations which result in different fertility behaviors. In our Table 5, we also find a disproportionate number of boys in non-western immigrant families (although the difference is small). If the composition of families in a school district changes over time, preferences for girls or boys might also change, leading some schools to have a higher female share compared to others. The differences in fertility behavior across immigrant and native groups are, however, so small that we believe this to be a minor problem.Footnote 17

We undertake four robustness checks to test this concern. First, we estimate the model, including a moving average of the background characteristics of the student’s peers (similar to Black et al. 2013). This approach allows us to control for changes in observed school quality over time. For each year, we regress the outcome of a student in that year on the average characteristics over that year and the preceding and succeeding years, i.e., we use a 3-year moving average. Second, we estimate a model controlling for school-specific, linear time trends. This allows us to control for time trends also in unobservable characteristics at the school level. Third, we follow Schneeweis and Zweimüller (2012) and estimate the model excluding schools with a significant time trend in the share of female peers.Footnote 18 The results are reported in Table 11 in the “Appendix”. Finally, in columns 1 and 2 in Table 12 in the “Appendix” we present results using an alternative instrument. The alternative instrument is constructed from the original instrument, the share of female peers in student i’s birth year, but using only the residual idiosyncratic variation after having regressed the instrument on school-specific time trends.
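Two of the constructions described above, the 3-year moving average and the residual variation left after removing school-specific time trends, can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code behind Tables 11 and 12: the function names and input layout are hypothetical, the trend is assumed to be linear, and the treatment of the first and last years in the moving average (using whichever neighboring years exist) is an assumption.

```python
from collections import defaultdict


def moving_average_3yr(values_by_year):
    """Centered 3-year moving average of a school-level characteristic.
    Assumption: edge years average over the neighboring years that exist."""
    averaged = {}
    for year in values_by_year:
        window = [values_by_year[y] for y in (year - 1, year, year + 1)
                  if y in values_by_year]
        averaged[year] = sum(window) / len(window)
    return averaged


def detrend_by_school(rows):
    """rows: iterable of (school_id, year, female_share).
    Returns {(school_id, year): residual} after removing a school-specific
    linear time trend fitted by ordinary least squares within each school."""
    by_school = defaultdict(list)
    for school, year, share in rows:
        by_school[school].append((year, share))

    residuals = {}
    for school, obs in by_school.items():
        n = len(obs)
        mean_t = sum(t for t, _ in obs) / n
        mean_s = sum(s for _, s in obs) / n
        var_t = sum((t - mean_t) ** 2 for t, _ in obs)
        cov = sum((t - mean_t) * (s - mean_s) for t, s in obs)
        slope = cov / var_t if var_t > 0 else 0.0  # single-year schools: no trend
        for t, s in obs:
            fitted = mean_s + slope * (t - mean_t)
            residuals[(school, t)] = s - fitted
    return residuals


# Made-up data: school "A" follows an exact linear trend, school "B" does not.
rows = [("A", 2000, 0.40), ("A", 2001, 0.45), ("A", 2002, 0.50), ("A", 2003, 0.55),
        ("B", 2000, 0.50), ("B", 2001, 0.60), ("B", 2002, 0.50)]
residual_share = detrend_by_school(rows)
smoothed = moving_average_3yr({2000: 1.0, 2001: 2.0, 2002: 6.0})
```

In this toy example, the residuals for school "A" are zero because its female share follows the trend exactly, so only school "B" contributes idiosyncratic variation to the detrended instrument.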

The results in Table 11 are quite similar to those of the baseline model in Table 2. Some coefficients, however, are reduced in size and/or are no longer significant. For girls, we no longer find any significant effect of the female share on math grades in lower secondary school. The estimates are still positive, but somewhat smaller and less precise. The effect of the share of female peers on the math grade in the first year in upper secondary school is still significant and stable across specifications.Footnote 19 In the analysis using the alternative instrument (Table 12), we find very similar results to the baseline specification. Overall, the results from moving averages and excluding time trends are similar in the direction of the effects, indicating that trends in the female share in schools over time are not important for the effects identified in the baseline model in Table 2.

7.2 Switching schools and school size

Although there is little possibility of changing classes in the Norwegian school system, there may still be a problem if there is selective switching of schools that differs between boys and girls. The female share that we measure in tenth grade may then be the result of previous school switching during lower secondary school (8th–10th grade). We undertake several robustness checks to test this concern.

First, if there had been endogenous sorting of, for example, boys from low socioeconomic backgrounds into particular schools, the balance test in Table 1 would have indicated that boys with less educated parents were disproportionately in schools with a lower female share. There is little evidence of this in Table 1, although there is a significant relationship between father’s years of education and the female share in the boys’ sample, and between the share of immigrants in the neighborhood and the female share for both boys and girls. We control for these characteristics in addition to school-fixed effects and cohort-fixed effects, which should take out any sorting related to these variables in the estimations.

We also check whether there is any difference in background characteristics of girls and boys who move between municipalities, and whether there is any difference between girls and boys who go to a different school than their residential address would suggest. Of the approximately 320,000 students in the sample, only 4.7% moved to another municipality during lower secondary school. There are no indications that boys move more frequently than girls: 50.3% of movers are boys, and 49.7% are girls. Those who move exhibit no gender differences in socioeconomic background, beyond what is reflected in the total population in Table 5. Therefore, there does not seem to be selective moving of students between municipalities related to children’s sex.

We do not have panel information on residential address within the municipality, so we cannot directly analyze whether students change schools during lower secondary school. However, for the last year of lower secondary school, we have detailed information about students’ neighborhoods within the municipality (“grunnkrets”) and unique school identifiers. Based on this information, we calculate the share of students who attend a lower secondary school other than the school in which the largest share of students in the neighborhood is enrolled. This calculation can serve as an indicator of the share of students who opt out and choose another school. Similarly to the families who move across municipalities, the female share in these two types of schools is almost identical, as are the socioeconomic background variables. Therefore, there is also no evidence of gender-specific selective moving between schools within municipalities. We nevertheless estimate the model, using the same approach as with endogenous school start: instrumenting the share of female peers in student i’s school cohort with the share of female peers among the students who attend the expected school based on their residential address. As shown in columns 3 and 4 in Table 12, the estimates are robust to this alternative instrument. Indeed, for many outcomes the effects are even stronger and more precise.
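The opt-out indicator described above can be computed by finding, for each neighborhood, the school attended by the largest number of its students and then counting the students enrolled elsewhere. The sketch below assumes one record per student with a neighborhood identifier and a school identifier. The function name and input layout are hypothetical, and ties between schools are broken arbitrarily (by first occurrence), which is an assumption the text does not address.

```python
from collections import Counter, defaultdict


def opt_out_share(students):
    """students: iterable of (neighborhood_id, school_id), one per student.
    Returns (share attending a school other than the modal school of their
    neighborhood, mapping from neighborhood to its modal 'expected' school)."""
    students = list(students)
    counts = defaultdict(Counter)
    for hood, school in students:
        counts[hood][school] += 1

    # Expected school = most common school among students in the neighborhood.
    expected = {hood: c.most_common(1)[0][0] for hood, c in counts.items()}
    opted_out = sum(1 for hood, school in students if expected[hood] != school)
    return opted_out / len(students), expected


# Made-up example with two neighborhoods ("grunnkrets") and three schools.
example = [("g1", "A"), ("g1", "A"), ("g1", "B"),
           ("g2", "B"), ("g2", "B"), ("g2", "B"), ("g2", "C"), ("g2", "A")]
share, expected_school = opt_out_share(example)
```

Here three of the eight students attend a school other than their neighborhood's modal school, so the opt-out share is 0.375.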

We also test the sensitivity of the results to the inclusion of small schools in our sample. There is a degree of grade mixing in small schools in Norway (Leuven and Rønning 2014) which might result in measurement error in the female share variable in small schools. Leuven and Rønning (2014) find that the probability of grade mixing drops sharply when there are more than 16 pupils at the grade level. We thus estimate the model on schools that have more than 16 students in a grade. We see in Table 13 in the “Appendix” that excluding small schools gives similar results to the baseline model.

7.3 Multiple testing corrections

In this section, we apply a Holm–Bonferroni test (Holm 1979) of multiple outcomes in Tables 2, 3 and 4 and present the adjusted p values in Table 14 in the “Appendix”. The results in the table are ordered according to the original p values for girls in Tables 2, 3 and 4. The tests are, however, run separately for girls and boys with their respective ordering of significance levels.
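For readers unfamiliar with the procedure, Holm–Bonferroni adjusted p values can be obtained by sorting the m original p values in ascending order, multiplying the k-th smallest by (m − k + 1), capping at one, and enforcing that adjusted values never decrease along the ordering. The sketch below is a generic, minimal implementation with made-up p values. It is not the code used to produce Table 14, and the function name `holm_adjust` is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value (k = rank + 1) is scaled by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Step-down monotonicity: adjusted values cannot decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Illustrative p values only; they are not taken from Tables 2, 3 or 4.
raw = [0.01, 0.04, 0.03, 0.005]
adj = holm_adjust(raw)
significant_5pct = [p <= 0.05 for p in adj]
```

An outcome passes the test at level α when its adjusted p value is at most α. Running the adjustment separately on the list of p values for girls and the list for boys mirrors the separate ordering described above.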

The first panel shows the original p values from Table 2 and their Holm–Bonferroni adjusted p value counterparts. For girls, only one result in the original table is significant at a 5% level (3 at the 10% level). None of these results are significant in the Holm–Bonferroni test. For boys, the original results are more significant: 3 outcomes are significant at the 5% level (or lower) and one at the 10% level. Only the initial result that was significant at the 1% level passes the Holm–Bonferroni test at a 5% level. Two more results pass the Holm–Bonferroni test at a 10% level.

The second panel shows the p values from Table 3 and their Holm–Bonferroni adjusted counterparts. For girls, three of the original estimates are significant at the 5% level (or lower), but only the effect on the Norwegian grade passes the Holm–Bonferroni test. As in Table 2, the results for boys are more robust for school performance. All three results that are significant at the 1% level pass the Holm–Bonferroni test at the 5% level or lower.

Finally, in the third panel, we test the discrimination and competitive behavior outcomes. The results that were significant at the 5% level (or lower) are still significant after the Holm–Bonferroni adjustment for both boys and girls at the 10% level. At the 5% level, however, only the results for girls are still significant.

In summary, applying the Holm–Bonferroni test to Tables 2, 3 and 4 produces mixed results, especially for girls’ results in Table 2. However, we argue that the results of the Holm–Bonferroni procedure should be interpreted with some caution. First, although the test reduces false positives, it does so at the potential expense of more false negatives. There is no consensus on which type of error is more problematic. Second, we have demanding specifications, controlling for both school- and cohort-fixed effects, potentially affecting the standard errors in the estimations. Third, it is not obvious how to group the hypotheses to be tested together, as some of them are measured at different times in the educational career. Finally, we have conducted other robustness checks in Sects. 7.1 and 7.2 to test the results against potential biases, and the results from these checks are reassuring.

8 Conclusion

In this paper, we use the idiosyncratic variation in gender composition across cohorts in Norwegian lower secondary schools to analyze the impact of the share of female peers on educational choices. We use the gender composition in lower secondary schools because peers at this level are governed by institutional rules and regulations. Completing lower secondary school is mandatory, all students follow the same track, and school districts adhere to geographical areas, producing random variation in the female share in each school cohort.

The results reveal that an increase in the share of female peers makes both girls and boys more inclined to choose relatively more STEM and fewer language classes in upper secondary school. The probability of choosing a vocational track also increases (but only significantly for boys). The similar pattern for boys and girls may suggest that they both react to the gender composition of their peers by making more “gender typical” choices when they are in the relatively smaller gender group.

The results are generally robust across different specifications controlling for time-varying school quality indicators, and for different versions of instrumenting the main variable.

Our results suggest that mediating mechanisms are related to school performance and gender stereotypes. A higher share of female peers has a positive effect on girls’ performance in math, which in turn is associated with a higher likelihood of choosing STEM-related subjects in upper secondary school. Another potential mechanism seems to be less gender discrimination felt by girls when the share of female peers is higher. For boys, on the other hand, increased willingness to compete when the share of female peers increases seems to be an important mechanism.

These results indicate that policies to improve girls’ relative performance in STEM subjects compared to language courses may have an impact on their probability of choosing more STEM courses in later education. Reducing discrimination based on gender may also increase the probability that girls choose STEM courses. Our results suggest that gendered social mechanisms may have long-term effects on educational choices and, eventually, on labor market opportunities for boys and girls.