Introduction

Gender differences in the choice of major occur in almost all industrialized countries (Barone, 2011). One frequently observed pattern is the overrepresentation of women in fields such as humanities and pedagogics, whereas men are overrepresented in science, technology, engineering and mathematics, with the proportion of one gender reaching more than 90% in some study programmes (Destatis, 2019; Liu & Zuo, 2019). These gender inequalities in the choice of major have consequences that transcend the educational system itself, as educational choices are important determinants of future employment and earnings (Gerber & Cheung, 2008; Horowitz, 2018; Jacob & Klein, 2019; Smyth, 2005). Consequently, a large body of literature on higher education has investigated the link between gender and field of study intention and choice (Bieri et al., 2016; Jonsson, 1999; Mann & Diprete, 2013; Morgan et al., 2013; Ochsenfeld, 2016; Reimer & Steinmetz, 2009; Silander et al., 2022; Xie et al., 2015), as well as the degree to which interventions can mitigate these gender differences (Barone et al., 2019; Finger et al., 2020; Scheeren et al., 2018).

This paper adds to this discussion in two ways. First, we build on psychological research that consistently documents gendered differences in risk aversion and return preferences (Niederle & Vesterlund, 2007; Niederle & Vesterlund, 2011; Paola & Gioia, 2012; Sutter et al., 2016; Sutter & Glätzle-Rützler, 2014) and argue that these differences may partly explain gendered differences in higher education choices. This argument is motivated by descriptive evidence showing a strong correlation between the risk–return profile of a certain study programme and the proportion of men to women in the programme. Study programmes with a high share of male students, such as engineering or computer science, offer the most favourable employment prospects after graduation but are also characterized by very high failure rates, while study programmes with a high share of female students often show the exact opposite pattern (for data on Germany, see Neugebauer et al., 2019).

Second, we assess whether the role of risk and return preferences permits policy interventions to mitigate gender differences. If a high (perceived) risk of failure deters female students from pursuing study programmes that offer better employment prospects, counselling that strengthens self-confidence and fosters problem-solving skills for study-related difficulties could increase enrolment in these types of study programmes, thus reducing gender disparities in the choice of major. As girls tend to underestimate their skills, particularly regarding ambitious educational paths, counselling could indirectly mitigate gender differences by correcting these gender-biased ability beliefs, which have been documented by several studies (physics, Marshman et al., 2018; mathematics, Perez-Felkner et al., 2017).

Methodologically, neither question is trivial to answer. First, study programmes differ in many respects beyond their risk–return profiles, so the observed correlation between risk–return profile and the share of male and female students may be spurious. Second, drawing causal inferences on the moderating effects of counselling may be complicated by endogenous self-selection into counselling programmes, which undermines the comparability between participants and non-participants (Imbens & Rubin, 2015). To address both problems, we conducted a survey experiment (factorial survey) embedded in a field experiment. In the first step, we recruited high school students and randomly assigned them to a treatment and a control group. The treatment group was invited to participate in a counselling workshop, while the control group was compensated through participation in a prize draw. In the second step, we conducted a factorial survey a few months after the workshop and prize draw. We asked participants to rate the attractiveness of study programmes, experimentally varying the failure rates and expected income after graduation while holding all other parameters constant. This research design enabled us to assess whether risk–return profiles exert different influences on boys versus girls and to identify the possible mitigating role of the counselling workshop.

Theory and related work

Risk, return and the role of counselling

Rational choice theory suggests that individuals will select the educational pathways promising the highest utility. Two major determinants of utility are the (perceived) benefits of a certain choice (return) and the probability of success (risk). Individuals will, therefore, opt for an educational pathway that maximizes utility by offering the most favourable risk–benefit ratio (Breen et al., 2014; Breen & Goldthorpe, 1997; Gabay-Egozi et al., 2010; Tutić, 2017). Building on and extending this framework, we argue that individual utility depends not only on objectively measured risk and returns but also on individual risk and return preferences. For example, risk-averse individuals are more likely to refrain from pursuing challenging educational paths (such as those with high failure rates), even with accurate information. As laboratory experiments in economics and psychology indicate substantially higher risk aversion among women (e.g. Niederle & Vesterlund, 2007), women may be more strongly deterred by fields of study that present a higher risk of failure. Similarly, men could be disproportionately attracted by higher returns as they tend to emphasize income and career (Busch-Heizmann, 2015; Jurczyk et al., 2019; Wolter et al., 2019).

This raises the role of counselling. We suggest that interventions could (indirectly) mitigate perceived risks by fostering students’ confidence in their abilities and discussing problem-solving strategies to address study-related difficulties. This could lower the perceived risk of failure in a certain study programme, thereby mitigating the deterring effect of higher failure rates and encouraging riskier and, possibly, more rewarding choices (we provide a detailed description of the content of the intervention in the “The intervention” section). While interventions to lower perceived risk could be expected to increase the probability of making risky choices for both boys and girls, there are reasons to expect that the effect will be larger for girls. First, as girls display lower average levels of risk affinity than boys, they may profit more from counselling due to ceiling effects at the top of the risk-affinity distribution. Second, girls might be more responsive to feedback regarding their risk evaluations, irrespective of their starting level, because they generally underestimate their competencies in various ambitious study programmes, such as those in STEM fields, compared to boys (see, e.g. physics, Marshman et al., 2018; mathematics, Perez-Felkner et al., 2017), leaving more room to correct their downward-biased risk perceptions. Therefore, boys and girls are expected to become more similar in their subjective evaluations of a given choice, which would contribute to less gendered choices of major. In this study, we focus on risk, as our counselling approach targets risk preferences and perceptions rather than return preferences. In sum, we hypothesize that:

  • (H1) Girls are more strongly guided by risk attributes while boys emphasize return characteristics when making educational choices.

  • (H2) Participation in counselling interventions mitigates the negative impact of higher risk on individual utility.

  • (H3) The effect of a counselling intervention on taking riskier choices will be stronger for girls.

Related work

At a general level, this paper contributes to several fields, including gender inequalities in higher education (Buchmann et al., 2008; Cech et al., 2011; Herd et al., 2019; Mann & Diprete, 2013; Morgan et al., 2013; Schwerter & Ilg, 2021), as well as the role of risk and return preferences or personality traits for study and career choices (Breen et al., 2014; Buser et al., 2014; Chen & Simpson, 2015; Daniel & Watermann, 2018; Finger, 2016; Sanabria & Penner, 2017; Sax et al., 2017). More specifically, we add to the policy-oriented literature that assesses the effects of interventions on gendered differences in higher education choices. Previous research in this field has mostly provided short info-treatments on actual risks and returns, seeking to correct social or gender-specific misperceptions about the risks and returns of certain educational pathways (Barone et al., 2016, 2019; Bleemer & Zafar, 2018; Callender & Melis, 2022; Ehlert et al., 2017; Evans & Boatman, 2018; Finger et al., 2020; French & Oreopoulos, 2017; Herbaut & Geven, 2020; Ruder & van Noy, 2017). While these interventions appear to reduce social inequalities with respect to socioeconomic background (for a review, see French & Oreopoulos, 2017), there is no evidence that they reduce gender inequalities. If at all, the effect appears to be stronger for boys. One reason for this finding could be that perceptions of actual risk and returns do not differ between genders, even without counselling workshops (as Barone et al., 2019, reported). Consequently, the approach of correcting gender-specific concepts may be of limited utility. At the same time, our theoretical reasoning suggests that men and women may differ more strongly in their risk and return preferences rather than their knowledge of actual risks and returns. Therefore, we extend the previous research by analysing the impact of an intervention that (indirectly) targets the perceived risk of failure.

Research design

Essentially, our research design combines a survey experiment and a field experiment. The field experiment consists of a counselling workshop, with participants randomly assigned to a treatment or a control group (randomized controlled trial, RCT), while the survey experiment consists of a factorial survey conducted after the counselling workshops. For the RCT, we recruited high school students from the area surrounding two large German cities who were between 6 and 18 months from graduating high school (i.e. attaining the German higher education entrance diploma, Abitur). During recruitment, we pointed out that participation would consist of multiple online surveys with monetary compensation (10 euros each). Moreover, we indicated that a randomly allocated subset of study participants would be invited to participate in a university guidance and counselling workshop offered by the Department for Student Services. Those who were not offered the workshop would take part in a raffle in which five prizes of 100 euros could be won in addition to the monetary compensation for the surveys. The recruiting strategy for our study was similar to the approach typically employed for such workshops, e.g. providing flyers to high school students or contacting schools to relay information on guidance workshops to their students. Therefore, the resulting sample closely mirrors the target group that would usually participate in such interventions.

After registering for the study, participants completed an online questionnaire (first wave of the survey) that mostly consisted of pre-treatment covariates. At the end of the first survey, participants were randomly assigned to either the treatment or the control group, and only members of the treatment group were offered places in a 1-day workshop. Between 3 and 6 months after the end of the first survey, participants were invited to complete a second online survey, which contained a survey experiment assessing the importance of risk and return characteristics for study programme choice. This setup enables us to analyse whether the importance of risk and return varies between genders, whether participation in the workshops has a moderating effect on the importance of risk, and whether this moderating effect is stronger for girls. We summarize the research design in Fig. 1 and outline its components in more detail in the following sections.

Fig. 1 Visualization of research design

The intervention

The intervention consisted of a 1-day counselling workshop and took place between the two survey waves. The workshop was conducted by professional counsellors from two universities and consisted of two modules offering general information as well as psychological counselling. First, students completed three exercises designed to increase their awareness of their cognitive abilities, occupational interests and personal values. For example, they received individual feedback on a self-assessment of their cognitive abilities that they had completed before the workshop. These exercises constituted the basis of the discussion about suitable majors for the students. Second, students received general information about study opportunities, such as the differences between universities and universities of applied sciences or access to reliable sources of information. More importantly in our context, the third part of the workshop consisted of psychological counselling dedicated to strategies and resources for handling problems in possible study programmes; this part involved a currently enrolled student, who provided examples of their experiences along with the difficulties they had encountered during the study programme and described strategies to handle those issues. Additional individual exercises and group work encouraged participants to think about how to handle difficulties by discussing their past experiences and future applications. The general notion conveyed by the intervention was that while studying may present several challenges and demands, students should not be deterred from pursuing their preferred study programmes, as resources and problem-solving strategies are available to help overcome difficulties. The workshop did not target a specific study programme but emphasized that participants should choose based on their preferences.

Consequently, workshop participation can alleviate the deterrent effect of a higher average failure rate because participants become aware of the resources and strategies available to them in case of difficulties. In addition, they will have encountered an experienced student who has overcome difficulties encountered during their studies, increasing the credibility of the information they receive.

The factorial survey

The factorial survey was conducted in the second survey wave, after the intervention. It was designed to vary the risk–return pattern of potential study programmes while keeping all other factors fixed. This allows us to disentangle the impact of risk and return characteristics on the attractiveness of a certain choice from possible confounders. To increase the robustness of our findings, we asked two types of questions to observe participants’ risk and return preferences: a vignette study that presented different risk and return profiles and asked participants to rate the corresponding programmes, and a choice question that asked participants to decide between two specific study programmes. We adopted these two approaches to check whether our results are sensitive to the survey design.

In the vignette study, we present participants with the following scenario and ask the following question:

"Imagine that you already know which major you wish to study. The major is offered at three universities nearby. The study programs differ with respect to dropout rates and expected income after graduation. Assume you can study your preferred study program at University (A/ B/ C). The average income after graduation is (32,000 / 43,000 / 54,000€), the failure rate is (22 / 31 / 40%).

On a scale from 0 to 10, how likely will you apply for this study program?"

By stating that the participants can study their preferred major at some university not further specified, we ensure that systematic variations in programme attractiveness reflect only the differences in risk and return outlined in the description of the study programme. This, therefore, fulfils our goal of creating exogenous variation in the risk–return pattern of a certain study programme that is not confounded by other characteristics, such as gender-specific vocational interests. Importantly, the experiment is also not confounded by the participants’ actual study intentions, because the experimental design rules out that, for example, students with more ambitious study intentions or different return preferences systematically receive higher values in their vignettes. While the effects may vary between students depending on their study intentions, the estimation of the average effect across the sample remains unaffected. At the same time, the intervention may affect study intentions; therefore, participants from the treatment group might have a slightly different set of study programmes in mind. However, the analysis relies on within-respondent variation (the fixed-effects analyses exclusively assess within-respondent variation), implying that survey participants have the same major in mind when responding to the vignettes.

Every participant responded to three vignettes. As Table 1 shows, we use three levels for each of the two dimensions, “risk” and “return”, generating a vignette universe of nine vignettes. We follow an algorithm suggested by Nguyen (2001) to assign the vignettes to three blocks. The exact blocking used in this study is presented in Table A.5. Each participant is randomly assigned to one block. As we are particularly interested in the interaction between the vignette dimensions and individual-level variables (treatment status and gender), we limited the vignettes to two dimensions with three levels each.
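The construction of the vignette universe and its division into blocks can be sketched as follows. The levels are those quoted in the vignette text; the Latin-square blocking is an assumption made purely for illustration and is neither the Nguyen (2001) algorithm nor the blocking reported in Table A.5. All names (`universe`, `blocks`, `draw_vignettes`) are hypothetical.

```python
from itertools import product
import random

# Levels as quoted in the vignette text.
FAILURE_RATES = [22, 31, 40]          # percent
INCOMES = [32_000, 43_000, 54_000]    # euros

# Full factorial of level indices: 3 x 3 = 9 vignettes.
universe = list(product(range(3), range(3)))  # (risk index, return index)

# Hypothetical blocking (assumption): a Latin square, so that every block
# contains each risk level and each return level exactly once.
blocks = {b: [(r, (r + b) % 3) for r in range(3)] for b in range(3)}


def draw_vignettes(rng):
    """Randomly assign a respondent to one block and return its three vignettes."""
    block = rng.choice(sorted(blocks))
    return [(FAILURE_RATES[r], INCOMES[m]) for r, m in blocks[block]]
```

With such a blocking, each respondent sees every failure rate and every income level once, so both dimensions vary within respondents.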

Table 1 Description of vignette levels and dimensions

As a second measure of participants’ risk and return preferences, we asked participants a simple choice question:

“Assume you can study your preferred study program at University A and University B. At University A, the failure rate is 27%, the average income after completion is 34,000€. At University B, the failure rate is 39%, the average income after completion 47,000€.

Which option do you prefer?”

The choice question also fulfils our main purpose of creating exogenous variation in the risk–return profile. However, because risk and return vary simultaneously, the two dimensions cannot be disentangled and are tested jointly. The question is therefore less relevant to the analysis of the treatment, which was not intended to affect return preferences. Nevertheless, the choice question is a reasonable method to assess the role of gender in choosing high-risk high-return vs. low-risk low-return programmes.

Data and variables

Our approaches produce different data structures. The vignette study produces a clustered dataset with vignettes clustered by person, with each observation including the corresponding rating and vignette variables (risk/return). The choice question results in a dataset with one observation per person, including a variable indicating the choice of University A or B. Although the covariates were measured before the start of the treatment and the survey experiment was conducted afterwards, most of these covariates are time-invariant, so we are de facto dealing with a cross-sectional dataset. Both datasets contain an exogenous treatment assignment indicator, an indicator for actual treatment status (the two are not identical due to non-compliance) and the covariates measured in survey wave one, i.e. before participation in the counselling workshop. These include participants’ age in months, gender, school grade in maths and German and parents’ educational background (a dummy variable indicating whether both parents hold university degrees). Moreover, we coded two dummy variables for household composition (living with both parents and having siblings) and the share of schoolmates who planned to enter university. Finally, we included an indicator for the self-reported start of gathering information on study opportunities. As this is an experiment, controlling for confounders is less relevant for the treatment effect estimations, but it may be helpful to assess how sensitive the gender effect is to the inclusion of covariates.

In total, 725 high school students registered for our study and completed the initial survey before being assigned to experimental groups. Of those, 608 students participated in the second wave, which is a response rate of more than 80%. Panel attrition was strongly reduced by inviting participants via e-mail, text message and up to 20 telephone calls. The remaining panel attrition was mostly limited to participants with invalid phone numbers or who never answered the phone. Data cleaning (e.g. due to missing information on important covariates) reduced the sample to 580 cases. Due to randomization, no selection bias towards treatment (i.e. covariate imbalance) is apparent. A summary of the participants’ descriptive statistics is provided in Table A.1. The two most remarkable findings are a strong overrepresentation of girls (about 75%) and, on average, a certain preference for high-risk high-return choices (also about 75%). The overrepresentation of girls is consistent with experiences from similar workshops, reinforcing that we are utilizing a sample that is similar to the real-world conditions of such workshops but not representative of all students. Moreover, we must account for two-sided non-compliance in the estimations (see the “Estimation, inference and robustness checks” section), although the compliance rates are rather high (almost 80% in the treatment group and more than 95% in the control group).

Estimation, inference and robustness checks

We made our estimates in two steps. First, we relied on the experimental vignette study and the choice question to describe gendered differences in risk and return preferences. In the second step, we jointly considered the results from the survey and the field experiment to assess the counselling workshop’s potential to reduce the impact of risk preferences on students’ study programme choices. This allows our estimation strategy to account for the peculiarities of the two experiments’ respective data structures.

Gender and risk/return

To analyse gendered differences in risk–return preferences, we start with the data from the vignette study. We run random-effects multilevel regressions of the ratings on the vignette variables (risk/return) and covariates, which accounts for the nesting of vignettes within respondents. To assess the differences in the importance of risk/return between genders, we additionally include cross-level interaction terms between gender and risk/return.

$$y_{iv}=\boldsymbol{x}^{\prime}\boldsymbol{\beta}+\beta_{1}\,\mathrm{female}_{i}+\beta_{2}\,\mathrm{risk}_{iv}+\beta_{3}\,\mathrm{return}_{iv}+\beta_{4}\,\mathrm{female}_{i}\times\mathrm{risk}_{iv}+\beta_{5}\,\mathrm{female}_{i}\times\mathrm{return}_{iv}+\alpha_{i}+\epsilon_{iv}$$
(1)
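To make the role of the interaction terms in Eq. 1 explicit, consider a simplified illustration in which risk is a single indicator for a higher failure-rate category (an assumption made here for exposition; the reported models use one dummy per level). Holding return, the covariates and the respondent effect fixed, the implied risk effects are

$$E[y\mid \mathrm{female}=0,\ \mathrm{risk}=1]-E[y\mid \mathrm{female}=0,\ \mathrm{risk}=0]=\beta_{2}$$

$$E[y\mid \mathrm{female}=1,\ \mathrm{risk}=1]-E[y\mid \mathrm{female}=1,\ \mathrm{risk}=0]=\beta_{2}+\beta_{4}$$

so that $\beta_{4}$ captures how much more (or less) strongly girls respond to risk than boys. The coefficient $\beta_{5}$ plays the same role for return.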

As stated in the previous section, we confirmed the robustness of our approach with the choice question. Due to the cross-sectional data structure, we simply regressed the dummy variable indicating the high-risk high-return option on gender using linear probability models (LPM) with robust standard errors. To assess the sensitivity of the effect of gender, we run additional regressions with different sets of control variables.
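As a minimal sketch of why an LPM is easy to interpret here: with a single binary regressor, the OLS slope equals the difference in the shares choosing the outcome. The data below are made up for illustration and are not the study data; `ols_slope` is a hypothetical helper, and robust standard errors are not computed.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Made-up toy data (NOT the study data): female 1 = girl, 0 = boy;
# choice 1 = high-risk high-return option, 0 = low-risk low-return option.
female = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
choice = [1, 1, 1, 1, 0, 0, 1, 1, 1, 0]

slope = ols_slope(female, choice)
share_girls = mean(c for f, c in zip(female, choice) if f == 1)
share_boys = mean(c for f, c in zip(female, choice) if f == 0)
gap = share_girls - share_boys  # identical to the LPM slope
```

Once covariates are added, the coefficient is a covariate-adjusted difference in shares rather than the raw gap.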

The moderating role of the counselling workshop

In the second part of the analysis, we focus on the moderating role that treatment participation plays in students’ risk preferences. To do so, we focus on the factorial survey, as the workshop is expected to affect risk preferences but not return preferences. Therefore, we do not expect an impact on the binary choice question, although we briefly report the results of that analysis as well. Generally speaking, the approach for analysing the moderating role is very similar to the first part, except we exchange the risk/return × gender interaction for the risk × treatment interaction (see Eq. 2).

$$y_{iv}=\boldsymbol{x}^{\prime}\boldsymbol{\beta}+\beta_{1}\,\mathrm{return}_{iv}+\beta_{2}\,\mathrm{risk}_{iv}+\beta_{3}\,\mathrm{treat}_{i}+\beta_{4}\,\mathrm{risk}_{iv}\times\mathrm{treat}_{i}+\alpha_{i}+u_{iv}$$
(2)

Here, we have to address an endogeneity issue caused by two-sided non-compliance: a certain proportion of the invited participants did not attend the workshop (about 20% of those assigned to the treatment group), while ten participants assigned to the control group obtained a place in the workshops by registering for the study multiple times (non-compliance rate 3%). Excluding these participants completely (per-protocol analysis) would have biased the results, as (apparently) particularly motivated participants from the control group would have been excluded, generating a correlation between treatment assignment and unobserved confounders (Imbens & Rubin, 2015). At the same time, actual treatment status is no longer randomized and is therefore possibly endogenous. We therefore use treatment assignment and the treatment assignment × risk interaction as instruments for actual treatment status and the treatment status × risk interaction, and estimate Eq. 2 by generalized least squares instrumental variable (GLS-IV) estimation (Imbens & Rubin, 2015). We estimated this equation for the whole sample as well as for both genders separately to analyse gendered differences in the moderating role of the treatment.
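The logic of using assignment as an instrument can be illustrated with the simplest possible case, the Wald estimator, which scales an intent-to-treat effect by the difference in take-up rates between the assigned groups. This is a simplified analogue, not the GLS-IV estimator used here (which also instruments the interaction term). The take-up rates are the approximate figures given in the text; the ITT value is a made-up placeholder, and `wald_estimate` is a hypothetical helper.

```python
def wald_estimate(itt_effect, takeup_treatment, takeup_control):
    """Scale an intent-to-treat effect by the difference in take-up rates."""
    first_stage = takeup_treatment - takeup_control
    if first_stage <= 0:
        raise ValueError("assignment must increase take-up")
    return itt_effect / first_stage


# Approximate take-up from the text: about 80% of those assigned to treatment
# attended the workshop, about 3% of the control group did.
# itt_effect is a made-up placeholder, NOT an estimate from the study.
effect_on_compliers = wald_estimate(
    itt_effect=0.385, takeup_treatment=0.80, takeup_control=0.03
)
```

Because the first stage is below one, the IV estimate is larger in absolute size than the corresponding ITT estimate.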

Robustness checks

To further substantiate the robustness of our results, we conduct two additional robustness checks. First, we conduct an intent-to-treat (ITT) analysis instead of our IV approach. This involves estimating the moderating effect of treatment assignment rather than actual treatment status to circumvent the problem of non-compliance. Second, we replace the GLS-IV regressions with FE-IV regressions to re-confirm the reliability of our research design.
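The fixed-effects idea behind the second check can be sketched with a within transformation: demeaning ratings and risk levels per respondent removes any respondent-specific level, so only within-respondent variation identifies the slope. The sketch below is plain FE without instruments, on made-up data; `within_slope` is a hypothetical helper and not the FE-IV routine used for the reported estimates.

```python
from collections import defaultdict
from statistics import mean


def within_slope(rows):
    """Fixed-effects (within) slope of rating on risk.

    rows: iterable of (respondent_id, risk, rating) tuples.
    """
    by_person = defaultdict(list)
    for pid, risk, rating in rows:
        by_person[pid].append((risk, rating))
    num = den = 0.0
    for obs in by_person.values():
        mean_risk = mean(r for r, _ in obs)
        mean_rating = mean(y for _, y in obs)
        num += sum((r - mean_risk) * (y - mean_rating) for r, y in obs)
        den += sum((r - mean_risk) ** 2 for r, _ in obs)
    return num / den


# Made-up data: rating = respondent-specific level - 0.05 * failure rate.
levels = {"a": 9.0, "b": 6.0}
rows = [
    (pid, risk, level - 0.05 * risk)
    for pid, level in levels.items()
    for risk in (22, 31, 40)
]
fe_slope = within_slope(rows)
```

In this toy example the within slope recovers the common slope regardless of how different the respondent-specific levels are.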

Most importantly, we check whether endogenous panel attrition induces a correlation between treatment assignment and confounders among participants who participated in the second survey wave. To this end, we first run a selection regression (among all participants from wave 1) of participation in wave 2 on treatment assignment and all covariates. To further confirm whether attrition leads to covariate imbalance, we conduct a multivariate balancing test among second-wave participants by regressing treatment assignment on all covariates. Finally, we include a full set of treatment × covariate interactions in the selection regression to explicitly test for different selection patterns between groups.

Results and discussion

Risk, return and gender

We start by presenting the first part of our analysis, the description of gendered differences in risk and return preferences. First, we look at descriptive statistics on the relationship between the vignette dimensions and the outcome. As Fig. 2 shows, the relationship follows the expected pattern. The left panel shows a decreasing average rating for higher failure rates. Similarly, the average rating strongly increases for higher expected income.

Fig. 2 Descriptive results for the relation of vignette variables and the outcome

We proceed with the results of the regression analyses. Table 2 summarizes the results of the factorial survey. Models 1 and 2 present regressions on the rating of the vignette variables with (Model 2) and without (Model 1) covariates, while Model 3 includes the interaction terms. In all models, the vignette variables enter as dummy variables with the lowest category as the reference category.

Table 2 Effect of risk and return by gender

Models 1 and 2 confirm that the effect of the vignette variables is significant at the 1% level. Theoretically speaking, the most interesting results are those from the interaction terms model. To facilitate interpretation, we present the coefficients in Fig. 3.

Fig. 3 Visualization of gender × vignette interactions. The corresponding regression is displayed in Table 2, model 3

The displayed coefficients show the estimated effect of moving from the lowest to the medium or highest category separately for both genders. For both dimensions, the observed gender differences are consistent with our theoretical expectations but differ in magnitude as well as statistical significance. As the left panel shows, boys and girls assign (ceteris paribus) lower ratings to study programmes with higher failure rates, but not to the same degree. While there is no difference in the medium category, the difference in the highest category suggests that girls react more strongly than boys to higher failure rates. The point estimates (see Table 2) imply a negative effect of −1.3 for boys, whereas the effect for girls is −1.7. This amounts to a relative difference of about 30%, meaning the negative effect is about 30% stronger for girls than for boys. However, the interaction effect falls short of conventional levels of statistical significance (t-value, −1.49). The opposite picture is revealed for returns, as shown in the right panel. Once again, both genders assign higher ratings to study programmes with higher expected income, with larger gender differences in the highest category. The positive effect of going from the lowest to the highest category is 4.4 for boys, but only 3.4 for girls, which means that the effect is, again, about 30% greater for boys. While the relative gender difference is comparable to that for risk, the estimated difference for returns is significant at the 1% level. Altogether, these results are in line with the hypothesis that girls are deterred more strongly by higher failure rates, whereas boys are disproportionately attracted by higher income potential in their selection of specific study programmes.
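The relative differences quoted above follow directly from the rounded point estimates. A short sketch of the arithmetic, using only the values given in the text (`relative_difference` is a hypothetical helper name):

```python
def relative_difference(larger_effect, smaller_effect):
    """Relative excess of one effect over another, based on absolute sizes."""
    return abs(larger_effect) / abs(smaller_effect) - 1


# Highest failure-rate category: girls (-1.7) vs. boys (-1.3).
risk_gap = relative_difference(-1.7, -1.3)

# Highest income category: boys (4.4) vs. girls (3.4).
return_gap = relative_difference(4.4, 3.4)
```

Both ratios come out at roughly 0.3, which is the "about 30%" reported for each dimension.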

To reaffirm the robustness of our results, we proceed with the results from the LPMs that regress the choice of the high-risk high-return option on gender plus different sets of covariates. The results are shown in Table 3. The coefficient of female expresses the difference between girls and boys in the probability of choosing the high-risk high-return option, in percentage points. In all models, the effect is significant at the 1% level and the effect size ranges between 12 and 15 percentage points. Notably, the effect does not consistently diminish when covariates are included, which makes it less likely that the gender differences are driven by omitted (unobservable) variables. As expected, treatment status does not affect the chosen programme, as the choice question constitutes a combined risk and return measure and there is no theoretical reason to believe that the treatment affects return preferences. Taken together, both parts of our experiment support our first hypothesis, that risk and return preferences vary along gender lines and may play an important role in students’ choice of major.

Table 3 Effect of gender and treatment status on choosing high risk/high return study program

The moderating effect of the treatment

We now examine whether the counselling intervention can mitigate the dissuasive effect of higher failure rates among high school students in general and girls in particular. We proceed by presenting the results from the risk-treatment interactions as outlined in Eq. (2). Table 4 displays the results for the entire sample (Model 1) as well as boys alone (Model 2) and girls alone (Model 3). We display the coefficients of the interaction term for Model 1, including both genders, in Fig. 4.

Table 4 Intention to enroll in study program regressed on risk characteristics moderated by treatment status

Fig. 4 Mitigating role of counselling for the effect of risk aversion on rating (whole sample). Corresponding regression is given in Table 4 (model 1)

Participation in the counselling workshop appears to considerably mitigate the negative effect of higher failure rates, especially in the medium category. The negative risk effect of −0.83 shrinks to −0.29 with treatment participation, implying that the deterrent effect decreases to about one-third of its original size (significant at the 10% level). This supports our second hypothesis that the counselling workshop made participants less sensitive to higher failure rates for medium- vs. low-risk options. When interpreting the moderator, it should be noted that its maximum positive effect is limited by design, as higher failure rates can (ceteris paribus) rarely be regarded as positive. Against this background, the estimated decrease is substantial.

We further display the coefficients from the regressions for both genders separately in Fig. 5. As the left (girls) and right (boys) panels indicate, the intervention alleviated the negative effect of higher failure rates for both genders, once again especially in the medium category. However, the difference between the two groups contradicts our theoretical expectations. Contrary to our hypothesis, the difference in point estimates between the treatment and control groups is larger for boys in both risk categories. For example, for girls without treatment, the rating of a study programme decreases by 1.796 scale points when the risk increases from the lowest to the highest category. With treatment, this deterrent effect shrinks only slightly, to 1.660 scale points. In contrast, for boys, the deterrent effect decreases from 1.637 in the control group to 0.824 in the treatment group. This corresponds to a decrease of 0.28 standard deviations (SD = 2.91) for boys compared to a decrease of 0.05 standard deviations for girls (SD = 2.74). While this difference is non-negligible, it does not reach statistical significance, which is likely due to the small number of cases in each subgroup and the overall small number of male students in our sample. Whether significant differences can be found in larger samples remains a question for future research.
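The standardized figures can be reproduced from the reported point estimates. The sketch below assumes that each change is divided by the standard deviation of the rating in the respective gender subgroup; `standardized_change` is a hypothetical helper name introduced here for illustration.

```python
def standardized_change(control_effect, treated_effect, outcome_sd):
    """Reduction in the deterrent effect, in standard deviations of the rating."""
    return (control_effect - treated_effect) / outcome_sd

# Magnitudes of the deterrent effect (lowest vs. highest risk category) as reported in the text.
boys = standardized_change(1.637, 0.824, 2.91)
girls = standardized_change(1.796, 1.660, 2.74)

print(f"boys: {boys:.2f} SD, girls: {girls:.2f} SD")
```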

Fig. 5

Moderating effect of treatment, separately by gender. Corresponding regressions are displayed in Table 4 (models 2 and 3)

Our robustness checks (ITT analysis and FE estimations, summarized in Tables A.2 and A.3) show that methodological choices play a minor role in our results. While the FE estimates rarely diverge from the results presented in this section, the ITT analyses differ in the expected way: the coefficients are smaller, although the significance levels are largely unchanged. Moreover, panel attrition does not seem to induce confounding. While response rates are slightly higher in the treatment group (though not significant at the 5% level; Table A.4, Model 1), a multivariate check on covariate imbalance among individuals who participated in the second survey wave shows that none of the covariates is related to treatment assignment (Model 2; the p-value of joint significance is 0.87). This is substantiated by selection regressions including interaction terms (Models 3 and 4), which show that the treatment and control groups follow very similar selection patterns. In sum, although we observed panel attrition, we find no indication that it induced confounding or biased the treatment effect estimates. Unobserved confounding cannot be ruled out completely, but this seems unlikely in light of the results concerning the observed covariates.

Discussion and conclusion

This paper was inspired by consistent gendered differences in the choice of major in higher education, as well as previous research into interventions to mitigate these differences. We focused on seeking a theoretical explanation for gendered differences that would highlight the role of risk/return preferences. Subsequently, we assessed the moderating impact of an intervention that does not focus on providing objective information about risks and returns but instead aims to strengthen students’ self-confidence in their abilities in order to mitigate the deterrent effect of higher failure rates. To answer these research questions, we employed a survey experiment combined with a field experiment.

Our results partly confirm and partly contradict our theoretical expectations. On one hand, female participants are indeed disproportionately deterred by higher failure rates, whereas male participants are disproportionately attracted by higher returns. Moreover, participation in the counselling workshop mitigates the deterrent effect of higher failure rates. On the other hand, and contrary to our expectations, the mitigating effect of the workshop is not stronger for girls but, if anything, stronger for boys (although these differences are not statistically significant).

These results inform higher education research and indicate new directions for future research in three ways. First, and at the most abstract level, our results confirm the general notion outlined in rational choice approaches that utility considerations matter for educational choices. Moreover, the apparent gendered differences reveal that the evaluation of options depends heavily on the subjective evaluation of their risk–return patterns, implying that individual risk–return preferences matter even in cases of perfect information.

Second, our results add to the understanding of gender inequalities in educational decision-making in general, and particularly in higher education. As outlined in the introduction, the correlation between risk–return patterns and the proportion of male to female students in college majors is quite apparent at the aggregate level. However, causal claims about this relationship are hampered because the relevant study programmes differ in numerous other ways that may confound the relationship between risk–return patterns and student gender proportions. Our survey experiment, therefore, substantiates the argument that risk/return preferences might drive this relationship.

Finally, our results show that the importance of perceived risk and returns could be exploited by interventions designed to support students’ decisions. However, the results from the gender-specific analyses counter our theoretical expectations, as the counselling intervention exerted a stronger effect on boys than on girls. There are several possible explanations for this unexpected result. On one hand, this may have resulted from a classical Matthew effect, in which people at the higher end of the risk-affinity distribution are pushed upwards even more strongly than those at the lower end. On the other hand, these results also raise the question of whether we still lack a sufficient understanding of the mechanisms behind gender differences, as well as the heterogeneous effects of educational interventions on both genders. This argument is reinforced by the fact that we are not the first to observe effect heterogeneity that favours boys when stronger effects on girls were expected. For example, see other researchers’ results on the provision of objective information about risk and returns (Barone et al., 2019; Finger et al., 2020; Peters et al., 2023), curricular demands (Görlitz & Gravert, 2018; Jacob et al., 2020) or gendered responses to failure in “weed out courses” (Sanabria & Penner, 2017). In this regard, pursuing a deeper understanding of the gender-specific mechanisms of study choice, as well as the effect of educational interventions, should remain an important part of the higher education research agenda.

Despite its contributions to the field, our research design is not without limitations. First, offering the counselling workshop under real conditions is both an advantage and a disadvantage, as the findings concerning the importance of risk–return preferences do not necessarily generalize to the population of all students but rather to those who participate in similar interventions. (For further considerations of participation rates, see Pietrzyk & Erdmann, 2020.) At the same time, the real-world setting supports the external validity of the policy conclusions drawn concerning the moderating effect of the intervention. Moreover, our results align with existing research into gendered differences in risk and return preferences that relied on representative samples (Sanabria & Penner, 2017), suggesting that the observed differences in risk–return preferences are not merely due to sample selection. Second, we investigated intended rather than actual study choice. While the results presented by Buser et al. (2014) imply that intended choice may translate into actual study choice, future research must determine whether similar interventions actually lead to different study choices. Third, and relatedly, as we wanted students to evaluate different study programmes for their preferred major, it remains to be seen whether students would change their preferred major if their risk and return preferences changed. While we asked students to compare the same majors at different universities, it seems reasonable to assume that the effect would translate to comparisons of different majors. Finally, the notion of encouraging students to engage in more challenging and rewarding educational paths is often regarded positively in education research (Lent et al., 2018) as it may guide pupils to pursue study choices based on interest rather than anxiety. However, this does not guarantee their mastery of the resulting challenges. Therefore, future research should investigate whether the possible effect on study choice correlates with higher success and satisfaction during tertiary education.

In sum, our results reinforce the importance of both actual risk–return patterns and subjective perceptions, while also highlighting potential avenues for future research regarding long-term consequences and possible policy options.