Introduction

The natural sciences are generally perceived as predominantly male. School textbooks and media often depict scientists as stereotypically male (e.g., Cheryan et al. 2009; Good et al. 2010). Similarly, gender stereotypes clearly associate mathematics and physics with males (e.g., Makarova, Aeschlimann, and Herzog, 2019; Nosek et al. 2002).

For many years, research has focused on the causes of female underrepresentation and underachievement in multiple science domains. Stereotype threat theory offers one explanation for this phenomenon, linking female underrepresentation and underachievement in science to females’ minority status in the predominantly male environment and to negative stereotypes about females in science (e.g., Hall et al. 2015; Steele and Aronson, 1995; Steele et al. 2002). According to the theory, females who perceive such stereotypes experience, among other things, doubt about their belonging within the science group and then unconsciously perform as the negative stereotypes suggest, namely that females do not have the same abilities for science as men and thus cannot perform equally well.

Stereotype threat theory has been used successfully to explain underrepresentation in several educational contexts (school: Bedyńska, Krejtz, and Sedek, 2018; Hermann and Vollmeyer, 2017; college: Schmader, 2002; Good et al. 2012; science competitions: Ladewig et al. 2020; Steegh et al. 2019) and has also been applied to other minorities that face various negative stereotypes (e.g., Shih et al. 1999; Weber et al. 2015).

Extracurricular science competitions are one science environment that faces the problem of female underrepresentation. Students in Germany who are interested in science and want to pursue this interest outside of their normal school environment can participate in the Science Olympiads, which bundle competitions in several science disciplines. Although the competitions are very popular, gender differences persist, especially in one of them: the Physics Olympiad. Here, female participants are not just underrepresented but also tend to drop out of the competition earlier than male participants. It is thus questionable whether the competition in fact supports the long-term interest of female participants or rather lowers their initial science interest.

In the current study, we therefore connect stereotype threat theory with the expectancy-value model (e.g., Eccles, 2009; Eccles et al. 1983) to model participants’ experiences during their participation in the Physics Olympiad and the impact of these experiences on future career expectations. Participation in this study included attending a weekend seminar that was attached to the Physics Olympiad. We examined the impact of stereotypes, gender identification, and sense of belonging on success expectations for and value of choosing to pursue a career in physics across four measurement points.

Theoretical background

Female underrepresentation exists in many science fields. Only 22.2% of all academics in science, technology, engineering, and mathematics (STEM) in Germany are female, and women hold only 15.1% of all STEM jobs (Anger et al. 2019). This pattern persists throughout the stages of science pursuit: It begins with female students’ lower interest in school science compared with male students (Sadler et al. 2012; Schorr, 2019), which leads to fewer women opting for science in elective courses and subsequently to more men choosing to major in science in college, and it ends with male predominance in science careers (e.g., Kahn and Ginther, 2015; Miller and Wai, 2015; Su and Rounds, 2015). In addition to lower participation in science, females are affected by further factors that help the gender difference persist. Men show higher ability self-concepts in science (e.g., Saß and Kampa, 2019; Watson et al. 2019) and higher self-esteem (e.g., Schmader et al. 2004). Moreover, women’s mathematical performance declines the more males are present (Inzlicht and Ben-Zeev, 2000).

Nevertheless, many female students are interested in science and seek out science environments outside of their everyday school environment. The Science Olympiads offer students such an environment and the opportunity to compete against other motivated and talented students in several science domains. Following the idea of supporting talented and interested science students, the Science Olympiads have gained much attention and high participation numbers in Germany. In 2019, over 9000 students participated in the German Physics, Biology, Chemistry, International, and European Junior Science Olympiads, as well as in the BundesUmweltWettbewerb, a contest for environmental projects. However, the gender differences known from jobs and college also appear in the Science Olympiads, especially in the German Physics Olympiad.

The German Physics Olympiad is a national contest that asks students to excel at various physics experiments and tasks in four consecutive rounds to become one of five participants who continue as members of the German national team to the international contest. Any interested student can participate in the competition if they are still in school and within the set age limit. Students can enroll online. The number of participants decreases with each round so that only the best contestants continue in the competition. The competition’s structure poses several problems with regard to female underrepresentation. Beginning with registration, fewer female than male students choose to participate in the contest. In 2018, only roughly every fourth participant was female. Because females leave the competition at a higher rate than males throughout the contest, this general underrepresentation increases even further. This often leads to all-male teams for the international contest, as was the case, for example, in 2019.

The contest is supposed to provide an environment that supports females and males equally in their pursuit of science. As female participants seem to be at a disadvantage from the outset due to their underrepresentation, it is questionable whether females and males actually leave the contest with equal intentions of continuing in science. To evaluate the competition’s usefulness and success in reaching its own goal of supporting interested students, it is essential to examine whether females are negatively impacted in their physics pursuit through participation in the German Physics Olympiad.

Stereotypes and stereotype threat in science environments

Physics creates an environment that tends to encourage women to leave the domain or not to enter it in the first place (see Cheryan et al. 2017; Diekman et al. 2010). Several theories focus on the mechanism behind this process. One of these theories, prototype matching, suggests that persisting in an environment depends on the perceived fit between the self and the stereotypical person who is expected in that environment (e.g., Hannover and Kessels, 2004; Setterlund and Niedenthal, 1993). Achieving such a fit is seemingly more difficult in science, as the prototypical student who enjoys science is described more negatively by other students than the prototypical student who does not enjoy science (Hannover and Kessels, 2004). A lack of fit may also counteract females’ interest in continuing in science, as similarity to the participants of a domain moderates interest in continuing there (Cheryan and Plaut, 2010). Finding a sufficient fit for continued interest in science appears to be a higher hurdle for females, as they encounter a stereotypically male environment in science, especially in physics. Moreover, cues in the science environment that promote male stereotypes, e.g., in books or on posters, can further reduce females’ feeling of belonging in the setting (Cheryan et al. 2009) and thereby widen the gender gap even more. As the Physics Olympiad is predominantly male and thereby promotes the association of physics with males, it presents cues to female participants that might reduce their belonging and the benefits that participation in the contest is supposed to offer.

Research has focused on the role that stereotypes play in causing female underrepresentation and the achievement gap in science. Stereotypes about women and girls in science mainly state that girls and women have less science talent or ability to succeed in the science domains than men and boys (e.g., Cohen and Garcia, 2008; Miller et al. 2015; Smyth and Nosek, 2015). A popular theory that explains the hindering impact of such negative stereotypes is stereotype threat theory. The theory states that minority group members’ performance is inhibited due to negative stereotypes attributed to them by the majority group (e.g., Spencer et al. 1999; Steele et al. 2002): The minority group members perceive cues to stereotypes in the environment, regardless of whether these are conveyed explicitly or implicitly (e.g., Marchand and Taasoobshirazi, 2013; Spencer et al. 1999), react with cognitive, motivational, and other changes, and end up behaving in accordance with the stereotypes (e.g., Bedyńska et al. 2018). For women in science, this means showing lower abilities and achievements than in a situation without stereotypes. The theory was also shown to apply to other stereotyped groups such as ethnic minorities (Aronson et al. 2002; Froehlich et al. 2016), boys in certain academic settings (Hartley and Sutton, 2013), or men in typically female jobs (Kalokerinos et al. 2017).

The negative consequences of stereotype threat are wide-ranging and not limited to reduced performance (e.g., Flore and Wicherts, 2015; Shih et al. 1999). Burnout (Hall et al. 2018), anxiety or arousal (Ben-Zeev et al. 2005), and mental exhaustion and rejection (Hall et al. 2015) can be consequences of stereotype threat, as can feelings of incompetence (Schmader and Johns, 2003) or reduced development of abilities in the stereotyped domain (Appel et al. 2011). Consequently, several intervention methods have been tested to reduce the impact of stereotype threat (Schmader, Hall, and Croft, 2015; Schmader and Hall, 2014).

A further factor that strengthens the negative impact of stereotype threat is gender identification. Females in predominantly male science environments identify more readily with their gender (Marx et al. 2005), which in physics environments and extracurricular physics competitions is the stereotyped group. This higher identification, however, hinders good performance in these domains for women: Schmader (2002) showed that women who reported high gender identification did not solve mathematics tasks as well as men, whereas women who reported lower gender identification performed just as well as the men. Gender identification is thus not just a factor in stereotype threat but also a factor that widens the achievement gap (e.g., Flore and Wicherts, 2015; Shih et al. 1999). How participation in a stereotyping and predominantly male science competition affects female participants’ gender identification and perception of stereotypes has, however, yet to be assessed and modeled.

In summary, science presents a multifaceted, disadvantaging environment for females. Male predominance, negative stereotypes, and higher gender identification jointly contribute to the underrepresentation of women and girls. This is even more problematic when women and girls believe that the negative stereotypes are legitimate, as the stereotypes’ negative impact is then even higher (Schmader et al. 2004).

Sense of belonging in stereotyping science environments

How can women and girls be encouraged to continue in science and not feel excluded by the male majority group? Sense of belonging is one important factor in this situation, as it describes feelings of trust, positive affect, membership, and participation, as well as acceptance by the group (Good et al. 2012).

Sense of belonging is related to several relevant variables in educational contexts. Gillen-O’Neel and Fuligni (2013) showed that sense of belonging is closely connected to persistent academic engagement, and Pittman and Richmond (2007) showed that sense of belonging predicts academic achievement. Freeman et al. (2007) point to the further association of sense of belonging with motivation, self-efficacy, and task value, as well as social acceptance in college. They also showed that characteristics of the class and the instructor influence sense of belonging (Freeman et al. 2007). Consequently, heightening minority groups’ sense of belonging in problematic environments has been suggested as an intervention to increase identification with the domain (Osborne and Jones, 2011). This could indeed be advisable for females in science, as women were shown to have a lower sense of belonging and a higher interest in leaving science environments that were perceived as predominantly male (Murphy et al. 2007).

Sense of belonging has, nevertheless, been shown to be affected by stereotype threat (e.g., Ladewig et al. 2020). Among its various other negative consequences, stereotype threat can also cause uncertainty about one’s belonging within the environment in which the threat occurs (Walton and Cohen, 2007). Females who are negatively stereotyped in the science environment they encounter and who, due to stereotype threat, perform worse than the male majority group members are likely to doubt their abilities and interpret this as lower suitability for the environment (Good et al. 2003). Consequently, female students feel a lower sense of belonging in mathematics if they perceive the environment to promote gender stereotypes (Good et al. 2012).

To encourage more females to enter science environments and to remain in them, the stereotypes, as they are promoted and perceived by the participants in the situation, as well as participants’ sense of belonging should be carefully analyzed. Both can either reduce or reinforce female underrepresentation. It is therefore especially important to analyze these factors in science situations that students enter voluntarily, as the students in these environments have taken the first step towards out-of-school science engagement. The Physics Olympiad is such an environment.

Expectancy-value model

To study the relation of sense of belonging and stereotype threat to female underrepresentation in science, career choices should be examined more closely. Of particular interest is how participation in a science competition relates to the decision to stay in or leave a domain. The expectancy-value model (e.g., Eccles, 2009; Eccles et al. 1983) has been shown to explain how achievement-related choices, such as choosing to study a subject or continuing in a domain, are formed. The model includes success expectations for and value of the choice as important predictors of making or declining the choice and was shown to apply in several educational contexts (see Watt, 2016) as well as in science competitions (e.g., Steegh et al. 2021). The link between belonging and important elements of the expectancy-value model was shown for different stages of the educational system: Goodenow (1993) showed that students’ belonging in middle school predicts both success expectations and value. Gillen-O’Neel and Fuligni (2013) found that higher belonging in high school was associated with higher intrinsic value of school, and Freeman et al. (2007) showed the close connection of belonging in college to task value. In addition, Ladewig et al. (2020) showed sense of belonging to be a significant predictor of success expectations and value of choosing a career in physics for participants in the German Physics Olympiad. However, the authors also found that stereotype endorsement negatively impacted female participants’ belonging, thereby indicating a stereotype threat effect. This negative impact of stereotypes also extends to task value (Plante et al. 2013) and belonging in the workplace (Rahn et al. 2020).

However, the changes in sense of belonging, gender identification, and stereotype threat over the course of participation in the German Physics Olympiad and their impact on success expectations and value for continuing in science after the contest have not yet been researched. Such research is necessary to gain better insights into the course of the contest and the consequences of participation, especially with regard to the goal of providing equitable support for every participant’s physics interest and pursuit.

The current study

In the current study, we invited participants in the German Physics Olympiad to attend weekend seminars, in addition to their participation in the first contest round, to gain further physics knowledge. Participants took part in the study over approximately 4 months and filled in four questionnaires to assess their belonging, gender identification, and perceptions of stereotypes as well as their intentions to continue in science. The first two contest rounds consist of homework, and participants typically meet other contestants in seminars only after reaching the third round. In our study, participants encountered other contestants for the first time at the study’s weekend seminars, as they had so far participated only in the first round.

Participants filled in questionnaires at home, then twice at the seminar, and lastly at home again several weeks after the seminar. The study thereby combined, in a shorter amount of time, all environments that participation in the German Physics Olympiad can present. It thus depicted a whole participation process in the Physics Olympiad, which, to our knowledge, no previous research has done.

We posed the following hypotheses with regard to stereotype threat theory and expectancy-value theory during the 4 months of participation:

First, we evaluate the competition’s own goal. Accordingly, we hypothesize that the German Physics Olympiad fulfills its intention of providing a supportive environment that encourages students in their pursuit of physics.

Hypothesis 1: We hypothesize that male and female participants have equal intentions of continuing in science several weeks after the weekend seminar, as reflected in similar success expectations for and value of choosing a career in science.

However, the contest presents a typical predominantly male science environment and we thus expect the Physics Olympiad to be affected by stereotype threat and social identity threat as suggested by the literature. Stereotype endorsement and perceived social identity threat should negatively impact females during their participation in the contest. They should, consequently, have heightened gender identification and lowered belonging caused by the threat effects.

Hypothesis 2: We hypothesize that stereotype endorsement negatively impacts female participants’ sense of belonging, while stereotype endorsement and perceived social identity threat are hypothesized to heighten female participants’ gender identification. These effects should not be detectable for male participants.

With regard to stereotype threat and social identity threat, changes in the variables that are expected to be impacted by participation in the contest (stereotype endorsement, perceived social identity threat, gender identification, and belonging) should impact the participants’ success expectations for and value of choosing a career in science.

Hypothesis 3: We hypothesize that male and female participants’ sense of belonging and gender identification as well as perceived social identity threat and stereotype endorsement predict success expectations for and value of choosing a career in science.

Method

The current study was part of the project “Identiphy - Identity development in physics!”. Over 2 years, two consecutive cohorts of German Physics Olympiad contestants could voluntarily participate in the project in addition to their participation in the contest. Contestants were informed that declining to participate in the project would not cause any disadvantages in the competition.

The project consisted of weekend seminars with a focus on teaching physics content in a group of interested and talented students. The weekend seminars were scheduled so that students had completed the first contest round but had not yet received their results. The contestants met other contestants for the first time at the seminars, as the first round consists solely of experimenting and solving tasks at home. We assessed students at four points in time (prior to the seminars, at the beginning and end of the seminars, and several weeks after the seminars).

Participants

Contestants of the German Physics Olympiad received an invitation with information about the project via letter or e-mail. Participants filled in questionnaires anonymously and incurred no disadvantages if they decided against participation. Contestants were allowed to participate after providing informed consent either personally or, if they were underage, through their legal guardian.

In the first cohort, 167 students took part (age: M = 15.87, SD = 1.26), of whom 125 indicated male gender (age: M = 15.81, SD = 1.34) and 42 female gender (age: M = 16.07, SD = 1.00). The second cohort was smaller, with only 131 participants (age: M = 15.95, SD = 1.40). Of these, 40 described themselves as female (age: M = 15.65, SD = 1.41) and 91 as male (age: M = 16.09, SD = 1.38). Overall, 298 students chose to participate.

The cohorts did not differ significantly in age (t(264) = −0.51, p = 0.609) or gender (χ²(1) = 1.13, p = 0.287). As participants also completed identical questionnaires, we refrained from further separating the cohorts.

Procedure

Participants filled in the first questionnaire after registering and giving informed consent. The questionnaire was provided online and included, among others, scales on gender identification, sense of belonging, perceived social identity threat, and stereotype endorsement. The scales referred to the physics environment that students knew from their regular school life. Next, participants received preparatory materials for the seminar content.

The second questionnaire, which included the same scales as the first one, was filled in at the beginning of the weekend seminar. The information material had informed all participants before registration that participation was tied to a research project. The weekend seminar proceeded with physics content in the form of theoretical and experimental tasks before ending with the third questionnaire, which again included the same scales. Lastly, students had the opportunity to continue approximately 6 and 12 weeks after the seminar with further physics tasks, which they could hand in and receive back corrected. At the end of this phase, the fourth and last questionnaire concluded the project. This questionnaire used the same scales as the previous ones as well as scales on success expectations for and value of choosing a career in physics, and it also included a reminder of the seminar with photos and questions about participants’ experiences there.

Measures

Sense of belonging

We used Good et al.’s (2012) Math Sense of Belonging scale, adapted to physics, with 30 items to measure sense of belonging. The items were rated from 1 “strongly disagree” to 5 “strongly agree.” The sub-scales negative affect and desire to fade were reverse coded. We used the overall scale, as recommended by the authors of the original scale (Good et al. 2012). The items were framed differently to fit the four assessment points: The first and last questionnaires used the school physics classes as the reference group (e.g., “When I am in my physics lessons, I feel that I belong to the group”; first questionnaire: Cronbach’s alpha = 0.63; last questionnaire: Cronbach’s alpha = 0.94) and both questionnaires at the weekend seminar referenced the seminar group (e.g., second questionnaire: “At the moment, I feel that I belong to the seminar group”; Cronbach’s alpha = 0.91; third questionnaire: “During the weekend seminar, I feel that I belong to the group”; Cronbach’s alpha = 0.91).
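The reverse coding and the internal-consistency coefficient described above can be illustrated with a minimal sketch. The response data, the choice of four items, and the function names `reverse_code` and `cronbach_alpha` are hypothetical and serve only to show the arithmetic; they are not taken from the study's materials or analysis scripts.

```python
from statistics import variance

def reverse_code(score, low=1, high=5):
    """Mirror a Likert response so that 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - score

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical responses of five students to four items (1 to 5 scale).
# The item at index 3 is negatively worded, so it is reverse coded first.
raw = [
    [4, 5, 4, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 1],
    [2, 3, 2, 4],
    [4, 4, 5, 2],
]
scored = [r[:3] + [reverse_code(r[3])] for r in raw]

alpha = cronbach_alpha(scored)
scale_means = [sum(r) / len(r) for r in scored]   # one scale score per student
print(f"alpha = {alpha:.2f}")
```

Reverse coding before computing alpha matters: leaving a negatively worded item uncoded lowers the inter-item covariance and therefore the coefficient.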

Gender identification

Gender identification was measured using a scale developed by Schmader (2002). All four items (e.g., “Being a boy/girl is important for the perception I have of myself.”) were rated from 1 “strongly disagree” to 5 “strongly agree.” The scale was included in all four questionnaires. Cronbach’s alpha was 0.80 (first questionnaire), 0.83 (beginning of seminar), 0.83 (end of seminar), and 0.87 (last questionnaire).

Perceived social identity threat

Perceived social identity threat was measured by means of an adapted version of a four-item scale by Rattan et al. (2018). Items (e.g., “My gender influences the perception that others have of my physics abilities.”) were rated from 1 “strongly disagree” to 5 “strongly agree” (Cronbach’s alpha: 0.82 in the first questionnaire, 0.80 at the beginning, and 0.76 at the end of the seminar). A higher score indicates higher perceived threat. The scale was included at the first, second, and third assessment point.

Stereotype endorsement

Stereotype endorsement was measured using an adapted scale by Schmader et al. (2004) with three items. Items (e.g., “It is possible that men have more physics ability than do women.”; Cronbach’s alpha: 0.71 in the first questionnaire, 0.69 at the beginning, and 0.69 at the end of the seminar) were rated from 1 “strongly disagree” to 5 “strongly agree.” Higher values indicate higher stereotype endorsement. The first, second, and third questionnaires included this scale.

Success expectation and value of choosing a career in science

Success expectations (e.g., “When I choose to study physics or take a job in physics, I believe that I will be successful.”) and value (e.g., “When I choose to study physics or take a job in physics, being successful is especially important to me.”) were each measured with four items from Lykkegard and Ulriksen (2016). Items were rated from 1 “strongly disagree” to 5 “strongly agree.” Both scales were administered only at the last assessment point. Cronbach’s alpha was 0.64 for success expectation and 0.70 for value.

Results

The descriptive results show that male and female participants’ ratings of sense of belonging and gender identification did not differ significantly at nearly all assessment points. A significant difference appeared only for sense of belonging at the fourth assessment point [t(114.42) = 2.13, p = 0.036], where boys rated belonging higher than girls. The mean values for gender identification and sense of belonging can be found in Table 1.

Table 1 Means and standard deviations of sense of belonging and gender identification for all assessment points split by gender and intervention method

The mean values for perceived social identity threat and stereotype endorsement can be found in Table 2. Male and female participants’ ratings of perceived social identity threat differed significantly at all assessment points at which it was measured. At the first assessment point, girls rated the threat higher than boys [t(281) = −4.77, p < 0.001], as they did at the second [t(122.40) = −3.04, p = 0.003] and third assessment points [t(128.75) = −2.77, p = 0.006]. Female and male participants’ ratings of stereotype endorsement differed significantly at the first [t(167.03) = 2.39, p = 0.006] and last assessment point [t(139.21) = 2.23, p = 0.027], each time with higher values for male than for female participants.

Table 2 Means and standard deviations of perceived social identity threat and stereotype endorsement, and perceptions of environmental stereotyping for the first three assessment points split by gender
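Several of the group comparisons above report non-integer degrees of freedom (e.g., 122.40). Such values are what one would expect from Welch's unequal-variance t-test with the Welch-Satterthwaite approximation; the text does not name the test variant, so this reading is an assumption. The sketch below shows how the statistic and its degrees of freedom are obtained. The function name `welch_t` and the ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)             # squared standard error, group a
    vb = variance(b) / len(b)             # squared standard error, group b
    t_stat = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df

# Hypothetical threat ratings (1 to 5 scale); not data from the study.
boys = [1.5, 2.0, 1.0, 2.5, 1.5, 2.0]
girls = [3.0, 2.0, 4.0, 2.5]

t_stat, df = welch_t(boys, girls)
print(f"t({df:.2f}) = {t_stat:.2f}")
```

With boys entered as the first group, a negative t indicates higher ratings by girls, which matches the sign convention of the values reported for perceived social identity threat.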

Correlational findings are presented in the supplementary information.

Success expectations for and value of choosing a career in science

To test Hypothesis 1, t-tests were calculated. Male and female participants’ intentions of continuing in science were compared on the scales success expectations [Mmale = 3.85, SDmale = 0.64; Mfemale = 3.76, SDfemale = 0.63] and value [Mmale = 4.20, SDmale = 0.58; Mfemale = 4.33, SDfemale = 0.60] of choosing to continue in science. No effect of gender on success expectations, t(103) = 0.85, p = 0.396, or value, t(99) = −1.31, p = 0.192, was found. This is in line with Hypothesis 1.

Model of stereotype threat and social identity threat

To analyze the gender differences postulated in Hypothesis 2 and Hypothesis 3, two multi-group structural equation models (SEM) were estimated for male and female participants. First, an SEM with freely estimated paths was calculated. Second, an SEM that constrained the paths to be equal across the two groups was calculated. The models were compared regarding their fit to see whether the gender-specific model (the first model with freely estimated paths) or the gender-invariant model (the second model with constrained paths) fitted the data better.

The results of the model with gender-specific coefficients indicate a good fit with χ² (df = 130, N = 306) = 179.16, p = 0.003, CFI = 0.97, TLI = 0.94, RMSEA = 0.05 (90% confidence interval = 0.03, 0.07), and SRMR = 0.07. The second SEM with equal paths in both groups shows an acceptable to good fit of χ² (df = 165, N = 306) = 217.92, p = 0.004, CFI = 0.96, TLI = 0.95, RMSEA = 0.05 (90% confidence interval = 0.03, 0.06), and SRMR = 0.08 for modeling without splitting by gender (see Hu and Bentler, 1999). As the gender-specific, freely estimated model fit the data about as well as the gender-invariant model with constrained paths, we conclude that the paths did not differ between male and female participants. This, however, contradicts Hypothesis 2.
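One common way to formalize such a comparison of nested models is a chi-square difference test. The text compares fit indices and does not state that this test was run, so the following is only an illustrative sketch. It assumes ordinary maximum-likelihood chi-square values without a scaling correction; `chi2_sf` is a hypothetical helper written with the standard library only.

```python
from math import exp, lgamma, log

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    log_prefactor = a * log(z) - z - lgamma(a)
    if z < a + 1:
        # Series expansion of the lower regularized incomplete gamma function.
        term = total = 1.0 / a
        n = a
        while abs(term) > abs(total) * 1e-14:
            n += 1
            term *= z / n
            total += term
        return 1.0 - total * exp(log_prefactor)
    # Continued fraction (modified Lentz method) for the upper function.
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 500):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * exp(log_prefactor)

# Chi-square values and degrees of freedom as reported in the text.
delta_chi2 = round(217.92 - 179.16, 2)    # constrained minus free model
delta_df = 165 - 130
p_value = chi2_sf(delta_chi2, delta_df)
print(f"delta chi2({delta_df}) = {delta_chi2}, p = {p_value:.3f}")
```

For the reported values, the difference is 38.76 on 35 degrees of freedom, which gives a p-value of roughly 0.3. Under the stated assumptions, constraining the paths to equality does not significantly worsen fit, which is consistent with the conclusion drawn from the fit indices.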

In consequence, we did not use the gender-specific groups any further. Instead, we continued with a third structural equation model, which consisted of one group that included both genders and in which paths were estimated freely. The fit of this model was χ² (df = 65, N = 306) = 78.78, p = 0.117, CFI = 0.99, TLI = 0.98, RMSEA = 0.03 (90% confidence interval = 0.00, 0.05), and SRMR = 0.05, which also represents a good fit (see Hu and Bentler, 1999). The model is shown in Fig. 1.

Fig. 1 Structural equation model of the impact of stereotype threat and social identity threat on gender identification and sense of belonging for all participants. Continuous lines show p < .05
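The RMSEA point estimates reported for the three models can be cross-checked from the chi-square values, degrees of freedom, and sample size given in the text. The sketch below uses one common convention, with N − 1 in the denominator and a factor of the square root of the number of groups for multi-group models. The text does not name the software or the convention used, so both choices are assumptions; the function name `rmsea` is hypothetical.

```python
from math import sqrt

def rmsea(chi2, df, n, groups=1):
    """RMSEA point estimate from a model chi-square (assumed convention, see lead-in)."""
    excess = max(chi2 - df, 0.0)          # misfit beyond what df alone would predict
    return sqrt(groups) * sqrt(excess / (df * (n - 1)))

# Chi-square, df, and N as reported in the text.
single_group = rmsea(78.78, 65, 306)
gender_specific = rmsea(179.16, 130, 306, groups=2)
gender_invariant = rmsea(217.92, 165, 306, groups=2)

for name, value in [("single group", single_group),
                    ("gender-specific", gender_specific),
                    ("gender-invariant", gender_invariant)]:
    print(f"{name}: RMSEA = {value:.2f}")
```

Rounded to two decimals, this convention yields 0.03, 0.05, and 0.05, in line with the values reported for the respective models.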

Regarding Hypothesis 2, the results do not support the assumed stereotype threat or social identity threat effects. Stereotype endorsement did not significantly impact belonging at any point (second assessment point: β = 0.05, p = 0.415; third assessment point: β = −0.05, p = 0.300; fourth assessment point: β = −0.10, p = 0.140). This contradicts the hypothesized stereotype threat effect on sense of belonging. Stereotype endorsement significantly impacted gender identification at the second assessment point (β = 0.12, p = 0.013) but not at the third (β = 0.06, p = 0.106) or fourth assessment point (β = −0.03, p = 0.513). As perceived social identity threat did not significantly impact gender identification at any assessment point (second assessment point: β = 0.04, p = 0.368; third assessment point: β = −0.03, p = 0.415; fourth assessment point: β = 0.08, p = 0.117), the results largely contradict the assumed social identity threat and stereotype threat effects on gender identification.

Neither perceived social identity threat (value: β = 0.02, p = 0.842; success expectations: β = 0.07, p = 0.381) nor stereotype endorsement (value: β = −0.09, p = 0.259; success expectations: β = 0.02, p = 0.754) significantly predicted value or success expectations. This partly contradicts Hypothesis 3.

Lastly, the model shows that sense of belonging at each assessment point significantly predicted sense of belonging at the following assessment point throughout the study (first to second assessment: β = 0.17, p = 0.004; second to third: β = 0.53, p < 0.001; third to fourth: β = 0.36, p < 0.001). Sense of belonging also significantly predicted success expectations for a career in science (β = 0.26, p < 0.001). This is in line with Hypothesis 3. Similar effects emerged for gender identification: Participants’ ratings at each assessment point significantly predicted their ratings at the following assessment point (first to second assessment: β = 0.61, p < 0.001; second to third: β = 0.58, p < 0.001; third to fourth: β = 0.36, p < 0.001). In addition, gender identification at the third assessment point significantly predicted success expectations (β = −0.14, p = 0.046). This confirms Hypothesis 3.

Discussion

Is participation in extracurricular science competitions equally beneficial for the pursuit of the domain for girls and boys?

To address this question, the present study analyzed how stereotype threat and social identity threat are experienced during participation in the German Physics Olympiad and how they affect the two main variables of Eccles’ expectancy-value model, success expectations and value. We used structural equation modeling to assess the changes in sense of belonging and gender identification over 4 months of participation in the German Physics Olympiad, ending with the assessment of success expectations for and value of choosing a career in science.

Our results indicate that at the end of the study, female and male participants were equally well supported within the contest in their interest and pursuit of physics, which we operationalized with success expectations for and value of choosing to continue in science.

The descriptive data show differences indicating that female participants endorsed stereotypes less and perceived more social identity threat than male participants. Nevertheless, the hypothesized stereotype threat and social identity threat effects, which were assumed to affect sense of belonging and gender identification and to lower success expectations for and value of continuing in science, could not be corroborated. Persisting in science is apparently hindered neither by gender stereotypes in physics nor by participants’ own belief in the validity of these stereotypes.

These results are interesting as they contradict previous research on stereotypes in Science Olympiads (see Steegh et al. 2021; Ladewig et al. 2020). Whereas it was previously assumed that negative gender stereotypes about their abilities in physics affected female participants in Science Olympiads, our results do not replicate this finding. We found no overall gender differences, and the female participants in this study apparently had the same success expectations for and value of choosing a career in physics as their male counterparts. Still, higher gender identification was shown to lower success expectations for persisting in science, whereas a higher sense of belonging strengthened these success expectations. These two variables apparently relate directly to the outcomes of the expectancy-value model and to the decision to continue in science.

Equal success expectations for and value of staying in physics — but then no action to do so?

The results of this study show that female and male participants have equal success expectations for and value of choosing a career in physics. Why, then, do fewer girls still participate in the Physics Olympiad and, later on, study physics or choose a career in the field?

Stereotype threat and social identity threat, which have served as useful explanatory models for females’ choice against physics (Schmader and Hall, 2014; Schmader et al. 2015), did not affect the female participants’ sense of belonging or gender identification in physics in a way that lowered their success expectations for continuing in science. Why is this so? Do stereotype threat and social identity threat actually not affect the participants in the German Physics Olympiad?

For one, the present study took place within a program offered in addition to participation in the Physics Olympiad. Spending a weekend in a seminar devoted entirely to physics most likely led to a sample that was especially interested and talented (see Höffler et al. 2019). The participants were more interested in spending additional time learning physics and doing experiments than their fellow contestants. Having chosen to participate in the seminars, the female participants were thus most likely even more resistant than other participants to cues that indicate a low fit of females in science and encourage them to leave the environment. This could make them less susceptible to stereotype threat and social identity threat.

Also, the seminars took place between the first round of the contest and the announcement of its results. This is especially relevant given that the female participants in our study showed low susceptibility to social identity threat and stereotype threat. One of the main problems of the competition is that female participants drop out of the contest in higher numbers than male participants. In the first analyzed cohort, 23% of the participants in the first round were female, as were 23% of the participants in the second round. Afterwards, the typical Physics Olympiad problem began to show: Only 10% of the participants in the third round were female, and none continued further in the competition. In the second analyzed cohort, the effect was even more prominent: Starting from 24% female participants in the first round, the decline already showed in the second round, in which only 16% of all participants were female, followed by 6% in the third round and none advancing further in the competition. Against this background, participants in our study possibly chose to attend the seminars in order to prepare for the second contest round. The participants might thus represent a subgroup of the overall German Physics Olympiad sample: Because it was not yet known whether they would advance to the next contest round, the choice to participate in this study would imply higher success expectations in the contest or a higher ability self-concept. Higher ability self-concept or success expectations might therefore be further factors counteracting female underrepresentation in the competition and susceptibility to social identity or stereotype threat.
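
The decline in female representation described above can be summarized as percentage-point changes between rounds. The sketch below uses only the shares reported in the text; the final value of 0 stands for "none continued further" and is an illustrative encoding, as are the names used. Note that these figures are shares of all participants in each round, not retention rates of individual participants, which cannot be derived from the reported numbers.

```python
# Female share of participants (in %) per round, as reported in the text.
# The last entry (0) encodes that no female participant advanced beyond round three.
FEMALE_SHARE = {
    "cohort 1": [23, 23, 10, 0],
    "cohort 2": [24, 16, 6, 0],
}


def point_changes(shares):
    """Percentage-point change in female share from each round to the next."""
    return [later - earlier for earlier, later in zip(shares, shares[1:])]


for cohort, shares in FEMALE_SHARE.items():
    print(cohort, point_changes(shares))
```

In this encoding, the first cohort shows no change between the first and second round and a drop of 13 percentage points afterwards, whereas the second cohort already loses 8 percentage points between the first and second round.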

Limitations

Since previous research showed that stereotype threat effects were stronger for girls and women who agreed more strongly with gender stereotypes about females’ lower abilities in science (e.g., Pennington et al. 2016; Schmader et al. 2004), we operationalized stereotype threat with stereotype endorsement. This also made it possible to include the scale in the regular questionnaires as an explicit measure. Implicit measures of stereotype endorsement, which eliminate possible biases such as social desirability, would have drawn participants’ attention to the purpose of the measure because they would have had to be administered separately from the regular study procedure. Moreover, measuring stereotype endorsement implicitly might not have led to different results, as previous research is inconsistent on which method is to be preferred (see, e.g., Kessels et al. 2006). In addition, participants rated stereotype endorsement equally high throughout the study, so we do not expect this to have biased the results. Future research should nevertheless include an implicit measure to compare the effects of the two measurement techniques.

Two limitations regarding the assessments also need to be considered. First, participants completed the scales with regard to slightly varying physics environments. Whereas the first questionnaire asked participants about experiences in their physics lessons at school, the second and third questionnaires explicitly referred to the seminars, and the last questionnaire again used the physics class as the reference group. We do not see this as problematic because participation in the Physics Olympiad presents participants with a new science environment in which they form a new perception of what physics is. The questionnaires reflected this with varying reference groups. This should also counteract the second possible limitation of the assessments: Participants repeatedly filled in similar questionnaires within a 4-month period. However, the questionnaires were completed several weeks apart and referred to different reference groups, so we do not expect effects of the repeated measurements. Future research should nevertheless include measures of participants’ perceptions of the physics environment in which they feel they belong and with which they identify.

Lastly, participation in this study was voluntary. Registration was open to all participants in the German Physics Olympiad. Not all of them chose to join the study, which led to a selective sample. We still assume that the participants represent the whole group of German Physics Olympiad participants. If anything, the contestants who opted to participate in the study showed higher interest and engagement in the Physics Olympiad by using further offers to build physics knowledge. This is underlined by the number of female contestants: Only 82 of the 539 females across the two Physics Olympiad cohorts chose to participate in this study. The sample thus remains a group of students who are likely to pursue a career in physics. Our results show that participation in the study was not harmful to success expectations for and value of choosing a career in physics. Nevertheless, future research should compare the experiences of participants in the regular contest with those of participants in the additional seminars. This would allow the analysis of possible subgroups of participants, such as groups with different levels of interest or engagement in the competition. These subgroups could be a potential factor leading to female underrepresentation due to their different susceptibility to social identity or stereotype threat. Analyzing the subgroups could also point to new starting points or improved ways of intervening against female underrepresentation in the Physics Olympiad.

Implications and conclusion

This study showed that students who chose to participate in the Physics Olympiad were equally well supported in their persistence in physics in the competition regardless of their gender. As previous research showed harmful consequences of stereotype threat and social identity threat on important factors for continuing in science, our results leave a positive impression of the Physics Olympiad as a way to present physics in a lastingly beneficial way. Girls who chose to participate in the contest were not affected by gender stereotypes and social identity threat in a way that led to lower success expectations for and value of continuing in physics than those of their male counterparts.

However, it remains unclear why, regardless of these results, fewer girls than boys choose to participate in science in both extracurricular activities and academic settings. To obtain deeper insights into the reasons for participating in or leaving the contest, participants’ characteristics should be examined more closely. In particular, studying differences in competence (e.g., Schorr, 2019), self-concept (e.g., Saß and Kampa, 2019; Vinni-Laakso et al., 2019), and parental support (e.g., Hoferichter and Raufelder, 2019; Schorr, 2019) seems useful, as these variables previously showed a high impact on gender differences in science. Also, task characteristics (see Wille et al. 2018; Sanchis-Segura et al. 2018; Wheeler and Blanchard, 2019) should be studied to obtain deeper insights into the causes of female underrepresentation in the contest. Lastly, researching the impact of these variables on gender identification and sense of belonging, the two factors that in this study proved to affect success expectations and value of continuing in physics, could lead to new ways to fight female underrepresentation in science.

In conclusion, the results draw a positive picture of the Physics Olympiad as an extracurricular science competition that does not discriminate against the highly interested female participants in our study. Participation in the contest does not negatively affect these participants’ persistence in science. Also, we did not find any differences between females’ and males’ success expectations for and value of choosing a career in physics. However, the underrepresentation of female participants in the contest persists. These results suggest that the Physics Olympiad provides an equitably supportive environment for students who are highly interested in science, regardless of their gender. To confront the ongoing underrepresentation of female participants in the German Physics Olympiad, future research should analyze possible subgroups of participants with regard to their different experiences of the contest.