Introduction

Science, Technology, Engineering, and Mathematics (STEM) skills play a vital role in promoting innovation, economic growth, and societal well-being (Sellami et al., 2023; Vulperhorst et al., 2018). Despite the growing importance of STEM education, there has been a decline in the number of students graduating from STEM disciplines (Ahmad et al., 2023; Dagley et al., 2016). One of the major causes of this decline is the lack of interest among students in STEM courses (Lytle & Shin, 2020). Hence, it is essential to comprehend the factors contributing to students’ disinterest in STEM disciplines (Maltese et al., 2014), using a validated scale encompassing all potential dimensions of interest. This can aid in understanding the underlying reasons behind their lack of interest in STEM disciplines.

While various STEM scales exist, they lack comprehensiveness and fail to capture the multifaceted nature of STEM interest (refer to Table 1). Existing scales focus on specific domains and overlook holistic aspects of STEM interest (Staus et al., 2020; Tyler-Wood et al., 2010; Unfried et al., 2015). Scales that claim to measure STEM interest are designed for students but measure interest primarily from a career perspective; scales for students that also consider the psychological aspects of interest are noticeably absent. To better understand and assess interest in STEM, there is a need for a comprehensive and unified tool that can measure STEM interest across various dimensions. Such a tool would enable educators, researchers, and policymakers to gain valuable insights into the factors that influence STEM interest and help foster greater engagement in these critical fields. The present study addresses this gap by developing and validating a robust STEM Interest Scale. The proposed scale encompasses dimensions of STEM interest derived from self-determination theory, social cognitive theory, social cognitive career theory, and person-centered theory. By developing a comprehensive STEM Interest Scale, this study seeks to facilitate the identification of patterns, trends, and potential areas for intervention to enhance STEM engagement. Ultimately, the validated scale can contribute to promoting students’ sustained interest and participation in STEM fields.

Table 1 An overview of various measures for STEM published in literature during the last decade and their descriptions

Literature Review

Interest plays a vital role in an individual’s affective functioning with respect to learning (Bandura, 1997). It is a multifaceted and complex motivational construct (Dierks et al., 2016; Hazari et al., 2017). In the present context, STEM Interest is operationalized as an individual's motivation, determination, self-assurance, and sense of self-belonging in their engagement within STEM fields. These factors significantly influence students’ persistence (Allen & Robbins, 2010), attention (Hidi & Renninger, 2006), and enrollment in higher education STEM programs (Mohd Shahali et al., 2019; Vulperhorst et al., 2018), emphasizing the need for a valid and reliable instrument to measure it effectively. However, the existing literature (Table 1) on STEM interest reveals a scarcity of studies with a comprehensive, multifaceted scale, leading to a gap in the literature.

The multifaceted nature of an interest in STEM can be well predicted based on motivational theories (Murphy et al., 2019). The theoretical framework of the present study is anchored on four main theories that explain the factors influencing students’ interest in STEM disciplines: self-determination theory, social cognitive theory, person-centered theory, and social cognitive career theory. These theories offer a comprehensive framework for analyzing STEM interest as a multidimensional concept that includes self-efficacy, self-concept, intrinsic motivation, and employment aspiration. According to self-determination theory, intrinsic motivation is essential for maintaining sustained interest in STEM subjects (Deci & Ryan, 1985). The social cognitive theory states that self-efficacy, or students’ belief in their ability to perform well in STEM tasks, influences their engagement, effort, persistence, and achievement outcomes (Bandura, 1977, 1986). Person-centered theory claims that self-concept, or students’ perception of themselves within the STEM context, shapes their interest and outcomes associated with STEM (Rogers, 1957). Finally, social cognitive career theory suggests that employment aspiration, or students’ career-related goals, also significantly affects STEM interest and enrollment (Lent et al., 1994). The following sections will elaborate on the theories and factors that have been proposed.

Self-determination theory, a widely recognized psychological theory, emphasizes the significance of intrinsic motivation in sustaining interest and engagement in STEM subjects (Deci & Ryan, 1985). Intrinsic motivation, which refers to an individual's inherent drive and enjoyment of a task or activity, is crucial in nurturing interest and passion in STEM domains (Hidi & Renninger, 2006; Zaharin et al., 2019). Research has shown that intrinsic motivation has a positive impact on academic success, as it is strongly associated with students' interest in STEM subjects and careers (Maltese & Tai, 2011; Wang & Degol, 2013). Therefore, fostering intrinsic motivation among students is essential for promoting their engagement and success in STEM fields.

Bandura’s Social Cognitive Theory proposes that a person’s self-efficacy plays a significant role in determining the level of effort that an individual puts forth to succeed in STEM-related tasks (Bandura, 1977; Schunk & DiBenedetto, 2021). In this context, self-efficacy refers to the belief that students hold about their abilities to perform well in STEM tasks. This belief is known to strongly influence their engagement, effort, persistence, and achievement outcomes (Franks & Capraro, 2019; Honicke et al., 2020). Therefore, it is important for educators to focus on building students’ self-efficacy in STEM subjects, as it is a crucial factor in developing their interest and motivation to succeed.

Carl Rogers, the renowned psychologist, proposed the person-centered theory, which suggests that a student’s self-concept within the STEM context is critical in shaping their interest and outcomes associated with STEM. Self-concept refers to a student’s perception of their abilities and competence in STEM and is a significant predictor of their interest in STEM fields. Research studies have shown that there is a strong relationship between self-concept and STEM outcomes, and it is an essential factor in determining students’ success in STEM education. Although self-efficacy and self-concept are related, they are distinct concepts that contribute uniquely to STEM interests. While self-efficacy is concerned with a student's belief in their ability to complete a task successfully, self-concept refers to their general perception of themselves in a particular domain, such as STEM. It is vital to note that both self-efficacy and self-concept are critical in increasing students’ interest in STEM and can play a significant role in shaping their future careers.

The Social Cognitive Career Theory (SCCT) proposed by Lent et al. (1994) provides a framework for understanding how students’ employment aspirations directly influence their interest in STEM fields. This theory is essential in assessing students’ motivation, beliefs, and future aspirations within the STEM landscape. According to recent research, employment aspirations or career-related goals significantly shape students’ interest and enrollment in STEM fields (Ahmed & Mudrey, 2019; Luo et al., 2021). Notably, a student's employment aspiration is influenced by their interests, values, and available opportunities. Studies have shown that students with high employment aspirations tend to have greater motivation to pursue STEM professions (Maltese & Tai, 2010; Wang et al., 2017). Therefore, it is crucial to consider students’ career-related goals while designing STEM education programs to foster their interest and engagement in STEM fields. By doing so, we can help students develop a passion for STEM and prepare them for successful careers in the future.

Based on the insights from the above-mentioned theories, the study theorizes that STEM interest is inherently a multidimensional construct. The proposed framework acknowledges the interplay of intrinsic motivation, self-concept, self-efficacy, and employment aspiration in shaping students’ STEM interest. By integrating these theories, the study aims to develop and validate a robust and comprehensive instrument for measuring STEM interest among undergraduate students. This premise prompts the formulation of the following Research Questions (RQ).

RQ 1: Are intrinsic motivation, self-efficacy, self-concept, and employment aspirations dimensions of the proposed STEM Interest Scale?

RQ 2: To what extent does a multi-dimensional “STEM Interest” scale demonstrate reliability and validity in measuring STEM interest among undergraduate students?

Methodology

The current study followed well-established scale development methodologies from the literature (Churchill, 1979; Hinkin, 1995). The scale validation process consisted of three distinct stages.

Stage 1

First, we developed items to measure STEM interest through a literature review conducted with the PRISMA methodology (see Table S1). The items’ content validity was then established through a Delphi study. Invitations were extended to ten experts, of whom eight agreed to participate. These subject experts were professors specializing in STEM education. The experts were informed that the scale would measure the STEM interest of undergraduates and were provided with the definition of STEM interest. They were asked to critically assess each statement for comprehensiveness, wording, and relevance using a 3-point scale, as suggested by Zaichkowsky (1985). The draft items were presented to the experts, who individually scored each item as not representative (1), somewhat representative (2), or clearly representative (3), following the suggested guidelines (Lin & Hsieh, 2011).

Stage 2

The refinement stage employed exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) on the proposed scale. The data collected during phase I underwent EFA, while the data gathered in phase II were used for CFA and the assessment of nomological validity. This approach is consistent with the scale purification protocols established by Churchill (1979) and Hinkin (1995) and further supported by Schwarz (2011). Using a purposive sampling technique, we collected responses from undergraduates enrolled in STEM courses at University A (Table 2). Participants were thoroughly briefed on the study’s objectives and procedures, and their consent was obtained. Participation in the survey was voluntary, with confidentiality guaranteed for research purposes. The survey instrument was approved by the University A Institutional Review Board (XX-IRB) under approval number XX-IRB 1721-EA/22. The questionnaire comprised two sections: the first gathered demographic details such as gender, age, and ethnicity, while the second contained statements regarding STEM interest. The survey was administered in two phases via Google Forms.

Table 2 Demographic details of undergraduate participants considered for STEM Interest (N = 546)

The data were analyzed in SPSS version 29 for exploratory factor analysis with varimax rotation, and the extracted factors were then confirmed through confirmatory factor analysis in AMOS version 29 to ensure their robustness.

Stage 3

After the confirmation of the factors, we assessed the scale’s reliability and validity by using various criteria such as Cronbach’s alpha, composite reliability (CR), average variance extracted (AVE), common method bias, and nomological validity. The comprehensive evaluation ensures the scale's suitability for capturing STEM interest effectively.

Results

Stage 1: Item Generation and Selection

First, a systematic literature review was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology (Fig. 1), with data sourced exclusively from SCOPUS. Our search query was structured as follows: ((“STEM Interest” OR “Interest in STEM”) AND (“Engagement” OR “Motivation”) AND (“scale” OR “measurement” OR “instrument” OR “questionnaire” OR “survey”)). The search included only English-language articles where the search query was found in the title, abstract, or keywords. This review resulted in the selection of 30 articles to create STEM interest items. Following the deductive approach proposed by Hinkin (1995), an initial pool of 25 Likert-type items was generated for the proposed measure. These items were rated on a 5-point Likert scale ranging from 5 (strongly agree) to 1 (strongly disagree) (Nemoto & Beglar, 2014).

Fig. 1 PRISMA flowchart of the literature search and selection process for identifying studies on students' interest in STEM

During the preliminary screening, experts recommended removing and modifying certain items due to issues such as item overlap and ambiguous formulation. Only items rated clearly or somewhat representative were retained. Consequently, based on their feedback, some items were revised, and some were eliminated. This resulted in a set of 23 items out of the original 25.

The revised scale with 23 items was then resubmitted to the experts for further evaluation. They were asked to rate the items based on their relevancy, indicating whether they were not relevant, somewhat relevant, or highly relevant. Subsequently, the content validity ratio (CVR) was calculated for each item using the formula suggested by Veneziano and Hooper (1997) as given in Eq. (1).

$$\text{CVR}=\left(Ne-\frac{N}{2}\right)\div \left(\frac{N}{2}\right)$$
(1)

where Ne is the number of experts who rated the item highly relevant and N is the total number of experts participating in the content validity assessment. Three items with a CVR value below 0.75 were removed from the scale. As a result, the scale comprised 20 items, which proceeded to the next stage of scale refinement.
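As an illustration of Eq. (1), the following Python sketch computes the CVR for a panel of eight experts, the panel size reported in Stage 1. The function name, item labels, and rating counts are hypothetical; only the formula and the 0.75 cut-off come from the text.

```python
def content_validity_ratio(n_highly_relevant: int, n_experts: int) -> float:
    """CVR = (Ne - N/2) / (N/2), following Eq. (1)."""
    if n_experts <= 0:
        raise ValueError("n_experts must be positive")
    if not 0 <= n_highly_relevant <= n_experts:
        raise ValueError("n_highly_relevant must lie between 0 and n_experts")
    half = n_experts / 2
    return (n_highly_relevant - half) / half


N_EXPERTS = 8   # experts who took part in the Delphi study
CUTOFF = 0.75   # items with a CVR below this value are dropped

# Hypothetical counts of "highly relevant" ratings per item (not study data).
example_counts = {"item_a": 8, "item_b": 7, "item_c": 6}

cvr_by_item = {
    item: content_validity_ratio(count, N_EXPERTS)
    for item, count in example_counts.items()
}
retained = [item for item, cvr in cvr_by_item.items() if cvr >= CUTOFF]

for item, cvr in cvr_by_item.items():
    print(f"{item}: CVR = {cvr:.2f}")
print("retained:", retained)
```

With eight experts, an item needs at least seven "highly relevant" ratings to reach a CVR of 0.75.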

Stage 2: Refinement of Scale

In phase I, 800 questionnaires were disseminated, yielding 297 responses. Following scrutiny for seriousness and completeness, 17 questionnaires were excluded, resulting in 280 valid responses and a response rate of 35%. Phase II of data collection involved the distribution of 500 questionnaires, with 275 returned. After excluding nine questionnaires with non-serious responses, 266 responses were considered suitable, giving a response rate of 53.2%. Adhering to the recommendations of Hair et al. (2010), our dataset maintained a 3:1 ratio of cases to variables, ensuring its appropriateness for factor analysis.
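The response-rate arithmetic above can be reproduced with a short Python sketch. It assumes that the response rate is defined as valid responses divided by questionnaires distributed; the dictionary layout and function name are illustrative only.

```python
def response_rate(valid: int, distributed: int) -> float:
    """Valid responses as a percentage of questionnaires distributed."""
    return 100 * valid / distributed


# Counts reported for the two data collection phases.
phases = {
    "phase I": {"distributed": 800, "returned": 297, "excluded": 17},
    "phase II": {"distributed": 500, "returned": 275, "excluded": 9},
}

summary = {}
for name, counts in phases.items():
    valid = counts["returned"] - counts["excluded"]
    summary[name] = {
        "valid": valid,
        "response_rate": response_rate(valid, counts["distributed"]),
    }
    print(f"{name}: {valid} valid, {summary[name]['response_rate']:.1f}% response rate")

# The valid responses of both phases together form the overall sample.
total_valid = sum(entry["valid"] for entry in summary.values())
print("total valid responses:", total_valid)
```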

Exploratory Factor Analysis (EFA)

The data collected on the 20-item scale underwent factor analysis. Before conducting EFA, the appropriateness of the data was evaluated using the Kaiser–Meyer–Olkin (KMO) coefficient and Bartlett’s sphericity test. The KMO value exceeded the specified threshold, indicating sampling adequacy (Table 3). Employing Varimax rotation and principal components analysis, the exploratory factor analysis achieved convergence after 25 iterations, explaining 55.7% of the total variance. Four factors with eigenvalues exceeding 1.0 were identified, as detailed in Table 4. Following the established criteria, items with factor loadings below 0.40 were eliminated, resulting in the removal of five items to ensure scale robustness. The overall communalities were moderate to strong, in line with previous research. The study maintained a four-factor solution, categorizing statements under the dimensions of intrinsic motivation, self-concept, self-efficacy, and employment aspiration.
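The two retention rules described above (eigenvalues exceeding 1.0 and loadings of at least 0.40) can be sketched in Python as follows. The eigenvalues and loadings are hypothetical placeholders, not the values behind Table 4, and the sketch only applies the decision rules; it does not perform the factor extraction or rotation, which were carried out in SPSS.

```python
N_ITEMS = 20           # items entering the EFA
LOADING_CUTOFF = 0.40  # minimum loading for an item to be kept

# Hypothetical leading eigenvalues of the item correlation matrix.
eigenvalues = [4.8, 2.5, 1.7, 1.2, 0.9, 0.8]

# Rule 1: retain factors whose eigenvalue exceeds 1.0.
retained_factors = [ev for ev in eigenvalues if ev > 1.0]

# A correlation matrix has a trace equal to the number of items, so each
# eigenvalue divided by N_ITEMS is that factor's share of total variance.
variance_explained = sum(retained_factors) / N_ITEMS

# Hypothetical rotated loadings: item -> loading on each retained factor.
loadings = {
    "item_01": [0.71, 0.12, 0.08, 0.05],
    "item_02": [0.15, 0.66, 0.10, 0.09],
    "item_03": [0.22, 0.31, 0.28, 0.19],  # no loading reaches the cut-off
    "item_04": [0.05, 0.11, 0.09, 0.58],
}

# Rule 2: keep an item only if its largest absolute loading meets the cut-off.
kept_items = [
    item
    for item, row in loadings.items()
    if max(abs(value) for value in row) >= LOADING_CUTOFF
]

print("factors retained:", len(retained_factors))
print(f"variance explained: {variance_explained:.1%}")
print("items kept:", kept_items)
```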

Table 3 Kaiser–Meyer–Olkin test for sample adequacy of STEM Interest Scale for undergraduates
Table 4 Summary of items, factor loading, and factors that emerged in exploratory factor analysis of the STEM Interest Scale (SIS)

Confirmatory Factor Analysis (CFA)

CFA was performed on the STEM Interest Scale using data collected during phase II. Initially, the model fit appeared satisfactory, but upon closer examination, one item (EA03) was deemed unsuitable because of its low factor loading (0.37); it was therefore removed, a decision supported by input from a subject expert. A second CFA conducted without EA03 (Model II) showed a marked decrease in the chi-square value, from 212.88 in the initial model to 173.00. This reinforces the validity of our factor structure, which demonstrated a good model fit, as indicated in Table 5 (CFA default model II). The values of RMR, GFI, AGFI, CFI, and RMSEA met the threshold criteria suggested by Fornell and Larcker (1981), thereby reinforcing the validity of the theoretical model. Schumacker and Lomax (2016) suggest that when most fit indices meet or exceed these thresholds, it is reasonable to infer that the data support the theoretical model. Therefore, the standard fit indices confirm the validity of the factor structure of the model.

Table 5 Model fit indices of confirmatory factor analysis for STEM Interest Scale

Stage 3: Scale Evaluation

Multidimensionality and Reliability

The application of CFA in the previous step confirmed the multi-dimensionality of the construct, providing a favorable indication to proceed with the reliability analysis of the scale. Composite reliability (CR) and Cronbach’s alpha coefficient were used to examine internal consistency. Across the sub-dimensions of the scale, Cronbach’s alpha values ranged from 0.680 to 0.758 (refer to Table 6). Moreover, the STEM Interest Scale displayed a composite reliability of CR = 0.803, further supporting the scale's internal consistency (Cronbach, 1951). Therefore, it can be inferred that the STEM Interest Scale demonstrates a significant level of convergence. The current scale exhibits robust construct reliability, consistent with the criteria delineated by Fornell and Larcker (1981).
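For readers who want to see how Cronbach’s alpha is obtained, a minimal Python sketch follows. It uses the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), with sample variances. The response matrix is hypothetical and is not drawn from the study data, which were analyzed in SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of item scores."""
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("at least two items are required")
    # Transpose so that each tuple holds all responses to one item.
    item_columns = list(zip(*responses))
    item_variances = [variance(column) for column in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert responses: 5 respondents x 3 items.
example_responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]

alpha = cronbach_alpha(example_responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```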

Table 6 Reliability statistics for dimensions of STEM Interest Scale

Construct Validity of STEM Interest Scale

The construct validity of the present scale was assessed with average variance extracted (AVE) and composite reliability (CR), as suggested by Fornell and Larcker (1981). The AVE was found to be 0.506, and the composite reliability of the scale was found to be 0.803, which meets the threshold value. The present scale ensures convergent validity as composite reliability (CR) > average variance extracted (AVE) > 0.5.
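The AVE and CR computations can be sketched in Python using the usual Fornell and Larcker expressions for standardized loadings: AVE is the mean of the squared loadings, and CR is the squared sum of loadings divided by that quantity plus the summed error variances (1 minus each squared loading). The loadings below are hypothetical and chosen only for illustration; the function names are likewise assumptions.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(value ** 2 for value in loadings) / len(loadings)


def composite_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - value ** 2 for value in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical standardized loadings for one construct (not study data).
example_loadings = [0.72, 0.68, 0.75, 0.70]

ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)

# Convergent validity rule used in the text: CR > AVE > 0.5.
convergent = cr > ave > 0.5

print(f"AVE = {ave:.3f}, CR = {cr:.3f}, convergent validity: {convergent}")
```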

Common Method Bias

Harman’s single-factor test was employed to assess common method variance, as recommended by Aulakh and Gencturk (2000). The exploratory factor analysis (EFA) outcomes revealed that none of the four factors accounted for more than 50% of the variance. Consequently, the study did not exhibit common method bias, as per the suggestion of Peng et al. (2006).
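One common way to operationalize Harman’s single-factor check is to ask whether the first unrotated component of the item correlation matrix accounts for more than 50% of the total variance. The Python sketch below does this with a simple power iteration, which is an implementation choice made here to stay within the standard library and is not the procedure used in the study. The correlation matrix is hypothetical.

```python
from math import sqrt


def largest_eigenvalue(matrix, iterations=200):
    """Approximate the dominant eigenvalue of a symmetric matrix by power iteration."""
    size = len(matrix)
    vector = [1.0] * size
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)
        ]
        norm = sqrt(sum(value * value for value in product))
        vector = [value / norm for value in product]
    # Rayleigh quotient of the normalized vector gives the eigenvalue estimate.
    product = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
    return sum(v * p for v, p in zip(vector, product))


# Hypothetical correlation matrix for four items (not study data).
example_corr = [
    [1.00, 0.30, 0.20, 0.15],
    [0.30, 1.00, 0.25, 0.10],
    [0.20, 0.25, 1.00, 0.20],
    [0.15, 0.10, 0.20, 1.00],
]

# Share of total variance carried by the first unrotated component.
first_factor_share = largest_eigenvalue(example_corr) / len(example_corr)
exceeds_threshold = first_factor_share > 0.50

print(f"first factor share = {first_factor_share:.1%}")
print("exceeds 50% threshold:", exceeds_threshold)
```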

Nomological Validity

In the present research, the nomological validity of the proposed scale was established with the ‘intention to drop out’ (Fig. 2). We employed a two-item scale adapted from Xu and Webber (2018) to measure the intention to drop out. The data were collected in phase II using a five-point Likert scale to confirm the strength of the nomological network. The collected data were analyzed using AMOS 29. The findings indicate a negative relationship between STEM Interest and intention to drop out (Fig. 2). The structural model illustrates the relationships between the dimensions of STEM interest and their impact on the intent to drop out of STEM courses. The path coefficients indicate the strength and direction of these relationships.

The existing body of literature anticipates diverse consequences of STEM interest, including STEM stereotypes (Luo & So, 2023; Luo et al., 2021), STEM learning experience (Chachashvili-Bolotin et al., 2016), and intent to drop out (van den Hurk et al., 2019). The negative relationship observed here is therefore consistent with the literature and supports the nomological validity of the scale.

Fig. 2 Structural Equation Model to assess the nomological validity of STEM Interest with Intent to Dropout

The psychometric characteristics of the constructs forming the nomological network were assessed, revealing reliability coefficients of 0.798 for STEM Interest and 0.711 for intent to drop out. This model (Fig. 2) was evaluated using average variance extracted (AVE), composite reliability (CR), and discriminant validity following the guidelines of Fornell and Larcker (1981). Table 7 displays the AVE, CR, maximum shared squared variance (MSV), and average shared squared variance (ASV) for both the intent to drop out and STEM Interest constructs. The results show that CR > AVE > 0.5, indicating a high level of convergent validity within the model. To ensure discriminant validity, the Heterotrait-Monotrait ratio of correlations (HTMT) criterion (displayed in Table 8) was applied, with values below 0.90 considered acceptable (Henseler et al., 2015). Structural equation modeling revealed a favorable fit for the model (RMSEA = 0.046, GFI = 0.949, AGFI = 0.932, CFI = 0.941, χ2 = 242.67, χ2/df = 2.129). No adjustments were made to the model because its relationships were predetermined. The standardized estimate for the path STEM Interest → Intent to drop out was −0.17, significant at the 1% level, indicating a negative relationship. It can be concluded that interest in STEM predicts persistence in the field, as measured through intent to drop out, thus supporting the validity of the scale.
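The HTMT criterion can be illustrated with a short Python sketch. Following Henseler et al. (2015), the ratio divides the mean correlation between items of different constructs by the geometric mean of the average within-construct item correlations. The item correlation matrix and index assignments below are hypothetical; taking absolute correlations is a common implementation choice and is an assumption here.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs.

    corr is an item-by-item correlation matrix; items_a and items_b hold the
    row/column indices of each construct's items (at least two per construct).
    """
    heterotrait = mean(abs(corr[i][j]) for i in items_a for j in items_b)
    monotrait_a = mean(abs(corr[i][j]) for i, j in combinations(items_a, 2))
    monotrait_b = mean(abs(corr[i][j]) for i, j in combinations(items_b, 2))
    return heterotrait / sqrt(monotrait_a * monotrait_b)


# Hypothetical correlations: items 0-2 belong to one construct and items 3-4
# to a second, two-item construct (not study data).
example_corr = [
    [1.0, 0.5, 0.5, 0.2, 0.2],
    [0.5, 1.0, 0.5, 0.2, 0.2],
    [0.5, 0.5, 1.0, 0.2, 0.2],
    [0.2, 0.2, 0.2, 1.0, 0.5],
    [0.2, 0.2, 0.2, 0.5, 1.0],
]

ratio = htmt(example_corr, items_a=[0, 1, 2], items_b=[3, 4])
discriminant_ok = ratio < 0.90  # acceptance threshold cited in the text

print(f"HTMT = {ratio:.2f}, below 0.90: {discriminant_ok}")
```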

Table 7 Summary of convergent validity of STEM Interest Scale and intention to dropout
Table 8 Heterotrait-Monotrait ratio to assess discriminant validity of STEM Interest Scale

Discussion

The discourse on STEM interest within scholarly literature has predominantly focused on career-related or content-specific aspects, often viewing interest through an outcome lens, such as career prospects. However, existing scales purportedly measuring STEM interest lack robust support for construct validity, with their psychometric properties typically tested solely through exploratory factor analysis (EFA). For instance, the Bitara-STEM scale (Mohd Shahali et al., 2019) and the STEM Semantics Survey (Tyler-Wood et al., 2010) both employed EFA and primarily assessed learners’ interest in STEM-related careers or perceptions of STEM disciplines among middle school students.

Addressing these limitations, the present research develops and validates a comprehensive instrument to measure STEM interest by adopting an established scale development procedure (Hinkin, 2005). Through a two-phase process involving exploratory and confirmatory factor analyses (Churchill, 1979; Hinkin, 1995), the resulting STEM Interest Scale (SIS) emerged as a multi-dimensional construct consisting of four distinct psychological factors: intrinsic motivation, self-efficacy, self-concept, and employment aspiration. These factors collectively accounted for a significant portion of the total variance and demonstrated satisfactory reliability and validity.

Aligned with theoretical frameworks such as self-determination theory and social cognitive career theory, the SIS captures the various dimensions that influence students’ interest and engagement in STEM disciplines (Bandura, 1986; Hidi & Renninger, 2006; Lent et al., 1994). Consequently, it offers a nuanced and meaningful assessment of learners' level of interest in STEM disciplines, providing valuable insights for both researchers and practitioners. For researchers, the SIS serves as a valuable tool to investigate the antecedents, consequences, and correlates of STEM interest, aiding in the identification of intervention opportunities to bolster STEM engagement among students.

The proposed scale has both theoretical and practical implications as it provides valuable insights into students' motivations, self-concept, efficacy, and aspirations within STEM fields. This helps policymakers, educators, and counselors design and deliver effective curricula, STEM education programs, and interventions. However, its effectiveness can only be determined through a longitudinal study. The SIS facilitates a deeper understanding of students' interest in STEM fields by elucidating the nomological network between STEM interest and intent to drop out. Ultimately, it supports educational institutions in improving participation and success rates in STEM education.

Conclusion

The development and validation of the STEM Interest Scale (SIS) have shown that STEM interest is a multifaceted psychological construct. Its multidimensional nature provides valuable insights beyond traditional outcome-oriented perspectives, such as career interests. By exploring dimensions such as intrinsic motivation, self-efficacy, self-concept, and employment aspiration, this scale offers a nuanced understanding of STEM interest, aligning with contemporary theories and frameworks. Its validity and reliability have been confirmed through rigorous psychometric analyses, including both exploratory and confirmatory factor analyses, distinguishing it from other instruments in the field.

Furthermore, the study’s sample predominantly consists of expatriates, which enhances the research by incorporating a wide range of nationalities. This heterogeneity extends the scale's generalizability, allowing for its application across various geographic and cultural contexts with minimal modifications. The STEM Interest scale plays a key role in encouraging wider participation and sustained commitment in STEM fields, which is essential for developing a skilled and diverse workforce in these critical sectors.