Introduction

Worldwide, STEM education, which integrates the disciplines of science, technology, engineering, and math, is gaining popularity in K-12 settings due to its capacity to enhance 21st-century skills such as adaptability, problem-solving, and creative thinking (National Research Council [NRC], 2015). In STEM lessons, students are frequently guided by the engineering design process, which involves identifying problems or technical challenges and creating and developing solutions. Furthermore, higher achievement in STEM education has been linked to increased enrolment in post-secondary STEM fields, offering students greater opportunities to pursue careers in these domains (Merrill & Daugherty, 2010). However, STEM activities require dedicated time and the restructuring of integrated curricula, necessitating careful organization of lessons. Recognizing the complexity of developing 21st-century STEM proficiency, schools are not expected to tackle this challenge alone. In addition to regular STEM classes, there exists a diverse range of extended education programs, activities, and out-of-school learning environments (Baran et al., 2016; Kalkan & Eroglu, 2017; Schweingruber et al., 2014). In this paper, out-of-school learning environments, informal learning environments, extended education, and afterschool programs were used synonymously. It is worth noting that the literature lacks a universally accepted definition for out-of-school learning environments, leading to the use of various interchangeable terms (Donnelly et al., 2019). Some of these terms include informal learning environments, extended education, afterschool programs, all-day school, extracurricular activities, out-of-school time learning, extended schools, expanded learning, and leisure-time activities. These terms refer to optional programs and clubs offered by schools that exist outside of the standard academic curriculum (Baran et al., 2016; Cooper, 2011; Kalkan & Eroglu, 2017; Schweingruber et al., 2014).

Out-of-school learning, in contrast to traditional in-school learning, offers greater flexibility in terms of time and space, as it is not bound by the constraints of the school schedule, national or state standards, and standardized tests (Cooper, 2011). Out-of-school learning experiences typically involve collaborative engagement, the use of tools, and immersion in authentic environments, while school environments often emphasize individual performance, independent thinking, symbolic representations, and the acquisition of generalized skills and knowledge (Resnick, 1987). They encompass everyday activities such as family discussions, pursuing hobbies, and engaging in daily conversations, as well as designed environments like museums, science centres, and afterschool programs (Civil, 2007; Hein, 2009). On the other hand, extended education refers to intentionally structured learning and development programs and activities that are not part of regular classes. These programs are typically offered before and after school, as well as at locations outside the school (Bae, 2018). As a result, out-of-school learning environments encompass a wide range of experiences, including social, cultural, and technical excursions around the school, field studies at museums, zoos, nature centres, aquariums, and planetariums, project-based learning, sports activities, nature training, and club activities (Civil, 2007; Donnelly et al., 2019; Hein, 2009). At this point, STEM clubs are a specialized type of extracurricular activity that engage students in hands-on projects, experiments, and learning experiences related to scientific, technological, engineering, and mathematical disciplines. STEM Clubs, described as flexible learning environments unconstrained by time or location, offer an effective approach to conducting STEM studies outside of school (Blanchard et al., 2017; Cooper, 2011; Dabney et al., 2012).

Out-of-school learning environments, extended education or afterschool programs, hold tremendous potential for enhancing student learning and providing them with a diverse and enriching educational experience (Robelen, 2011). Extensive research supports the notion that these alternative educational programs not only contribute to students' academic growth but also foster their social, emotional, and intellectual development (NRC, 2015). Studies have consistently shown that after-school programs play a vital role in boosting students' achievement levels (Casing & Casing, 2024; Pastchal-Temple, 2012; Shernoff & Vandell, 2007), and contributing to positive emotional development, including improved self-esteem, positive attitudes, and enhanced social behaviour (Afterschool Alliance, 2015; Durlak & Weissberg, 2007; Lauer et al., 2006; Little et al., 2008). Moreover, engaging in various activities within these programs allows students to develop meaningful connections, expand their social networks, enhance leadership skills (Lipscomb et al., 2017), and cultivate cooperation, effective communication, and innovative problem-solving abilities (Mahoney et al., 2007).

Implementing STEM activities in out-of-school learning environments not only supports students in making career choices and fostering meaningful learning and interest in science, but also facilitates deep learning experiences (Bybee, 2001; Dabney et al., 2012; Sahin et al., 2018). Furthermore, STEM Clubs enhance students' emotional skills, such as a sense of belonging and peer-to-peer communication, while also fostering 21st-century skills, facilitating the acquisition of current content, and promoting career awareness and interest in STEM professions (Blanchard et al., 2017). In summary, engaging in STEM activities through social club activities not only addresses time constraints but also complements formal education and contributes to students' overall development. Hence, STEM Clubs, which are part of extended education, can be defined as dynamic and flexible learning environments that provide an effective approach to conducting STEM studies beyond traditional classroom settings. These clubs offer flexibility in terms of time and location, with intentionally structured programs and activities that take place outside of regular classes. They provide students with unique opportunities to explore and deepen their understanding of STEM subjects through collaborative engagement, hands-on use of tools, and immersive experiences in authentic environments (Bae, 2018; Blanchard et al., 2017; Bybee, 2001; Cooper, 2011; Dabney et al., 2012). STEM Clubs have gained immense popularity worldwide, providing students with invaluable opportunities to explore and cultivate their interests and knowledge in these crucial fields (Adams et al., 2014; Bell et al., 2009). According to America After 3PM, nearly 75% of afterschool program participants (5,740,836 children) have access to STEM learning opportunities (Afterschool Alliance, 2015).

STEM Clubs as after-school programs come in various forms and provide diverse tutoring and instructional opportunities. For instance, the Boys and Girls Club of America (BGCA) operates in numerous cities across the United States, annually serving 4.73 million students (Boys and Girls Club of America, 2019). This program offers students the chance to engage in activities such as sports, art, dance, and field trips, and it addresses the underrepresentation of African Americans in STEM. Another example is the Science Club for Girls (SCFG), established by concerned parents in Cambridge to address gender inequity in math, science, and technology courses and careers. SCFG brings together girls from grades K–7 through free after-school or weekend clubs, science explorations during vacations, and community science fairs, with approximately 800 to 1,000 students participating each year. The primary goal of these clubs is to increase STEM literacy and self-confidence among K–12 girls from underrepresented groups in these fields. More examples can be found in the literature, such as the St. Jude STEM Club (SJSC), where students conducted a 10-week paediatric cancer research project using accurate data (Ayers et al., 2020), and After School Matters, based in Chicago, which offers project-based learning that enhances students' soft skills and culminates in a final project based on their activities (Hirsch, 2011).

The Purpose of The Study

The literature on STEM Clubs indicates a diverse range of such clubs located worldwide, catering to different student groups, operating on varying schedules, implementing diverse activities, and employing various strategies, methodologies, experiments, and assessments (Ayers et al., 2020; Blanchard et al., 2017; Boys and Girls Club of America, 2019; Hirsch, 2011; Sahin et al., 2018). However, it was previously unknown which specific sample groups were most commonly studied, which analytical methods were used frequently, and which results were primarily reported, even though the overall topic of STEM Clubs has gained significant attention. Therefore, organizing and categorizing this expansive body of literature is necessary to gain deeper insights into the current state of knowledge and practices in STEM Clubs. By systematically reviewing and synthesizing the diverse range of studies on this topic, we can develop a clearer understanding of the focus areas, methodologies, and key findings that have emerged from the existing research (Fraenkel et al., 2012). At this point, using a content analysis method is appropriate for this purpose because this method is particularly useful for examining trends and patterns in documents (Stemler, 2000). Similarly, some previous research on STEM education has conducted content analyses to examine existing studies and construct holistic patterns to understand trends (Bozkurt et al., 2019; Chomphuphra et al., 2019; Irwanto et al., 2022; Li et al., 2020; Lin et al., 2019; Martín-Páez et al., 2019; Noris et al., 2023). However, there is a lack of content analysis specifically focused on studies of STEM Clubs in the literature and showing the trends in this topic. Analysing research trends in STEM Clubs can help build upon existing knowledge, identify gaps, explore emerging topics, and highlight successful methodologies and strategies (Fraenkel et al., 2012; Noris et al., 2023; Stemler, 2000). 
This information can be valuable for researchers, educators, and policymakers to stay up-to-date and make informed decisions regarding curriculum design (Bozkurt et al., 2019; Chomphuphra et al., 2019; Irwanto et al., 2022; Li et al., 2020; Lin et al., 2019; Martín-Páez et al., 2019; Noris et al., 2023), the development of effective STEM Club programs, resource allocation, and policy formulation (Blanchard et al., 2017; Cooper, 2011; Dabney et al., 2012). Therefore, the identification of research trends in STEM Clubs was the aim of this study.

To identify research trends, studies commonly analysed documents by considering the dimensions of articles such as keywords, publishing years, research designs, purposes, sample levels, sample sizes, data collection tools, data analysis methods, and findings (Bozkurt et al., 2019; Chomphuphra et al., 2019; Irwanto et al., 2022; Li et al., 2020; Sozbilir et al., 2012). Using these dimensions as a framework is a useful and common approach in content analysis because this framework allows researchers to systematically examine the key aspects of existing studies and uncover patterns, relationships, and trends within the research data (Sozbilir et al., 2012). Hence, since the aim of this study is to identify and analyse research trends in STEM Clubs, it focused on publishing years, keywords, research designs, purposes, sample levels, sample sizes, data collection tools, data analysis methods, and findings of the studies on STEM Clubs.

In conclusion, the main research problem of this study is “What are the characteristics of the studies on STEM Clubs?” The following sub-questions are addressed in this study:

  1. What is the distribution of studies on STEM Clubs by year?
  2. What are the frequently used keywords in studies on STEM Clubs?
  3. What are the commonly employed research designs in studies on STEM Clubs?
  4. What are the typical purposes explored in studies on STEM Clubs?
  5. What are the commonly observed sample levels in studies on STEM Clubs?
  6. What are the commonly observed sample sizes in studies on STEM Clubs?
  7. What are the commonly utilized data collection tools in studies on STEM Clubs?
  8. What are the commonly utilized data analysis methods in studies on STEM Clubs?
  9. What are the typical durations reported in studies on STEM Clubs?
  10. What are the commonly reported findings in studies on STEM Clubs?

Method

In this study, the descriptive content analysis research method was employed, which allows for a systematic and objective examination of the content within articles and a description of the general trends and research results in a particular subject matter (Lin et al., 2014; Suri & Clarke, 2009; Sozbilir et al., 2012; Stemler, 2000). Given the aim of examining research trends in STEM Clubs, this method was appropriate, as it provides a structured approach to identifying patterns and trends (Gay et al., 2012). To implement the content analysis method, this study followed the three main phases proposed by Elo and Kyngäs (2008): preparation, organizing, and reporting. In the preparation phase, the unit of analysis, such as a word or theme, is selected as the starting point; in this study, the topic of STEM Clubs was selected. During the organizing phase, the researcher strives to make sense of the data, to learn "what is going on," and to obtain a sense of the whole; accordingly, the content analysis framework (sample levels, sample sizes, data collection tools, research designs, etc.) was used to interrogate the collected studies. Finally, in the reporting phase, the analyses are presented in a meaningful and coherent manner; here, they were presented with visual representations such as tables and graphs. By adopting the content analysis research method and following the suggested phases, this study aimed to gain insights into research trends in STEM Clubs, identify recurring themes, and provide a comprehensive analysis of the collected data.

Search and Selection Process

The online databases ERIC and Web of Science were searched using keywords derived from a database thesaurus. These databases were chosen because they are widely recognized and respected in the fields of education and academic research and offer a substantial amount of high-quality, peer-reviewed literature. The search process involved several steps. Firstly, titles, abstracts, and keywords were searched using Boolean operators for the keywords "STEM Clubs," "STEAM Clubs," "science-technology-engineering-mathematics clubs," "after school STEM program" and "extracurricular STEM activities" in the databases (criterion-1). Secondly, the search was carried out from November to the end of December 2023, so studies published up to the end of December 2023 were included, without a specific starting date restriction (criterion-2). Thirdly, the search was limited to scientific journal articles, book chapters, proceedings, and theses, excluding publications such as practices, letters to editors, corrections, and (guest) editorials (criterion-3). Fourthly, studies published in languages other than English were excluded (criterion-4). Fifthly, duplicate articles found in both databases were identified and removed. Next, the author read the contents of all the studies, including those without full articles, with a particular focus on the abstract sections. After that, because the search terms “extracurricular STEM activities” and “after school STEM program” also returned studies on after-school programs or extracurricular activities that were not related to STEM, studies that did not specifically involve the terms STEM or clubs were excluded (criterion-5).
Additionally, studies conducted in formal and informal settings within STEM clubs were included, while studies conducted in settings such as museums or trips were excluded (criterion-6). Although STEM Clubs are a subset of informal STEM education settings, which also include museums and field trips, the focus of this study is on trends specifically related to STEM Clubs. Moreover, studies focusing solely on technology without incorporating other STEM components were also excluded (criterion-7). Finally, 56 publications that met the inclusion and exclusion criteria were identified. These publications comprised two dissertations, seven proceedings, and 47 articles from 36 different journals. By applying these criteria, the search process aimed to ensure the inclusion of relevant studies while excluding those that did not meet the specified criteria, as shown in Fig. 1.
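The first and fifth steps of the search (combining the search terms with Boolean operators and removing duplicates found in both databases) can be illustrated with a minimal Python sketch. The query syntax is generic rather than the actual ERIC or Web of Science syntax, the record fields are hypothetical, and matching duplicates by normalized title is an assumption; the study does not report how duplicates were detected.

```python
import re

# Search terms quoted from criterion-1 of the study.
SEARCH_TERMS = [
    "STEM Clubs",
    "STEAM Clubs",
    "science-technology-engineering-mathematics clubs",
    "after school STEM program",
    "extracurricular STEM activities",
]


def build_query(terms):
    """Join exact-phrase terms with OR (generic syntax, not database-specific)."""
    return " OR ".join(f'"{term}"' for term in terms)


def normalize_title(title):
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalized title (an assumed rule)."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records standing in for exports from the two databases.
records = [
    {"title": "A STEM Club Case Study", "source": "ERIC"},
    {"title": "A STEM club case study.", "source": "Web of Science"},
    {"title": "Robotics in After-School Programs", "source": "ERIC"},
]

query = build_query(SEARCH_TERMS)
unique_records = deduplicate(records)
```

In practice, title matching alone can miss duplicates with differing titles or merge distinct works that share one, so a manual check of the flagged pairs would still be needed.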

Fig. 1 Flowchart of the article selection process

Data Analysing Process

Two different approaches were followed in the content analysis process of this study. In the first part, deductive content analysis was used, and a priori coding was conducted, as the categories were established prior to the analysis. The categorization matrix was created based on the Paper Classification Form (PCF) developed by Sozbilir et al. (2012). The coding scheme consisted of eight classification groups for the sections of publication years, keywords, research designs, sample levels, sample sizes, data collection tools, data analysis methods, and durations, with sub-categories for each section. For example, under the research designs section, the sub-categories included qualitative and quantitative methods, case study, design-case study, comparative-case study, ethnographic study, phenomenological study, survey study, experimental study, mixed and longitudinal study, and literature review study. These sub-categories were identified prior to the analysis. Coding was then applied to the data in Excel spreadsheets, based on the categorization matrix. Frequencies for the codes and categories were calculated and presented in tables in the findings section. Line charts were used for the publication years section, while word clouds, which visually represent word frequency, were used for the keywords section. Word clouds display the most frequently used words in different sizes and colours based on their frequencies (DePaolo & Wilkinson, 2014). In this part, the coding was straightforward, since most studies reported the relevant information explicitly.

In the second part, open coding, category creation, and abstraction were applied to the purposes and findings sections. Firstly, the stated purposes and findings of the studies were written out as text. The text was then carefully reviewed, and the necessary terms were written in the margins to describe all aspects of the content. Following this open coding, the lists of categories were grouped under higher-order headings according to their similarities and dissimilarities. Each category was named using content-characteristic words. The abstraction process was repeated to the extent that was reasonable and possible. In this coding process, two individuals independently reviewed ten studies, applying the coding scheme for the first part and conducting open coding for the second part. They then compared their notes and resolved any differences that emerged in their initial checklists. Inter-rater reliability was calculated as 0.84 using Cohen's kappa. Once coding reliability was ensured, the remaining articles were coded by the author. After the coding process was completed, any disagreements among the researchers about the codes, including the codes and categories constructed for the purposes and findings sections, were resolved through discussion until consensus was reached. Agreement was generally high because the studies had clearly stated their key characteristics, such as research design, sample size, sample level, and data collection tools. Additionally, when coding the studies' stated purposes and results, the researchers closely referred to the original sentences in the studies, which led to a high level of consistency in the coded content between the two raters.
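For readers unfamiliar with the statistic, Cohen's kappa corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following Python sketch shows the calculation for ten jointly coded studies. The codes assigned below are hypothetical and are not the study's data, so the resulting value is illustrative and is not meant to reproduce the reported 0.84.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one nominal code per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per code.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical code throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical research-design codes for ten studies (not the study's data).
rater_a = ["case", "case", "survey", "mixed", "case",
           "experimental", "case", "mixed", "survey", "case"]
rater_b = ["case", "case", "survey", "mixed", "case",
           "experimental", "mixed", "mixed", "survey", "case"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

Here the raters agree on nine of ten studies (p_o = 0.90) and the chance agreement is p_e = 0.31, so kappa is 0.59 / 0.69.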

Results

Years

Studies related to STEM Clubs were first conducted in 2009 (Fig. 2). The number of studies conducted each year has increased noticeably since then. It can be seen that the majority of the 47 articles that were examined (56 articles) were published after 2015, despite a decrease in the year 2018. Additionally, the articles were most frequently published (8) in the years 2019 and 2022 and least frequently (1) in the years 2009, 2010, and 2014, and there were no publications in 2012.

Fig. 2 Number of articles by years

Keywords

A word cloud was used to present the most frequently used keywords in the articles, as shown in Fig. 3. However, because keywords were not reported in the ERIC database, only 30 articles were included in this analysis. The most frequently appearing keywords were "STEM," "education," and "learning." Additionally, using content analysis, the keywords were categorized into six groups, shown in Table 1: disciplines, technological concepts, academic community, learning experiences, core elements of education, and psychosocial factors (variables).
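The word frequencies that determine the size of each word in a word cloud can be tallied as in the following Python sketch. The keyword lists are hypothetical, and splitting multi-word keywords into single lowercase words is an assumption about how the cloud was built; the study does not describe its preprocessing.

```python
from collections import Counter


def keyword_word_frequencies(keyword_lists):
    """Count single words across the author keywords of all articles."""
    counts = Counter()
    for keywords in keyword_lists:
        for keyword in keywords:
            # Assumed preprocessing: lowercase, then split on whitespace.
            counts.update(keyword.lower().split())
    return counts


# Hypothetical author keywords for three articles (not the study's data).
articles = [
    ["STEM education", "informal learning"],
    ["STEM", "after-school programs", "learning"],
    ["science education", "STEM clubs"],
]

frequencies = keyword_word_frequencies(articles)
top_words = frequencies.most_common(3)
```

A word cloud library would then scale each word by its count; the counting step itself needs only the standard library.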

Fig. 3 Word cloud of the keywords used in articles

Table 1 The content analysis of keywords used in articles

Purposes

The purposes of the identified studies were classified into six main themes: “effects of participation in STEM Clubs on” (25), “evolution of a sample program for STEM Clubs and its implementation” (25), “examination of” (11), “identification of” (3), “comparison of in-school and out-of-school STEM experiences” (2), and “others” (6). Table 2 presents the distribution of the articles’ purposes across these themes. The purposes “effects of participation in STEM Clubs on” and “evolution of a sample program for STEM Clubs and its implementation” received the most, and equal, attention, while “identification of” (3) and “comparison of in-school and out-of-school STEM experiences” (2) received the least.

Table 2 The distribution of the purposes in articles

Within the theme of "effects of participation in STEM Clubs on" there are 11 categories. The aims of the studies in this section are to examine the effect of participation in STEM Clubs on various aspects such as attitudes towards STEM disciplines or career paths, STEM major choice/career aspiration, achievement in math, science, STEM disciplines, or content knowledge, perception of scientists, strategies used, value of clubs, STEM career paths, enjoyment of physics, use of complex and scientific language, interest in STEM, creativity, critical thinking about STEM texts, images of mathematics, or climate-change beliefs/literacy. It is evident that the majority of research in this section focuses on the effects of participation in STEM Clubs on STEM major choice/career aspiration (5), achievement (4), perception of something (4), and interest in STEM (3).

Within the theme of "evolution of a sample program for STEM Clubs and its implementation" there are three categories: development of program/curriculum/activity (14), identification of program's challenges and limitations (3), and implementation of program/activity (8). The studies in this section aim to develop a sample program for STEM Clubs and describe its implementation. It can be seen that the most preferred purpose among them is the development of program/curriculum/activity (14), while the least preferred purpose is the identification of program's challenges and limitations (3). In addition, studies that focus on the development of the program, curriculum, or activity were classified under the "general" category (10). Sub-categories were created for studies specifically expressing the development of the program with a focus on a particular area, such as the maker movement or Arduino-assisted robotics and coding. Similarly, studies that explicitly mentioned the development of the program based on presented ideas and experiences formed another sub-category. Furthermore, the category related to the implementation of program/activity was divided into eight sub-categories, each indicating the specific centre of implementation, such as problem-based learning-centred and representation of blacks-centred.

The theme of "examination of" refers to studies that aim to examine certain aspects, such as the experiences and perceptions of students (7) and the factors influencing specific subjects (4). Studies focusing on examining the experiences and perceptions of students were labelled as "general" (4), while studies exploring their experiences and perceptions regarding specific content, such as influences and challenges to participation in STEM clubs (2) and assessment (1), were labelled accordingly. Additionally, studies that focused on examining factors affecting the choice of STEM majors (2), participation in STEM clubs (1), and motivation to develop interest in STEM (1) were categorized in line with their respective focuses. As shown in Table 2, it is evident that studies focusing on examining the experiences and perceptions of students (7) were more frequently conducted compared to studies focusing on examining the factors affecting specific subjects (4).

The theme of "identification of" refers to studies that aim to identify certain aspects, such as the types of attitudinal effects (1), types of changes in affect toward engineering (1), and non-academic skills (1). Additionally, the theme of "comparison of in-school and out-of-school STEM experiences" (2) refers to studies that aim to compare STEM experiences within school and outside of school. Lastly, studies that did not fit into the aforementioned categories were included in the "others" theme (6) as no clear connection could be identified among them.

Research Designs

The research designs employed in the examined articles were identified as follows: qualitative methods (36), including case study (20), design-case study (6), comparative-case study (4), ethnographic study (2), phenomenological study (2), and survey study (2); quantitative methods (7), including survey study (4) and experimental study (3); mixed methods and longitudinal studies (10); and literature review (3), as illustrated in Table 3. It can be observed that among these methods, case study was the most commonly utilized. Furthermore, it is evident that quantitative methods (7) and literature reviews (3) were employed less frequently compared to qualitative (36) and mixed methods (10). Additionally, survey studies were utilized in both quantitative and qualitative studies.

Table 3 The distribution of the research designs in articles

Sample Levels

The frequencies and percentages of sample levels in the examined articles are presented in Table 4. The studies involved participants at different educational levels, including elementary school (8), middle school (23), high school (14), pre-service teachers or undergraduate students (6), teachers (4), parents (3), and others (1). It is apparent that middle school students (23) were the most commonly utilized sample among them, while high school students (14) were more frequently chosen compared to elementary school students (8). It should be noted that while grade levels were specified for both elementary and middle school students, separate grade levels were not identified for high school students in these studies. Additionally, studies that involved mixed groups were labelled as 3-5th and 6-8th grades. However, when the mixed groups included participants from different educational levels such as elementary, middle, or high school, teachers, parents, etc., they were counted as separate levels. Furthermore, the studies conducted with participants such as pre-service teachers, undergraduates, teachers, and parents were less frequently employed compared to K-12 students.

Table 4 The distribution of the sample levels in articles

Sample Sizes

The frequencies of sample sizes in the examined articles are presented in Table 5. In 15 studies, the sample size was not reported. The sample-size intervals were not of equal width; instead, widths of 5, 10, 50, and 100 were used. This choice was made to allow a more detailed analysis of smaller samples, as narrower intervals provide a more granular view of the data than cumulative amounts. The analysis reveals that the studies most often used sample groups of 11–15 (f:8) participants, followed by groups of 16–20 (f:4) and 201–250 (f:4). Additionally, sample sizes of 6–10, 21–25, 41–50, 50–100, and more than 2000 (f:1 each) were the least commonly studied.
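Grouping sample sizes into intervals of unequal width can be sketched in Python as follows. The exact interval boundaries are assumptions, since the text gives only the widths (5, 10, 50, and 100) and a few example intervals; the sample sizes listed are hypothetical, with None standing for a study that did not report its sample size.

```python
from bisect import bisect_left
from collections import Counter


def build_upper_bounds():
    """Upper bounds of the intervals; the switch points are assumed."""
    bounds = list(range(5, 31, 5))          # width 5: 1-5, 6-10, ..., 26-30
    bounds += list(range(40, 51, 10))       # width 10: 31-40, 41-50
    bounds += list(range(100, 501, 50))     # width 50: 51-100, ..., 451-500
    bounds += list(range(600, 2001, 100))   # width 100: 501-600, ..., 1901-2000
    return bounds


def interval_label(size, bounds):
    """Return the label of the interval containing a positive sample size."""
    if size < 1:
        raise ValueError("Sample size must be at least 1.")
    index = bisect_left(bounds, size)
    if index == len(bounds):
        return f"more than {bounds[-1]}"
    lower = 1 if index == 0 else bounds[index - 1] + 1
    return f"{lower}-{bounds[index]}"


bounds = build_upper_bounds()

# Hypothetical sample sizes; None marks a study with no reported size.
sample_sizes = [12, 14, 18, 230, 2500, None]
distribution = Counter(
    "not reported" if size is None else interval_label(size, bounds)
    for size in sample_sizes
)
```

Narrow intervals at the low end keep small qualitative samples distinguishable, while wide intervals at the high end prevent a long tail of nearly empty rows.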

Table 5 The distribution of the sample sizes in articles

Data Collection Tools

The frequencies and percentages of data collection tools in the examined articles are presented in Table 6. The analysis reveals that the studies primarily employed surveys or questionnaires (31.6%) and observations (30.5%) as data collection methods, followed by interviews (15.8%), documents (13.7%), tests (4.2%), and field notes (4.2%). Among surveys/questionnaires, Likert-type scales (f:23) were more commonly employed than open-ended questions (f:7). Tests, used as achievement tests (f:2) and assessments (f:2), were, together with field notes, the least preferred data collection tools. Furthermore, the table illustrates that multiple data collection tools were frequently employed, as the total number of tools (95) is about 1.7 times the number of studies (56).
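The relationship between the frequencies and the reported percentages can be checked with a short Python sketch. Only the total (95) and some sub-frequencies are stated in the text, so the per-tool frequencies below are back-calculated from the reported percentages and should be read as an assumption rather than as values quoted from Table 6.

```python
def percentages(frequencies, digits=1):
    """Convert a mapping of frequencies into rounded percentages of the total."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("At least one non-zero frequency is required.")
    return {name: round(100 * count / total, digits)
            for name, count in frequencies.items()}


# Frequencies back-calculated from the reported percentages (assumed values).
tool_frequencies = {
    "survey/questionnaire": 30,  # Likert-type (23) + open-ended (7)
    "observation": 29,
    "interview": 15,
    "document": 13,
    "test": 4,                   # achievement tests (2) + assessments (2)
    "field notes": 4,
}

tool_percentages = percentages(tool_frequencies)
total_tools = sum(tool_frequencies.values())
tools_per_study = total_tools / 56  # 56 publications were examined
```

Because a single study can use several tools, the percentages describe shares of all tool uses and not shares of studies.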

Table 6 The distribution of the data collection tools in articles

Data Analysing Methods

The frequencies and percentages of data analysis methods in the examined articles are presented in Table 7. The table reveals that the studies predominantly employed descriptive analysis (f:33, 41.25%), followed by inferential statistics (f:16, 20%), descriptive statistics (f:15, 18.75%), content analysis (f:14, 17.5%), and the constant-comparative method (f:2, 2.5%). Notably, qualitative methods (f:49, 61.25%) were used more frequently than quantitative methods (f:31, 38.75%) in the examined studies related to STEM Clubs. Within the qualitative methods, descriptive analysis (f:33) was used more than twice as often as content analysis (f:14), while within the quantitative methods, descriptive statistics (f:15) and inferential statistics (f:16), including t-tests, ANOVA, regression, and other methods, were used with comparable frequency.

Table 7 The distribution of the data analysing methods in articles

Durations

The durations of STEM Clubs in the examined studies are presented in Table 8. More studies (f:37) omit the duration of STEM Clubs than report it (f:19). Among the studies that report durations, there is no common period of time for STEM Clubs, as they were implemented for varying numbers of weeks and sessions and with varying session lengths. STEM Clubs were conducted over 3 semesters (academic year and summer), 5 months, or 2 to 16 weeks, with session durations ranging from 60 to 120 min. Furthermore, the durations of "3 semesters," "10 weeks with 90-min sessions per week," and "unknown weeks with 60-min sessions per week" were used more than once in the studies.

Table 8 The distribution of the durations in STEM Clubs

Findings

The results of the content analysis of the findings in the examined articles are presented with their frequencies in Table 9. Although the studies cover a diverse range of topics, the analysis indicates that the results can be broadly classified into three themes, namely, the "development of or increase in certain aspects" (f:68), "design of STEM Clubs" (f:17), and "identification of various aspects" (f:16). The first theme covers findings on the development of certain aspects, such as skills, or an increase in specific outcomes, such as academic achievement. The second covers studies that explore the design of STEM Clubs by describing specific cases, such as sample implementations and challenges. The third covers studies that focus on identifying various aspects, such as factors and perceptions.

Table 9 The distribution of the findings in articles

It is evident from the findings that the studies predominantly yield results related to the development of or increase in certain aspects (f:68). Within this theme, the most commonly observed result is the development of STEM or academic achievement or STEM competency (f:11). This is followed by an increase in STEM major choice or career aspiration (f:9), an increase in engagement or participation in STEM clubs (f:5), the development of identity, including STEM, science, and engineering identity and the identity of under-represented groups (f:5), the development of interest in STEM (f:4), an increase in enjoyment (f:4), and the development of collaboration, leadership, or communication skills (f:4). Furthermore, some results, such as the development of critical thinking, perseverance, and the teachers' profession, were yielded less frequently (f:1). In total, 16 results appeared with a frequency of 1.

Within the design of STEM Clubs, the results presented were sample implementations or design models for different purposes, such as the use of a robotics program or students with disabilities (f:7), design principles or ideas for STEM clubs, activities, or curriculum (f:4), and challenges or factors affecting the success and sustainability of STEM Clubs (f:3). Additionally, comparisons were made between in-school and out-of-school learning environments (f:3), highlighting the contradictions between STEM clubs and science classes, as well as the differences in STEM activities and in continuous and discontinuous learning experiences in mathematics. Within the identification of various aspects, the most commonly reported result was the identification of factors affecting participation in or motivation for STEM clubs (f:5). This was followed by the identification of barriers to participation (f:2). The identification of other aspects, such as parents' roles and perspectives on STEM, was comparatively less frequent.

Discussion

Considering the wide variety of STEM Clubs found in different regions around the world, this study aimed to investigate the current state of research on STEM Clubs. It is not surprising to observe an increase in the number of studies conducted on STEM Clubs over the years. This can be attributed to the overall growth in research on STEM education (Zhan et al., 2022), as STEM education often includes activities and after-school programs as integral components (Blanchard et al., 2017). Identifying relevant keywords and incorporating them into a search strategy is crucial for conducting a comprehensive and rigorous systematic review (Corrin et al., 2022). To gain a broader understanding of keyword usage in the context of STEM Clubs, a word cloud analysis was performed (McNaught & Lam, 2010). In addition, six categories of keywords emerged from the content analysis: disciplines, technological concepts, academic community, learning experiences, core elements of education, and psychosocial factors (variables). The analysis revealed that the keyword "STEM" was used most frequently in the studies examined. This may be because authors want their studies to be easily found and widely searchable by others, so they use "STEM" as a general term for their studies (Corrin et al., 2022). Similarly, the frequent use of keywords like "education" and "learning" from the "core elements of education" category could be attributed to authors' desire to use broad, searchable terms to make their studies more discoverable (Corrin et al., 2022). It was also observed that, among the STEM components, only "science" and "engineering" were used as keywords, while "mathematics" and "technology" were not present. This finding aligns with claims in the literature that mathematics is often underemphasized in STEM integration (Fitzallen, 2015; Maass et al., 2019; Stohlmann, 2018).
Although the specific term "technology" did not appear in the word cloud, technology-related keywords such as "arduino," "robots," "coding," and "innovative" were present. Furthermore, the analysis revealed that authors preferred to use keywords related to their sample populations, such as "middle (school students)," "elementary (students)," "high school students," or "teachers." Additionally, keywords describing learning experiences, such as "extracurricular," "informal," "afterschool," "out-of-school," "social," "clubs," and "practice" were commonly used. This preference may stem from the fact that STEM clubs are often part of informal learning environments, out-of-school programs, or afterschool activities, and these concepts are closely related to each other (Baran et al., 2016; Cooper, 2011; Kalkan & Eroglu, 2017; Schweingruber et al., 2014). Moreover, the analysis showed that keywords related to psychosocial factors (variables), such as "disabilities," "skills," "interest," "attainment," "enactment," "expectancy-value," "self-efficacy," "engagement," "motivation," "career," "gender," "cognitive," and "identity" were also prevalent. This suggests that the articles investigated the effects of STEM club practices on these psychosocial variables. To sum up, by using these keywords, researchers can gain valuable insights and effectively search for relevant articles related to STEM clubs, enabling them to locate appropriate resources for their research (Corrin et al., 2022).

The popularity of case studies as a research design, based on the analysis, can be attributed to the fact that studies on STEM Clubs were conducted in diverse learning environments, highlighting sample implementation designs (Adams et al., 2014; Bell et al., 2009; Robelen, 2011). Case studies offer the opportunity to present practical applications and real-world examples (Hamilton & Corbett-Whittier, 2012), which is highly valuable in the context of STEM Clubs. Additionally, the observation that quantitative methods were less commonly utilized than qualitative methods in studies related to STEM Clubs contrasts with the predominant reliance on quantitative methods in STEM education research (Aslam et al., 2022; Irwanto et al., 2022; Lin et al., 2019). This suggests a lack of quantitative studies specifically focused on STEM Clubs. Therefore, it is important to prioritize additional quantitative studies to further enhance our understanding of STEM Clubs and their impact. Studies on STEM Clubs more often involved K-12 students, particularly middle school students, than other groups such as pre-service teachers, undergraduate students, teachers, and parents, which parallels some studies in the literature (Aslam et al., 2022). This can be attributed to the fact that STEM Clubs are designed for K-12 students, and middle school is a crucial period for introducing them to STEM concepts and careers. Middle school students are developmentally ready for the hands-on and inquiry-based learning commonly used in STEM education. Additionally, time constraints, especially for high school students preparing for university, may limit their involvement in extensive STEM activities. Furthermore, STEM Clubs were primarily studied with sample groups of 11–15, 16–20, and 201–250 participants.
The preference for 11–20 participants, rather than fewer than 10, may be attributed to the collaborative nature of STEM activities, which often require a larger team for effective teamwork and group dynamics (Magaji et al., 2022). The use of small groups as samples may also explain why the case study was the most frequently employed research design, since it is compatible with smaller sample sizes. On the other hand, the inclusion of larger groups (201–250) is suitable for survey studies, as this number can represent the total student population attending STEM Clubs throughout a semester with multiple sessions (Boys & Girls Club of America, 2019).

According to studies on STEM Clubs, surveys or questionnaires and observations were predominantly used as data collection methods. This preference can be attributed to the fact that surveys or questionnaires allow researchers to gather data on diverse aspects, including students' attitudes, perceptions, and experiences related to STEM Clubs, facilitating generalization and comparison (McLafferty, 2016). Furthermore, observations were frequently employed because they can offer a deeper understanding of the lived experiences and actual practices within STEM Clubs (Baker, 2006). As for data analysis, descriptive analysis was predominantly utilized in studies on STEM Clubs, while the quantitative methods, descriptive statistics and inferential statistics, were used to a similar extent to each other. The preference for descriptive analysis may arise from its effectiveness in describing activities, experiences, and practices within STEM Clubs. Given the predominance of case study research in the analysed studies, it is not surprising to observe a high frequency of descriptive analysis in the findings. On the other hand, the use of quantitative analysis methods can be attributed to the need for statistical analysis of surveys and questionnaires (Young, 2015). Future studies on STEM Clubs could therefore benefit from using tests and field notes as additional data collection tools, along with surveys, observations, and interviews. Additionally, the development of tests specifically designed to assess aspects related to STEM could provide valuable insights (Capraro & Corlu, 2013; Grangeat et al., 2021). Moreover, increasing the utilization of content analysis and constant comparative analysis methods could further enhance the depth and richness of data analysis in STEM Club research (White & Marsh, 2006). In the studies on STEM Clubs, the duration and scheduling of the clubs varied considerably.
While there was no common period of time for STEM Clubs, they were implemented for different numbers of weeks and sessions, with session durations ranging from several minutes to between 60 and 120 min. However, it was observed that STEM Clubs were predominantly conducted over the course of three semesters, including the academic year and summer, or for durations of 2 to 16 weeks. This scheduling pattern can be attributed to the fact that STEM Clubs were often implemented as after-school programs and were designed to align with the academic semesters and summer school periods to effectively reach students. Additionally, the number of weeks in these studies may have been arranged according to the duration of academic semesters, although some studies were conducted for less than a semester (Gutierrez, 2016). The frequent use of multiple sessions lasting 60 to 120 min can be attributed to the nature of the activities involved in STEM Clubs. These activities often require more time than regular class hours, and splitting them into separate sessions allows students to concentrate on their work and engage in more in-depth learning experiences (Vennix et al., 2017).

The purposes of the studies on STEM Clubs mostly concerned the effects of participation in STEM Clubs on various aspects, such as attitudes towards STEM disciplines or career paths, STEM major choice or career aspiration, and achievement. They also concerned the evolution of a sample program for STEM Clubs and its implementation, including the development of the program or activity, the identification of the program's challenges and limitations, and its implementation. These were followed by the examination of certain aspects, such as the experiences and perceptions of students and the factors influencing specific subjects; the identification of aspects such as the types of attitudinal effects and non-academic skills; and the comparison of in-school and out-of-school STEM experiences. In parallel with these purposes, the results of the studies were mostly related to the development of or increase in certain aspects, such as STEM or academic achievement or STEM competency, STEM major choice or career aspiration, engagement or participation in STEM Clubs, identity, interest in STEM, enjoyment, collaboration, communication skills, and critical thinking. They were also related to the design of STEM Clubs, including sample implementations or design models for different purposes such as the use of a robotics program or students with disabilities, design principles or ideas for STEM clubs or activities, challenges or factors affecting the success and sustainability of STEM Clubs, and the comparison between in-school and out-of-school learning environments. Finally, they were related to the identification of various aspects, such as factors affecting participation in or motivation for STEM clubs and barriers to participation. These identified categories align with the findings of studies in the literature.
These studies claim that after-school programs, such as STEM Clubs, have positive impacts on students' achievement levels (NRC, 2015; Kazu & Kurtoglu Yalcin, 2021; Shernoff & Vandell, 2007), communication, and innovative problem-solving abilities (Mahoney et al., 2007), leadership skills (Lipscomb et al., 2017), career decision-making (Bybee, 2001; Dabney et al., 2012; Sahin et al., 2018; Tai et al., 2006), creativity (Wan et al., 2023), 21st-century skills (Hirsch, 2011; Zeng et al., 2018), interest in STEM professions (Blanchard et al., 2017; Chittum et al., 2017; Wang et al., 2011), and knowledge in STEM fields (Adams et al., 2014; Bell et al., 2009). Furthermore, it can be inferred that the studies on STEM Clubs paid significant attention to the design descriptions of programs or activities (Nation et al., 2019). This may be because there is a need for studies that focus on designing program models for different cases (Calabrese Barton & Tan, 2018; Estrada et al., 2016). These studies can serve as examples and provide guidance for the development of STEM clubs in various settings. By creating sample models, researchers can contribute to the improvement and expansion of STEM clubs across different environments (Cakir & Guven, 2019; Estrada et al., 2016).

In conclusion, as in studies on trends in STEM education (Bozkurt et al., 2019; Chomphuphra et al., 2019; Irwanto et al., 2022; Li et al., 2020; Lin et al., 2019; Martín-Páez et al., 2019; Noris et al., 2023), the analysis of prevailing research trends specifically in STEM Clubs, which are implemented in diverse environments with varying methods and purposes, can provide a comprehensive understanding of these clubs as a whole.

It can also serve as a valuable resource for guiding future investigations in this field. By identifying common approaches and gaps in methods and results, a holistic perspective on STEM Clubs can be achieved, leading to a more informed and targeted direction for future research endeavours.

Recommendations

Future research on STEM Clubs should consider the trends identified in this study and address methodological gaps. For instance, few studies in this area employ quantitative approaches, so future studies should incorporate quantitative methods to enhance the understanding of STEM Clubs and their impact. Future research should also explore underrepresented populations, investigate the long-term impacts of STEM Clubs, and examine the effectiveness of specific pedagogical approaches or interventions within these clubs. Researchers should also analyse the common approaches used in STEM Clubs across different settings. Such an analysis can help uncover effective strategies, best practices, and successful models that can be replicated or adapted in various contexts. By undertaking these efforts, researchers can contribute to a more comprehensive understanding of STEM Clubs, leading to advancements in the field of STEM education.

Limitations

It is important to consider the limitations of the study when interpreting its findings. The study's findings are based on the literature selected from two databases, which may introduce biases and limitations. Additionally, the study's findings are constrained by the timeframe of the literature review, and new studies may have emerged since the cut-off date, potentially impacting the representation and generalizability of the research trends identified. Another limitation lies in the construction of categories during the coding process. The coding scheme used may not have fully captured or represented all relevant terms or concepts. Some relevant terms may have been inadequately represented or identified using different words or phrases, potentially introducing limitations to the analysis. While efforts were made to ensure validity and reliability, there is still a possibility of unintended biases or inconsistencies in the categorization process.