1 Introduction

In our data-driven and increasingly digitalized world, competent handling of data and statistical information is of great importance (Chalkiadaki, 2018; Rubin & Gould, 2023). This has led to calls for adaptations of school curricula all over the world, with an increased focus on promoting statistical literacy and data literacy (SL/DL) in K-12 education (Burrill, 2023; Pfannkuch, 2018). Despite its crucial role, SL/DL is typically not offered as an independent subject in the school curriculum. Instead, it is often taught implicitly within the context of STEM (science, technology, engineering, and mathematics) subjects (Khan & Mason, 2021). This is due to the overarching interdisciplinary character of these competencies, stemming from the recognition that data are numbers that have inherent contextual meaning (Cobb & Moore, 1997). Scientific methods and processes are essential to understand how and in which context data is generated and analyzed (Jho, 2023). Recent endeavors in statistics and data science education research aim to merge STEM perspectives in the teaching and learning of SL/DL, reflecting their overarching interdisciplinary nature (Vance et al., 2022).

Current recommendations for SL/DL teaching suggest a data-oriented approach in which students learn to deal with data in meaningful and authentic contexts and to adopt a critical attitude toward analyzing and interpreting data (Batanero et al., 2011; Burrill, 2023). In addition, there is a growing focus on integrating real-world, often large and complex data sets into STEM education (Kjelvik & Schultheis, 2019; Wolff et al., 2019). These endeavors, which increasingly revolve around the paradigm of “big data” (Monteiro & Carvalho, 2023), are accompanied by the adoption of new technologies to enhance SL/DL instruction (Biehler et al., 2022). In many countries, however, there is a large gap between written curriculum recommendations and standards for SL/DL instruction and school reality (Burrill, 2023). This highlights the increasing importance of adequately educating and professionally training STEM teachers, who play a central role in preparing students to become statistically and data literate.

Teacher variables related to SL/DL deserve attention, since they are assumed to be among the key factors that determine students’ learning success in this area (Batanero & Díaz, 2010). This includes knowledge of content and pedagogy related to SL/DL as well as competence to use appropriate technology (Batanero et al., 2011; Burrill, 2023). However, many teachers seem to have relevant knowledge gaps related to SL/DL and unconsciously share a variety of difficulties and misconceptions with their students about fundamental statistical concepts (cf. Batanero et al., 2011). This is a cause for concern as teachers’ instructional decisions in the classroom may be heavily influenced by this knowledge (Suh et al., 2020). In addition, teachers often lack confidence in teaching content related to SL/DL (e.g., de Souza et al., 2014), which may be due to a lack of specific learning opportunities during teacher education and their professional development (Batanero & Díaz, 2010).

Due to recent calls to promote SL/DL in K-12 STEM education, increasing attention is being paid to teachers’ cognitive and affective variables (e.g., attitudes, beliefs, and knowledge) related to SL/DL, and how these affect the way they teach children (e.g., Garfield & Ben-Zvi, 2008). Likewise, research in the education and professional development of teachers to teach for SL/DL in their classroom is expanding (e.g., Biehler et al., 2022; Burrill et al., 2023; Groth & Meletiou-Mavrotheris, 2018). The increasing emergence of research on SL/DL across different STEM domains comes with challenges for researchers and teacher educators, making it difficult to keep track of its use, potential, and specific challenges that need to be addressed in further research and teacher education. A systematic and comprehensive review that sheds light on these topics does not yet exist. The current paper seeks to fill this gap.

2 Background: definition and conceptualization of statistical and data literacy

Over the past decades, knowledge and skills in the context of SL/DL have become increasingly important (e.g., Wallman, 1993). In the current literature, SL and DL are not always consistently distinguished from each other regarding the terms used and the sub-competencies included. A well-known definition by Wallman (1993, p. 1) defines SL as “the ability to understand and critically evaluate statistical results that permeate our daily lives”. The ability to communicate about statistical information and messages was later added as an important element by Gal (2002). While SL is mostly limited to the skills needed to be a recipient of data, definitions of DL also include skills needed to be a user of data, such as data collection and creating data representations (e.g., Gould, 2017). In addition, recent research also includes aspects of data protection and data ethics (Schüller et al., 2019; Ridsdale et al., 2015).

At this point, however, SL/DL should be distinguished from other terms that address similar constructs and with which there are some connections and overlaps. Information literacy is defined as the ability to find, use, and evaluate information effectively from a variety of formats (Carlson & Johnston, 2015). The three terms appear to converge in certain addressed skills, but they differ in that SL/DL focus exclusively on information extracted or derived from data (Heidrich et al., 2018). Other terms, such as digital literacy and data science literacy, have been coined that increasingly integrate the use of computers and new technologies such as artificial intelligence and put an emphasis on competencies in IT domains (Dichev & Dicheva, 2017). Scientific literacy is defined by three competencies of explaining phenomena scientifically, evaluating and designing scientific enquiry, and interpreting data and evidence scientifically (OECD, 2019). As working with data is an essential part of this definition, it becomes obvious that SL/DL overlap to some extent with scientific literacy.

For the purposes of this manuscript, we refer to the definitions of SL/DL described above and focus on teachers in K-12 STEM education. This should be explicitly distinguished from the construct data literacy for teaching, where DL is conceptualized in the context of school development (Gummer & Mandinach, 2015).

3 Goals and research questions

The main objectives of this systematic review were to compile, systematize, and interpret relevant findings from international research on teachers’ SL/DL in K-12 STEM education. The following research questions were addressed:

RQ1. What meta-information (year of publication, literacy construct, STEM domain, sample) is included in the studies?

RQ2. Which teacher variables (cognitive, affective) are empirically investigated and how do these affect the way they teach for SL/DL?

RQ3. What pedagogical approaches were empirically investigated to prepare teachers to promote SL/DL in their classrooms and what is their impact on teachers’ classroom practice?

4 Method

This systematic review is part of a larger research project that was pre-registered at the open-source platform OSF (Friedrich et al., 2021). The review was planned, conducted, and reported according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines (Page et al., 2021). PRISMA is a set of guidelines developed to improve the reporting of systematic reviews and meta-analyses. In essence, it consists of a checklist that provides guidelines for each component of the review report (Title, Abstract, Introduction, etc.), which should also be used as a reference during the conduct of the review to ensure good, transparent research. Since no meta-analysis was conducted, we have omitted all points related to the presentation of quantitative values and bias assessment. A flowchart summarizing and illustrating the literature selection process is also an integral component of a systematic review when adhering to the PRISMA guidelines. The PRISMA flowchart presented in Fig. 1 provides an overview of the individual steps of the systematic search.

Fig. 1 PRISMA flow chart

4.1 Paper selection

A broad systematic search of the three literature databases PsycINFO, ERIC, and Web of Science, which are among the most common sources for peer-reviewed empirical and educational research, yielded 18,243 records. Because the range of terms used to describe the constructs of SL/DL is wide (cf. Sect. 2), we created a broad search string. The search string included the terms (“data”, “quantitative”, “probabil*”, “data oriented”, “data based”, “statistic*”), supplemented with the keywords (“literacy”, “literate”, “reason*”, “competen*”, “think”, “understand”, “comprehend*”, “argu*”, “mastery”) in every possible combination and order, as well as the terms (“data and chance”, “uncertainty and data”) in every possible order. All combinations were connected using the Boolean operator “or”. Furthermore, to consider relevant recent studies that have not yet been published in scientific journals, proceedings of 16 international conferences related to STEM education (mostly relevant according to an internal expert rating) were included in the systematic search. Only proceedings from the last three years (2019, 2020, 2021) containing full papers were considered. All literature search results were merged, and duplicates were removed.
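The way such a search string is assembled can be illustrated with a short sketch. The term lists are taken from the description above; everything else is an assumption made for illustration only, including the function names, the use of quoted two-word phrases, and the uppercase `OR` syntax. The exact syntax submitted to each database is not reported here and is likely to differ (for example, through proximity operators).

```python
from itertools import product

# Term lists copied from the search description in the text.
DATA_TERMS = ["data", "quantitative", "probabil*", "data oriented",
              "data based", "statistic*"]
SKILL_TERMS = ["literacy", "literate", "reason*", "competen*", "think",
               "understand", "comprehend*", "argu*", "mastery"]
# Fixed expressions that are searched in both orders.
FIXED_PAIRS = [("data", "chance"), ("uncertainty", "data")]


def build_phrases():
    """Return every term combination in both orders (hypothetical helper)."""
    phrases = []
    for data_term, skill_term in product(DATA_TERMS, SKILL_TERMS):
        phrases.append(f'"{data_term} {skill_term}"')
        phrases.append(f'"{skill_term} {data_term}"')
    for first, second in FIXED_PAIRS:
        phrases.append(f'"{first} and {second}"')
        phrases.append(f'"{second} and {first}"')
    return phrases


def build_search_string():
    """Join all phrases with the Boolean operator OR (assumed syntax)."""
    return " OR ".join(build_phrases())


if __name__ == "__main__":
    print(len(build_phrases()), "phrases")
    print(build_search_string()[:120], "...")
```

Generating the combinations programmatically, rather than typing them by hand, makes it easier to document the search and to rerun it unchanged across databases.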

For the screening process, seven eligibility criteria were defined that all included studies must meet: (1) abstract in English, (2) original research, (3) peer-reviewed research, (4) investigation scope relates to school education (K-12), (5) investigation scope relates to STEM domains, (6) investigation scope relates to SL/DL (at least one of the terms must appear in the main text), and (7) at least one teacher variable is included.

First, a title-abstract screening against eligibility criteria was performed by two researchers independently using the open-source machine learning software ASReview (Utrecht University, 2021). The software provides active learning algorithms to increase efficiency and reduce human error. Interrater reliability was moderate (Cohen’s kappa = .57). Discrepancies were resolved by a third researcher. After this screening, 594 studies remained.
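For readers less familiar with the agreement statistic reported here, the following minimal sketch shows how Cohen's kappa is computed from two raters' screening decisions. The function name and the example decisions are hypothetical and serve only as illustration; they are not the review's data, and the review team may have used a different tool for the calculation.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    rater_a, rater_b = list(rater_a), list(rater_b)
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: share of items on which both raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    p_expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if p_expected == 1.0:
        # Both raters used one and the same label throughout.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical include (1) / exclude (0) decisions for ten records.
    first_rater = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
    second_rater = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
    print(round(cohens_kappa(first_rater, second_rater), 2))
```

Because kappa corrects the raw agreement for agreement expected by chance, it can be only moderate even when two raters agree on most records, which is common when the large majority of records is excluded.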

Next, full-text articles for these studies were retrieved and screened against eligibility criteria, again independently by two researchers (high interrater reliability; Cohen's kappa = .93). This step led to the exclusion of 529 studies with the exclusion reasons presented in Fig. 1. Discrepancies were discussed and resolved in the review team.

In a final step, forward snowball searches (cf. Wohlin, 2014) were performed for each included study, adding another 11 studies. Eventually, 42 studies were included in the review.

4.2 Coding and data extraction

In each eligible full text, findings relevant to the research questions were coded, extracted, and synthesized. For the research questions presented in this review, codes were used in the following categories: (1) publication year, (2) literacy construct (SL/DL), (3) STEM domain, (4) sample (pre-service/in-service teachers in primary/lower secondary/upper secondary education) and sample size, (5) teacher variables related to SL/DL, and (6) pedagogical approach to support (prospective) STEM teachers to promote SL/DL in their classrooms.

5 Results

5.1 Overview of meta-information

Table 1 provides an overview of all 42 studies focusing on teachers’ SL/DL in K-12 STEM education. Of these, 36 studies addressed only SL, four studies addressed only DL, and two studies addressed both constructs. The analysis of publication years revealed that empirical research on STEM teachers’ SL/DL began in 2009, with a substantial increase in published studies in 2019/2020. The subject of mathematics dominated the studies, with 88% of them focusing exclusively on mathematics. The remaining 12% examined multiple STEM domains, including natural sciences (physics, chemistry, biology) and computer science. The studies encompassed pre-service and in-service teachers from various school levels, including primary, lower secondary, and upper secondary education. The sample sizes varied widely, ranging from N = 1 in some studies (e.g., Meletiou-Mavrotheris et al., 2019) to N = 531 (Wu et al., 2021). Figure 2 illustrates the distribution of participating teachers across different school levels, with lower secondary education having the highest representation, followed by upper secondary and primary education. Notably, slightly more studies were conducted with in-service teachers (56%) than with pre-service teachers (44%). Geographically, the studies were conducted worldwide, spanning countries in Africa, America, Asia, Australia, and Europe (see Table 1).

Fig. 2 Overview of included studies with regard to specific teacher sample characteristics

Table 1 Summary of included studies on STEM teachers’ statistical and data literacy

5.2 Teacher variables related to SL/DL

To address RQ2, we examined teacher variables related to SL/DL, which can be categorized into cognitive variables and affective variables.

5.2.1 Teachers’ cognitive variables related to SL/DL

A total of 39 studies empirically investigated cognitive variables, encompassing specific knowledge components or conceptions of teachers related to SL/DL and their statistical thinking or reasoning (see Table 1).

Across all studies, seven standardized test instruments were used: LOCUS (Levels of Conceptual Understanding in Statistics) was used in the studies of Engledowl and Tarr (2020) and Suh et al. (2020); CAOS (Comprehensive Assessment of Outcomes in a First Statistics Course) was used in the studies of Hannigan et al. (2013) and Lee et al. (2013); DTAMS (Diagnostic Teacher Assessment of Mathematics and Science) was used in the study of Schoen et al. (2019); SRA (Statistical Reasoning Assessment) was used in the study of Karatoprak et al. (2015); GOALS (Goals and Outcomes Associated with Learning Statistics) was used in the study of Engledowl and Tarr (2020); and BeSt Teacher scales were used in the study of Scheuerer et al. (2019). The remaining studies utilized self-developed assessment instruments, frameworks, or qualitative procedures.

Both pre-service and in-service teachers demonstrated substantial knowledge gaps related to SL/DL. Regarding pre-service teachers, Hannigan et al. (2013) found poor conceptual knowledge in fundamental areas of statistics, such as interpreting centrality and variability in box plots. Thanheiser et al. (2011) discovered that only half of the participants recognized the mean as an appropriate measure of center when comparing data distributions of unequal size. Additionally, Guven et al. (2021) reported generally low levels of statistical literacy among pre-service teachers.

For in-service teachers, Meletiou-Mavrotheris et al. (2009) observed that most participants lacked a global view of distributional features and avoided considering variation when interpreting data distributions. Hobden (2014) categorized approximately half of the in-service teacher sample below the level of basic understanding of the median, and Wessels and Nieuwoudt's (2013) study indicated a lack of understanding of variability in a repeated sampling context. Muñiz-Rodríguez et al. (2020) found that only 50% of the participating in-service teachers self-assessed their knowledge as quite or very suitable for promoting students’ SL.

Several studies reported significant differences in knowledge levels among the participants within their respective samples, both for pre-service and in-service teachers. Wu et al. (2021) identified substantial variations in knowledge related to DL among 531 participating pre-service teachers, particularly regarding data extraction, representation, and interpretation. Similarly, Zhang and Stephens (2016) detected large knowledge differences in statistics and probability among 82 in-service teachers. Rowe et al. (2020) found differences in domain-specific knowledge to teach DL skills between high school, middle school, and elementary school teachers.

Only two studies focused on the relationship between teacher knowledge and their classroom teaching and students’ learning (Batiibwe, 2019; Callingham et al., 2016). The results indicated that teachers’ pedagogical content knowledge for teaching statistics was associated with their students’ learning outcomes as they developed an understanding of statistical concepts (Callingham et al., 2016). Batiibwe (2019) observed a predominant use of teacher-controlled teaching with an emphasis on formulas and computations, leaving little room for statistical reasoning. These observations were associated by the authors with teachers' self-assessed uncertain knowledge of teaching statistics.

Several studies found that both pre-service and in-service teachers faced difficulties regarding statistical thinking or statistical reasoning. For example, pre-service teachers had difficulties in predicting a graph of a distribution when given only statistical measures and exhibited a tendency to apply equiprobable reasoning in comparing likelihood of results from different sample sizes (Lee et al., 2013). Additionally, studies conducted with pre- and in-service teachers found that many teachers struggled with statistical judgment and communication, focusing more on procedural aspects of statistics (Chesler, 2015; Savard & Manuel, 2016). These results were in line with Aizikovitsh-Udi et al. (2014), Kuntze et al. (2017), and Kus and Çakiroglu (2020), who reported on the design and potential of hybrid tasks to connect learning opportunities related to critical and statistical thinking. Karatoprak et al. (2015) found that pre-service teachers were successful in understanding independence and the importance of large samples; however, they exhibited equiprobability bias, law of small numbers, and representativeness misconceptions.

5.2.2 Teachers’ affective variables related to SL/DL

Seven studies empirically investigated STEM teachers’ affective variables related to SL/DL (see Table 1). These studies differ greatly in terms of the constructs addressed and the assessment instruments used.

Hannigan et al. (2013) and Leavy et al. (2019) measured pre-service teachers’ attitudes towards statistics using the SATS-36 scale (Survey of Attitudes Toward Statistics). Both studies revealed a generally positive attitude among the participants towards statistics. They valued the subject, expressed interest in it, and demonstrated confidence in their own knowledge and skills. Hannigan et al. (2013) additionally explored the relationship between prospective teachers' attitudes towards statistics and their conceptual understanding of statistics. The results showed no strong correlation, implying that prospective teachers "may have positive attitudes towards statistics and, at the same time, possess poor conceptual understanding" (Hannigan et al., 2013, p. 444).

Rowe et al. (2020) investigated how in-service teachers perceived the pedagogical value of using cross-reality tools for data visualization, using questions adapted from the ARS Heuristic. Findings indicated that all participants showed relatively high interest and saw value in using these innovative techniques for teaching and learning.

North et al. (2014) and Reston and Loquias (2018) measured in-service teachers’ level of confidence in teaching statistics using self-developed rating scales. While the majority of the participants in North et al.’s (2014) study rated their confidence in teaching statistics as average or good, most participants in Reston and Loquias’s (2018) study demonstrated a lack of confidence.

Scheuerer et al. (2019) investigated the relationship between in-service teachers’ cognitive and affective variables related to statistics, as measured by modified BeSt Teacher scales. The findings revealed that teachers’ motivational and emotional orientations toward teaching statistics (e.g., self-efficacy, anxiety) were related to their statistical content knowledge. According to the authors, this underscores the importance of addressing teachers' anxiety and building their self-efficacy already during teacher training in statistics (Scheuerer et al., 2019).

Umugiraneza et al. (2022) found that in-service teachers’ confidence in teaching mathematics concepts differed from their confidence in teaching statistics concepts and those which require critical thinking skills, as measured by self-developed questionnaires.

5.3 Pedagogical approaches in teacher education and their professional development

5.3.1 Overview of pedagogical approaches

Fifteen studies reported empirical findings on pedagogical approaches applied in teacher education and professional development (Table 2). Eleven studies focused exclusively on mathematics, while the other four studies were interdisciplinary and concentrated on mathematics, natural science, and computer science (Table 2). Across domains and target groups (pre-service and in-service teachers), similar approaches were identified to support teachers to promote SL/DL in their classrooms. These are presented below.

Table 2 Summary of included studies on pedagogical approaches to prepare teachers to promote SL/DL in their classrooms

5.3.1.1 Promoting teachers’ knowledge related to SL/DL by connecting teacher education to classroom teaching practice

Several studies focused on an integrated support of teachers’ content knowledge (i.e., their own levels of SL/DL) and pedagogical content knowledge, including knowledge of common student (mis)conceptions and effective teaching strategies. This approach was implemented for pre-service teachers in studies by Bilgin et al. (2017), Leavy et al. (2019), Metz (2010), Reisoğlu and Çebi (2020), and Suh et al. (2020), and for in-service teachers in studies by Meletiou-Mavrotheris et al. (2009), Reston and Loquias (2018), Schoen et al. (2019), Wessels and Nieuwoudt (2013), and Wessels (2014). It involved exposing teachers to learning activities, resources, and contexts similar to those they should employ in their classrooms. Furthermore, supportive training in lesson design, planning, and hands-on techniques to teach for SL/DL in the classroom was provided to closely connect teacher education to classroom practice (Chick & Pierce, 2012; Dolor & Noll, 2015; Giamellaro et al., 2020; Meletiou-Mavrotheris et al., 2019; Sanchez-Cruzado & Sanchez-Compana, 2020).

5.3.1.2 Engaging teachers with real-world data and statistical investigations

Engaging teachers with authentic and complex real-world data was a commonly used approach in most studies, both with pre-service and in-service teachers. Teachers were given the opportunity to experience various components of a statistical investigation process suitable for classroom instruction, including posing problems, collecting data, analyzing data, and drawing conclusions and making predictions (Bilgin et al., 2017; Leavy et al., 2019; Meletiou-Mavrotheris et al., 2009; Metz, 2010; Suh et al., 2020; Wessels, 2014; Wessels & Nieuwoudt, 2013). The interventions in teacher education and professional development were often structured based on frameworks like Wild and Pfannkuch's (1999) PPDAC (Problem, Plan, Data, Analysis, Conclusion) cycle and Bybee’s (2009) inquiry-based approach to science education. Reflections on the theoretical basis and rationale of the approach from both a learner and teacher perspective were incorporated (Wessels, 2014).

5.3.1.3 Working with technology

Teachers were trained in various ways to effectively use technology to teach for SL/DL in their classrooms. Online platforms such as Gapminder were used as a source for real-world datasets that can be included in instruction (Bilgin et al., 2017). Software such as Microsoft Excel and R was employed in data preparation and analysis activities to support data exploration, develop conceptual understanding, and facilitate communication by creating graphical representations, developing models through simulations, or calculating descriptive statistics (Leavy et al., 2019; Metz, 2010; Reisoğlu & Çebi, 2020; Schoen et al., 2019; Suh et al., 2020; Wessels, 2014). Some studies incorporated software specifically developed for teaching content related to SL/DL in the classroom, such as TinkerPlots (Meletiou-Mavrotheris et al., 2009; Wessels & Nieuwoudt, 2013). Teachers were provided opportunities to observe, reflect, and actively experience how technology can be used in teaching activities to enable learners, even at an early age, to discover trends, patterns, and deviations from patterns in data (Meletiou-Mavrotheris et al., 2009; Reisoğlu & Çebi, 2020). Additionally, one study instructed in-service teachers on how to use digital games to enhance the learning of key statistical concepts in the early years of primary school education (Meletiou-Mavrotheris et al., 2019).

5.3.1.4 Promoting interdisciplinary and collaborative work among teachers

Five studies implemented interdisciplinary approaches. Giamellaro et al. (2020) adopted a narrative teaching method coupled with teacher-scientist partnerships, allowing in-service teachers to learn how to incorporate authentic data within interdependent phenomena. Bilgin et al. (2017) developed an inquiry-based approach that aimed to support pre-service teachers to use authentic data for instruction in mathematics and natural science classes. Reisoğlu and Çebi (2020) conducted a study involving pre-service teachers from natural science and computer science disciplines, demonstrating the effectiveness of collaboration across disciplines in acquiring knowledge and skills related to DL. Schoen et al. (2019) reported on a professional development intervention for in-service mathematics and science teachers with limited formal training in statistics. In Meletiou-Mavrotheris et al.’s (2009) study, in-service mathematics teachers were introduced to dynamic statistics environments, showcasing their integration and connection to other STEM subjects (e.g., geography, science) to incorporate data analysis into the classroom within relevant and meaningful contexts.

Furthermore, numerous approaches have been developed based on frameworks and guidelines, such as the Guidelines for Assessment and Instruction in Statistics Education (GAISE) (Chick & Pierce, 2012; Dolor & Noll, 2015; Leavy et al., 2019; Metz, 2010; Schoen et al., 2019; Suh et al., 2020), the Common Core State Standards in Mathematics (CCSS-M) (Schoen et al., 2019), the European Framework for the Digital Competence of Educators (DigCompEdu) (Reisoğlu & Çebi, 2020), and the National Council of Teachers of Mathematics (NCTM) standards for data analysis and probability (Meletiou-Mavrotheris et al., 2009; Metz, 2010; Schoen et al., 2019).

5.3.2 Impact of the applied pedagogical approaches

Most studies applied a combination of the above-described pedagogical approaches (Table 2). To assess the impact of these approaches, predominantly qualitative procedures were employed. These encompassed the analysis of observed, audio- or videotaped interviews, focus groups, or classroom activities (Dolor & Noll, 2015; Giamellaro et al., 2020; Meletiou-Mavrotheris et al., 2009, 2019; Reisoğlu & Çebi, 2020) as well as the evaluation of teachers' work samples, including written responses, lesson plans, teaching diaries, and lesson reflection papers (Chick & Pierce, 2012; Meletiou-Mavrotheris et al., 2019; Reisoğlu & Çebi, 2020; Sanchez-Cruzado & Sanchez-Compana, 2020; Wessels, 2014; Wessels & Nieuwoudt, 2013). Six studies adopted a quantitative approach. Of these, three studies used self-developed assessment instruments for teachers to evaluate the effectiveness of an intervention, assessing factors like perceived knowledge enhancement (Bilgin et al., 2017; Metz, 2010) or increased confidence in teaching statistics (Reston & Loquias, 2018). Two studies employed a repeated measures design (Leavy et al., 2019; Suh et al., 2020), while only one study conducted a controlled field trial (Schoen et al., 2019).

The majority of the studies examined the impact of the applied pedagogical approaches on teachers’ cognitive variables. Many of these studies reported positive effects on teachers’ knowledge, thinking, and reasoning skills related to SL/DL (Bilgin et al., 2017; Meletiou-Mavrotheris et al., 2009; Metz, 2010; Reisoğlu & Çebi, 2020; Schoen et al., 2019; Suh et al., 2020; Wessels, 2014; Wessels & Nieuwoudt, 2013). Additionally, positive effects were noted on teachers’ affective variables, including increased confidence in or improved attitudes towards teaching topics related to SL/DL (Bilgin et al., 2017; Leavy et al., 2019; Reston & Loquias, 2018; Wessels, 2014) and increased interest in actively engaging with data-related problems (Meletiou-Mavrotheris et al., 2009). However, the impact of the applied pedagogical approaches on teachers' instructional practices and their students' learning has received limited investigation thus far. Only one study, a case study with one in-service teacher (Meletiou-Mavrotheris et al., 2019), focused on the extent of knowledge and skill transfer from a teaching intervention into actual teaching practice. Classroom observation and lesson plan analysis indicated that the specific educator acquired the necessary skills for teaching statistical topics using tablets and game apps during the intervention (Meletiou-Mavrotheris et al., 2019). In addition, one study (Chick & Pierce, 2012) examined the impact of the applied approach on pre-service teachers’ lesson planning, revealing a positive impact on the quality of produced lessons.

6 Discussion

This systematic review aims to compile, systematize, and interpret relevant findings from international research on teachers’ SL/DL in K-12 STEM education. The results are discussed in order of the three research questions.

6.1 Comprehensive overview of studies on teachers’ SL/DL in K-12 STEM education

Aligned with our initial research inquiry, we present a comprehensive overview of the incorporated studies, encompassing aspects such as publication year, literacy constructs, STEM domains, and specific sample characteristics. Our findings underscore the relatively recent empirical exploration of STEM teachers’ SL/DL, with studies emerging since 2009, primarily focusing on SL as opposed to DL. This temporal progression may be attributed to the historical emergence of these terms. While both SL and DL were introduced around the same time in the 2000s, SL appears to have taken precedence in research (e.g., Gould, 2017; Schüller et al., 2019). This observation is supported by an ongoing systematic review that confirms the chronological sequence (Friedrich et al., 2021). Furthermore, our analysis reveals a predominant emphasis on mathematics (88%) in the investigated studies, with fewer inquiries extending to encompass broader STEM domains, including the natural sciences (physics, chemistry, biology) and computer science. This skew might be influenced by variations in the sizes of distinct scientific communities, potentially contributing to a disparity in research output across domains. Regarding sample characteristics, our synthesis indicates a prevalence of lower secondary education for participating teachers, followed by upper secondary and primary education. Moreover, in-service teachers were slightly more frequently represented in the studies compared to their pre-service counterparts.

6.2 Teachers’ variables related to SL/DL and their impact on teaching practice

Addressing our second research question, we focused on teacher variables related to SL/DL. These deserve special attention, since they are postulated to be among the key factors of students' progress in acquiring statistical and data literacy (Batanero & Díaz, 2010). Our analysis underscores the predominance of studies examining cognitive variables, including teachers’ knowledge, statistical thinking, and reasoning. These investigations consistently reveal relevant knowledge gaps across both pre-service and in-service teachers, as well as large differences in knowledge among teachers. The range of identified difficulties encompasses a limited understanding of fundamental statistical measures such as mean and median, challenges in interpreting centrality and variability in statistical graphs, struggles in comparing data distributions of varying sizes, and poor statistical thinking or reasoning. These findings confirm that many teachers share a variety of typical student difficulties and misconceptions (cf. Batanero et al., 2011), highlighting the need to enhance STEM teachers’ SL/DL-related competencies through targeted interventions during teacher education and professional development.

Compared to cognitive variables, teachers’ affective variables have received limited attention, with only seven studies addressing this facet. These studies display considerable diversity, addressing various constructs such as attitudes, perceived value, self-efficacy, anxiety, and confidence in teaching SL/DL-related topics. Furthermore, these investigations employ a range of assessment instruments and yield conflicting outcomes. For instance, varying levels of confidence among in-service teachers were reported across different inquiries (North et al., 2014; Reston & Loquias, 2018), further underscoring the intricate nature of these findings. This diversity in constructs, methodologies, and assessment tools poses a challenge for researchers and educators seeking to compare outcomes and pinpoint specific areas requiring attention and intervention.

Notably, the majority of studies focus on cognitive or affective variables in isolation. Only two studies in this review examined the interplay between teachers’ content knowledge and either their attitudes towards statistics (Hannigan et al., 2013) or their self-efficacy and anxiety towards teaching statistics (Scheuerer et al., 2019). Two further studies focused on the relationship between teachers' knowledge, their instructional practices, and their students' learning. These studies point to associations between teachers’ pedagogical content knowledge for teaching statistics and their students’ learning outcomes (Callingham et al., 2016) as well as the quality of their classroom instruction (Batiibwe, 2019). However, comprehensive investigations that examine the complex interplay between SL/DL-related teacher variables, instructional practices, and student outcomes are so far very rare. This underscores the need for future research that adopts an overarching framework to systematically explore the complex dynamics among these variables and their potential impact on promoting students' SL/DL. Such endeavors will offer valuable insights into effective strategies for enhancing SL/DL in K-12 STEM education.

6.3 Pedagogical approaches in teacher education and their professional development

6.3.1 Overview of pedagogical approaches

Focusing on the third research question, the findings revealed a range of pedagogical approaches employed to prepare STEM teachers for teaching SL/DL in their classrooms. Across various STEM domains and target groups, including both pre-service and in-service teachers, these approaches share common threads in their effort to empower educators. Nevertheless, it is important to note that the approaches were implemented in different operational contexts and by different actors and systems depending on the target group. Pre-service teachers undergo their preparation predominantly within the university setting, while in-service teachers engage in professional development and training within the school systems or within the scope of responsibility of educational institutions that are tailored to the specific requirements of each group. Thus, although similar pedagogical approaches are used in teacher education and professional development, the implementation of these approaches may vary greatly depending on the prior knowledge, teaching experience, and immediate teaching context of (prospective) teachers.

One recurring approach involved enhancing teachers' content and pedagogical content knowledge related to SL/DL. Emphasis was placed on deepening understanding of statistical concepts, common student (mis)conceptions, and effective pedagogical strategies to ensure that teachers are prepared to provide comprehensive and engaging instruction in the field. Pre-service and in-service teachers were exposed to similar learning activities, resources, and contexts that mirror those they should employ in their classrooms. This integration allows teachers to bridge the gap between theoretical knowledge and real-world implementation, empowering them to apply SL/DL concepts effectively within their instructional contexts.

Engaging teachers with real-world data and statistical investigations emerged as another crucial pedagogical approach. Teachers posed problems, collected and analyzed data, made predictions, and drew conclusions—all elements conducive to classroom instruction. This approach often adhered to established frameworks like PPDAC (Wild & Pfannkuch, 1999) and inquiry-based approaches (e.g., Bybee, 2009).

Several studies within our review emphasized the effective use of technology to facilitate SL/DL instruction. Notably, technology served a dual purpose, acting both as a source of large and complex real-world datasets and as a tool to explore, analyze, and visualize data. This integration of technology in teacher education and training has facilitated dynamic and interactive learning experiences, fostering a deeper understanding of SL/DL concepts. These endeavors are in line with emerging issues in statistics and data science education research that increasingly revolve around the paradigm of “big data” (Biehler et al., 2022) and the accompanying “need to confront the challenge of preparing teachers to maximize the potential of technology in their classrooms” (Burrill, 2023, p. V). Accordingly, technology as an integral part of teacher preparation and professional development is considered essential to equip educators with the knowledge and skills needed to foster SL/DL among their students.

In an effort to bridge STEM disciplines, some studies advocated for interdisciplinary and collaborative approaches in teacher education and training. Encouraging collaboration across subject areas and engaging in interdisciplinary projects enabled teachers to explore diverse applications of SL/DL in various contexts, enriching their overall teaching practices. This approach aligns with current efforts that aim to merge STEM perspectives in the teaching and learning of SL/DL, reflecting their overarching interdisciplinary nature (Vance et al., 2022).

Many of the pedagogical approaches are based on established curriculum-relevant frameworks and guidelines. Notably, the adoption of guidelines such as the GAISE guidelines, the CCSS-M standards, the DigCompEdu framework, and the NCTM standards for data analysis and probability provided a structured and evidence-based foundation for SL/DL instruction.

6.3.2 Impact of the applied pedagogical approaches

The impact of these approaches was primarily assessed in terms of teachers' cognitive and affective variables. Many studies reported positive effects on teachers' knowledge, thinking, and reasoning skills related to SL/DL, and their confidence and attitudes towards teaching SL/DL topics. Additionally, there was a noticeable increase in teachers' interest in actively engaging with data-related problems. These effects were reported for both pre-service and in-service teachers.

To examine the effectiveness of these pedagogical approaches, mostly qualitative assessment procedures were employed. These included the analysis of observed classroom activities, interviews, and teachers' work samples. Quantitative approaches were mainly limited to an evaluation of teachers’ self-perceived knowledge or confidence increase at the end of an intervention. Only one study conducted a controlled field trial (Schoen et al., 2019) to examine the effectiveness of an intervention on teachers’ knowledge and confidence in teaching statistics. Longitudinal impacts of the pedagogical approaches remained largely unexplored. Additionally, it is important to note that the impact of these pedagogical approaches on teachers’ classroom practices, and consequently on students’ learning, remains a research area that requires further exploration. In this review, only one case study (Meletiou-Mavrotheris et al., 2019) conducted with one in-service teacher was found that focused on the level of transfer from a teaching intervention into actual teaching practice. Thus, the findings of this review clearly point to a need for further research to investigate the effectiveness of these approaches within controlled experimental designs and large upscaled implementation studies, considering longitudinal effects and the impact on teachers’ actual classroom practice.

Another notable observation is the apparent lack of studies examining systemic issues associated with preparing in-service teachers to teach SL/DL. For example, none of the included studies focused on how public education (K-12) school systems and ministries of education approach the need to build teachers' instructional skills in SL/DL. Exploring the complexities of policies, textbook development processes, or large-scale professional development programs aimed at promoting SL/DL among teachers could yield valuable insights for enhancing the overall quality of SL/DL instruction. Future research in this area should adopt a comprehensive and interdisciplinary approach to delve deeper into the multifaceted aspects of SL/DL education.

6.4 Limitations of the systematic review

It should be noted that this systematic review cannot provide information on overall statistical effects due to the limited number of studies available and their heterogeneity in study design and method of analysis. By using three different databases (ERIC, PsycINFO, and Web of Science) and a complementary search in 16 relevant conference proceedings, our efforts sought to minimize publication bias. In addition, it is important to consider that the results are limited to research papers that have been assigned to the terms SL/DL by the respective authors. Although the individual studies were treated the same, some focused on the overarching concepts of SL/DL, while others mentioned them only briefly and focused rather on specific sub-concepts. Moreover, studies that may address similar topics, but did not explicitly mention one of the terms were not included in the review. Hence, while our systematic review provides illuminating insights into a subset of the research landscape, its coverage may not be representative of the entire expanse, which likely transcends the scope reflected in our data.

7 Conclusion

Over the past decades, knowledge and skills related to SL/DL have become increasingly important in broader public perception and research (OECD, 2021). This has led to adaptations of school curricula internationally, with an increased focus on promoting SL/DL in K-12 STEM education. To prepare students to become statistically and data literate citizens, teachers need to be competent in these areas themselves. This includes knowledge of content and pedagogy related to SL/DL as well as competence to use appropriate technology (Batanero et al., 2011; Burrill, 2023). This review identified a relevant need to enhance teachers’ SL/DL-related knowledge and skills and offers a comprehensive overview of pedagogical approaches employed within teacher education and their professional development. By recognizing and addressing the unique characteristics and needs of both pre-service and in-service teachers, educators and researchers can build on these pedagogical approaches to effectively promote SL/DL across different stages of a teacher's professional journey. However, further research is necessary to investigate the effectiveness of these approaches in controlled experimental designs and large upscaled implementation studies. The overarching objective for K-12 STEM education should involve empowering teachers to competently prepare their students to become statistically and data literate. Yet research examining the impact of teacher variables on their teaching and ultimately on the promotion of SL/DL in their students is so far very rare. This calls for future research that considers and systematically examines the relationships among variables related to SL/DL on the part of teachers, their classroom practices, and their students within a comprehensive framework.