Introduction

Science education research and curricular reforms promote inquiry as an effective approach to science teaching (for historical and narrative reviews, see Chiappetta, 2008; Crawford, 2014). Using such a pedagogical approach, students mirror the endeavors of scientists. This includes different practices, such as formulating research questions, designing and conducting experiments, analyzing data, and drawing conclusions (NGSS Lead States, 2013; Osborne, 2014; Romero-Ariza et al., 2019; Toma et al., 2017). A unique approach to inquiry has remained elusive and many inquiry cycles have been developed (Pedaste et al., 2015). Furthermore, Crawford (2014) indicated that student participation in scientific practices can take many forms, including project-based science, real science, citizen science, or model-based inquiry. Regardless of these distinctions, inquiry investigations are characterized by (i) a central research issue that leads to (ii) specific procedures and exploration to (iii) obtain specific results.

Efforts to define different types of inquiry have been ongoing for decades (Herron, 1971) and continue today (Vorholzer & von Aufschnaiter, 2019). Despite the various interpretations and definitions of inquiry, the use of a continuum to categorize the various types of inquiry is common in the literature (Banchi & Bell, 2008; Vorholzer & von Aufschnaiter, 2019). At the lower end of the continuum, students engage in confirmation inquiry when the research question, procedure, and potential results are provided in advance by the teacher. Some authors argue that this approach is more akin to cookbook-style hands-on activities in which students follow step-by-step instructions to reach a specific goal (Martin-Hansen, 2002; Osborne, 2014). In the second level, students enact structured inquiry when they investigate a phenomenon for which the answer is unknown; however, the teacher still determines the research question and the procedure to be followed, and scaffolding strategies are used to assist students in conducting the investigation. In the third level, guided inquiry, students develop a procedure and gather results to answer a teacher-supplied research question. Finally, at the upper end of the continuum, students engage in open inquiry by conducting self-directed investigations.

Adding to the ambiguity surrounding inquiry, a growing body of research suggests that teaching science through inquiry approaches may be harmful to students’ learning (Kirschner et al., 2006, 2018; Sweller, 2021). In this regard, several studies investigated a central element of the inquiry process: whether withholding answers from students would improve learning outcomes, but the results were mixed (Zhang, 2018, 2019). On the other hand, correlational studies based on TIMSS and PISA data reveal a pattern suggesting that the more students participate in inquiry-based investigations, the lower their science learning outcomes (for an in-depth discussion, see Zhang et al., 2021). However, controlled investigations considering the different elements of inquiry instruction within a whole program or intervention resulted in positive results. In this sense, guided and open inquiry is effective in improving science and mathematics achievement, and motivational outcomes (for reviews, see Lazonder & Harmsen, 2016; Savelsbergh et al., 2016).

However, despite positive results, many obstacles limit the enactment of guided and open inquiry approaches. Constraints include lack of resources, content-heavy curricula, low self-efficacy, and limited pedagogical knowledge (Toma et al., 2017; Chichekian et al., 2016; Fang, 2020; Zhang, 2016). As a result, guided and open inquiry are rarely used, and teachers instead favor lecture-based lessons or, at best, confirmation and structured inquiry (Romero-Ariza et al., 2019). However, there is uncertainty about the value, if any, of short-term, confirmation, and structured inquiry. This research gap is even more pronounced at the elementary school level. Indeed, most research on inquiry has been conducted with secondary school students (for comprehensive reviews, see Furtak et al., 2012; Lazonder & Harmsen, 2016; Savelsbergh et al., 2016). A few existing studies examined the differential effectiveness of the different types of inquiry, although most research has focused on guided and open inquiry or was conducted with secondary school students. For example, Sadeh and Zion (2012) compared open and guided inquiry. Their findings show better attitudinal scores for the open inquiry group. Bunterm et al. (2014) compared structured and guided inquiry. They concluded that secondary school students benefited more from guided inquiry. This effect was significant for content knowledge, science process skills, and scientific attitudes. Kuo et al. (2020) reported on the effectiveness of guided inquiry strategies in improving expectancies and values of science learning. Conradty and Bogner (2019) reported that open inquiry-based working stations were effective in improving intrinsic motivation. Finally, Schmid and Bogner (2017) found that a 3-h structured inquiry unit improved secondary students’ self-determination.

In short, existing studies support the effectiveness of guided and open inquiry in improving affective outcomes related to science (for a meta-analysis, see Aguilera & Perales-Palacios, 2020). Yet, research has largely overlooked whether such positive effects hold when confirmation and structured inquiry approaches are enacted, so the impact of these teaching strategies on school science motivations is still poorly understood (Zhang & Cobern, 2021). Consequently, this investigation explores the effect of confirmation and structured inquiry strategies for improving elementary students’ school science motivations, conceptualized using the expectancy-value model of achievement motivation (Eccles & Wigfield, 2020; Wigfield & Eccles, 2020), when compared to a control, lecture-based group resembling the science education teaching context in Spain. Unlike Zhang (2018, 2019), this study did not focus on which specific feature of the inquiry approach is effective or not. Instead, this investigation considered confirmation and structured inquiry as a whole, cohesive program or approach, thus aligning with long-standing definitions from the literature and existing studies comparing types of inquiry (e.g., Bunterm et al., 2014; Sadeh & Zion, 2012; Schmid & Bogner, 2017). A previous study that focused on this topic found no benefit of confirmation and structured inquiry in improving elementary students’ attitudes towards science (Toma, 2022). Hence, this investigation advances this line of research by specifically addressing the following research question:

(i) When compared to lecture-based teaching strategies, what effects do short-term, confirmation, and structured inquiry have on students’ expectancies of success and intrinsic values of school science?

Expectancy-Value Model

The expectancy-value model of achievement motivation served as the theoretical framework for this study (Eccles et al., 1983; Wigfield & Eccles, 2020). It is a parsimonious framework and stands as one of the most widely adopted theories for studying student motivation in science and mathematics (Abraham & Barker, 2014; Ball et al., 2017; Caspi et al., 2019; Gottlieb, 2018; Jiang et al., 2018; Wang & Degol, 2013). This theory embodies two main theoretical constructs affecting individuals’ achievement motivation. On the one hand, expectancies of success are conceptualized as internal beliefs about the ability to successfully perform a task or activity. On the other hand, subjective task values refer to how much value is placed on a specific task or activity. There are four subjective task values, namely (i) intrinsic value, which refers to the level of interest and enjoyment related to participation in a given task; (ii) attainment value, which refers to the importance attached to participating in a given task; (iii) utility value, which refers to how useful the given task is in achieving future goals; and (iv) cost, defined as what is lost or given up for participating in a given task. In this study, only expectancies of success and intrinsic value were assessed since children’s attainment values are empirically indistinguishable from utility value and, similarly to cost, attainment values are stable and unlikely to improve based on short-term interventions (Ball et al., 2017; Eccles & Wigfield, 2020; Wigfield & Eccles, 2020).

The significance of improving students’ motivation, as defined by the expectancy-value model, has been extensively researched. Expectancies and task values are determinants of desirable educational outcomes in science and math education, such as improved performance and career choice (for reviews, see Eccles & Wigfield, 2020; Rosenzweig et al., 2019; Wigfield & Eccles, 2020). For example, Thomas and Strunk (2017) reported that parents’ and elementary school students’ expectancies of success affected science achievement. On the other hand, expectancy-value constructs were revealed to be important determinants of students’ sustained engagement in physics, which affected their decision to continue studying physics (Abraham & Barker, 2014). In the same vein, Ball et al. (2017) investigated the relationship between students’ academic expectancies, task values, and STEM attitudes. They concluded that intrinsic values and utility values were the constructs that most influenced students’ positive attitudes towards STEM disciplines.

In short, the expectancy-value model (Eccles, 2005; Eccles & Wigfield, 1995) stands as an influential theoretical framework for understanding and predicting the relationship between motivational factors and science education outcomes (Abraham & Barker, 2014; Andersen & Ward, 2014; Caspi et al., 2019; Wang & Degol, 2013). As a result, interventions using inquiry-based teaching methodologies that signal the interesting, fun, and useful aspect of science education may infuse the development of students’ achievement motivations (Aguilera & Perales-Palacios, 2020; Ball et al., 2017).

Method

Research Design and Participants

A single-blinded, randomized post-test-only control group design was adopted, with two treatment conditions (confirmation and structured inquiry) and one control group (lectures) (Shadish et al., 2002). Hence, participants were naïve to their assigned pedagogical condition. A statistical power analysis was performed for sample size estimation using G*Power software (Faul et al., 2007). A meta-analysis found a medium effect size for inquiry-based interventions (Aguilera & Perales-Palacios, 2020). Therefore, for alpha = .05 and power = 80%, it was determined that a medium effect size, corresponding to a partial eta squared of .06 (Cohen, 1988), would require a minimum sample size of 102 participants (34 per pedagogical condition) for estimating global effects across the three pedagogical conditions (control, confirmation, and structured inquiry) and two outcome variables (expectancies of success and intrinsic values).
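The effect-size benchmark above can be related to the more familiar Cohen's f with a short calculation. The following Python sketch is illustrative only: it does not reproduce the G*Power routine for MANOVA global effects, and the function names are hypothetical. It simply converts the partial eta squared of .06 quoted in the text into Cohen's f and splits the reported minimum sample across the three conditions.

```python
import math


def cohens_f_from_partial_eta_squared(eta_sq: float) -> float:
    """Convert partial eta squared to Cohen's f: f = sqrt(eta^2 / (1 - eta^2))."""
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_sq / (1.0 - eta_sq))


def per_group_n(total_n: int, n_groups: int) -> int:
    """Smallest equal group size that reaches at least total_n participants."""
    return math.ceil(total_n / n_groups)


# Values taken from the text: partial eta squared = .06, N = 102, three conditions.
f = cohens_f_from_partial_eta_squared(0.06)
n_per_condition = per_group_n(102, 3)
print(f"Cohen's f = {f:.3f}; participants per condition = {n_per_condition}")
```

The resulting f is close to .25, the value Cohen (1988) labels a medium effect, which is consistent with the benchmark used in the power analysis.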

A total of 119 students (53.8% girls) enrolled in the sixth grade of elementary education participated in this study. The sample was drawn from six classrooms in three elementary schools from an urban area of Burgos, a medium-sized city located in central-northern Spain. The mean age of the participants was 11.25 years (SD = .43). Using classroom clusters, participants were assigned to the lecture-based control group (n = 39), confirmation inquiry (n = 37), or structured inquiry (n = 43). Each school hosted two different conditions (control–confirmation, control–structured, or confirmation–structured) to minimize possible differences between the school environments and student clientele. All students attended the entire intervention; none withdrew from the study or was absent from any session.
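One way to satisfy the constraint that every school hosts two different conditions is to shuffle the three condition pairs across schools and then shuffle each pair across the two classrooms of a school. The sketch below is an assumption about how such a cluster assignment could be carried out; the text does not describe the actual randomization procedure, and the school and classroom labels are hypothetical.

```python
import random

# The three condition pairs named in the text.
CONDITION_PAIRS = [
    ("control", "confirmation"),
    ("control", "structured"),
    ("confirmation", "structured"),
]


def assign_clusters(schools: dict, seed=None) -> dict:
    """Randomly give each school one condition pair, one condition per classroom."""
    rng = random.Random(seed)
    pairs = CONDITION_PAIRS[:]
    rng.shuffle(pairs)  # which school receives which pair
    assignment = {}
    for (school, classrooms), pair in zip(schools.items(), pairs):
        conditions = list(pair)
        rng.shuffle(conditions)  # which classroom receives which condition
        for classroom, condition in zip(classrooms, conditions):
            assignment[(school, classroom)] = condition
    return assignment


# Hypothetical labels: three schools with two sixth-grade classrooms each.
schools = {
    "School A": ["6A", "6B"],
    "School B": ["6A", "6B"],
    "School C": ["6A", "6B"],
}
assignment = assign_clusters(schools, seed=2022)
for cluster, condition in assignment.items():
    print(cluster, condition)
```

By construction, each condition is delivered in exactly two classrooms located in two different schools.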

Educational Intervention

Two short-term units of three 60-min teaching sessions were designed to reflect the Spanish science curricula and the conventional context (see Online Resource 1 for details). The units addressed the curricular content of inclined planes and air resistance, respectively. Table 1 summarizes the differences and similarities between pedagogical conditions. It should be noted that this study considered the different instructional elements of inquiry as a cohesive product. Therefore, no attempt was made to isolate and test each of the features of the inquiry teaching strategy separately. This choice is intended to represent the teaching practices of the Spanish context and to reflect how different types of inquiry are being defined and have been investigated in the literature (for a discussion, see Zhang et al., 2021). In particular, science teaching in Spain is still primarily lecture-based, with little teacher demonstration or student experimentation or manipulation of lab material (Gil-Flores, 2014; Romero-Ariza et al., 2019; Toma et al., 2017). On the other hand, research has explored inquiry and its elements by designing interventions that are exemplary of different types of inquiry, thereby simultaneously testing different instructional elements common to inquiry strategies.

Table 1 Similarities and differences between pedagogical conditions

Control group students were introduced to this content through lectures, using the textbook as the main teaching resource. Consistent with the conventional Spanish educational milieu (Romero-Ariza et al., 2019), students in the control group did not engage in any investigation and did not address any research question, experimental procedure, or investigation results. Rather, teachers in the control groups used lecture-based strategies such as reading from the textbook, whole class explanations, short written quizzes, and written gap-filling exercises.

On the other hand, inquiry strategies were used in the treatment groups. Classroom teachers used instruction manuals and necessary materials facilitated by the researcher to ensure standardized implementation of the units and increase implementation fidelity. Specifically, the first inquiry-based unit addressed inclined planes through the following research question: What factors affect the amount of force that must be used to move an object on an inclined plane? The experimental procedure consisted of designing inclined planes of different angles, lengths, and roughness and recording the amount of force needed, using a dynamometer, to move a block up the ramps. The second inquiry-based unit introduced air resistance concepts through the following research question: What factors influence the descending speed of a parachute? The experimental procedure consisted of designing parachutes of different sizes, shapes, rope lengths, and materials and recording their descent time.

Consistent with the inquiry continuum (Banchi & Bell, 2008; Vorholzer & von Aufschnaiter, 2019), students in confirmation and structured inquiry pedagogical conditions received the research question and experimental protocol from the classroom teachers. Furthermore, only the students in the confirmation inquiry groups were introduced to the results before the experimentation was carried out. Thus, they knew in advance what they were going to find out during the experimental procedure. The structured inquiry groups, on the other hand, were not given explanations about the outcomes of the experimental procedure, thus engaging in the protocol without knowing the answer to the research question. Also, in the structured inquiry, the research questions and experimental methodologies were introduced after scaffolding oral questions to the students, such as What research question can we formulate concerning this phenomenon? and What experimental procedure can we use to answer this question? Since they also enacted the established protocol, structured inquiry students did not have any opportunity to discuss their research questions or explore their own designed experimental procedures. Therefore, the sole difference between the two treatment conditions was that the results of the experiment were withheld from students in the structured group, whereas they were disclosed to students in the confirmation group before the experimental procedure.

Data Collection Instrument

A series of literature reviews on measurement instruments on attitudes and motivation found a shortage in questionnaires of adequate theoretical underpinnings and solid evidence of validity and reliability (Toma & Lederman, 2020; Blalock et al., 2008; Potvin & Hasni, 2014). This gap is further exacerbated in the Spanish context (Toma, 2021a). Hence, an ad hoc measure rooted in the expectancy-value literature was designed (Eccles & Wigfield, 2020; Wigfield & Eccles, 2020). The procedure for the design and psychometric evaluation of the instrument used in this study is detailed in Online Resource 2.

In short, items were adapted following cross-cultural validation procedures (Beaton et al., 2000). This included forward translation into Spanish, back-translation into the source language, committee revision of the items (the author and the six teachers from the participating schools), and a think-aloud pilot study with six students to assess item comprehension. This process resulted in a nine-item questionnaire, administered using a 5-point Likert scale (1 = totally disagree; 2 = disagree; 3 = neither disagree nor agree; 4 = agree; 5 = totally agree). Exploratory and confirmatory factor analyses provided evidence for construct validity, and Cronbach’s alpha provided evidence for internal consistency reliability. The first construct measured intrinsic values through four items, such as “I enjoy school science” and “I am interested in the things I learn in school science”. The second construct measured expectancies of success through five items, such as “I can get good grades in school science” and “School science is easy for me”. This factor structure yielded a good model fit and adequate reliability consistent with the expectancy-value model (Eccles & Wigfield, 2020). The questionnaires were administered in paper and pencil format, and they were double-checked before being handed in by the students to ensure that all items had been answered.
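To illustrate how responses to such a questionnaire are typically summarized, the following Python sketch computes a scale score as the mean of a student's item responses and Cronbach's alpha for one construct. The response data are invented for illustration and the function names are hypothetical; the study's actual scoring and psychometric procedures are those detailed in Online Resource 2.

```python
from statistics import mean, variance


def scale_score(item_responses: list[int]) -> float:
    """Mean of one student's Likert responses (1 to 5) on a single construct."""
    return mean(item_responses)


def cronbach_alpha(responses: list[list[int]]) -> float:
    """Cronbach's alpha; one row per student, one column per item of a construct."""
    k = len(responses[0])
    # Sample variance of each item across students.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the students' summed scores.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented responses of five students to the four intrinsic value items.
intrinsic_items = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
]
print([scale_score(row) for row in intrinsic_items])
print(round(cronbach_alpha(intrinsic_items), 2))
```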

Data Analysis Plan

Multilevel analysis was not feasible because the sample size and the number of clusters were below what such a procedure requires (McNeish & Stapleton, 2016). Hence, data were analyzed using a two-way multivariate analysis of variance (MANOVA), since expectancies of success and intrinsic values are conceptually related (Eccles & Wigfield, 2020). Pedagogical conditions (control vs. confirmation vs. structured) and gender were the independent variables. Expectancies of success and intrinsic values were the dependent variables. Preliminary checks of assumptions supported the use of MANOVA (Knapp, 2018). Three cases were deleted as univariate outliers. Mahalanobis distance revealed no multivariate outliers. Score distributions satisfied the criterion of normality. The correlation between the dependent variables (r = .38) indicated no multicollinearity or singularity. Box’s test indicated that the assumption of homogeneity of variance–covariance was not violated (M = 20.07, p = .208). Levene’s test suggested that the assumption of homogeneity of variance was met (p = .366 to .919).
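The multivariate outlier check mentioned above can be sketched for two dependent variables as follows. The scores are invented, the function names are hypothetical, and the p < .001 chi-square cut-off is a common convention assumed here, since the text does not state which criterion was applied.

```python
import math
from statistics import mean


def mahalanobis_sq_2d(points: list[tuple[float, float]]) -> list[float]:
    """Squared Mahalanobis distance of each (x, y) pair from the sample centroid."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Sample (n - 1) variances and covariance of the two variables.
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of the 2 x 2 covariance matrix applied to each deviation.
    return [
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in points
    ]


def chi2_critical_df2(alpha: float) -> float:
    """Chi-square critical value for 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(alpha)


# Invented (expectancies of success, intrinsic value) scale scores.
scores = [(3.2, 4.0), (4.1, 3.8), (2.5, 3.0), (4.8, 4.6), (3.9, 2.7), (3.0, 3.5)]
d2 = mahalanobis_sq_2d(scores)
cutoff = chi2_critical_df2(0.001)  # assumed criterion, not stated in the text
outliers = [pair for pair, d in zip(scores, d2) if d > cutoff]
print(outliers)
```

Cases whose squared distance exceeds the cut-off would be flagged as multivariate outliers; with two dependent variables, the degrees of freedom of the reference chi-square distribution equal two.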

Validity of the Study

Measures were taken to reduce threats to the internal and external validity of this investigation (Shadish et al., 2002): (i) an adequate sample size to avoid an under- or over-powered study design; (ii) an intervention that represented the Spanish educational system and was delivered by the classroom teachers within the natural context; (iii) implementation fidelity monitored through non-participant observations; (iv) control of maturation, selection threat, and pre-existing differences between groups by randomly assigning classroom clusters to pedagogical conditions; and (v) an adequate research design (a randomized post-test-only control group design) to avoid testing effect threats (i.e., participants remembering their first answers or adapting their behavior because they are being tested), especially considering the short duration of the intervention.

Results

There were no secondary interaction effects between gender and pedagogical conditions, Pillai’s Trace < .00, F (4, 220) = .09, p = .99. Thus, the intervention had no differential effect for girls and boys. Likewise, the main effect for gender was not statistically significant, Pillai’s Trace = .001, F (2, 109) = .071, p = .93. Hence, no significant differences were found between girls and boys in any pedagogical condition.

Students in the structured inquiry group reported marginally higher scores than their counterparts in the confirmation inquiry and control group (see Fig. 1). However, the main effect for the pedagogical condition was not statistically significant either, Pillai’s Trace = .05, F (4, 220) = 1.51, p = .20. This suggests that pedagogical condition groups did not differ in expectancies of success and intrinsic values after participating in the two intervention units.

Fig. 1 Means and standard deviation for pedagogical conditions
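The degrees of freedom reported above can be checked against the standard F approximation for Pillai's trace. The sketch below assumes 116 analyzed cases (119 participants minus the three deleted outliers) in a 3 × 2 design, which gives 110 error degrees of freedom; this figure is inferred from the text rather than stated in it, and the function name is hypothetical.

```python
def pillai_f_approx(V: float, p: int, q: int, v: int) -> tuple[float, float, float]:
    """F approximation for Pillai's trace.

    V: Pillai's trace; p: number of dependent variables;
    q: hypothesis degrees of freedom; v: error degrees of freedom.
    Returns (F, df1, df2).
    """
    s = min(p, q)
    m = (abs(p - q) - 1) / 2
    n = (v - p - 1) / 2
    df1 = s * (2 * m + s + 1)
    df2 = s * (2 * n + s + 1)
    F = (df2 / df1) * V / (s - V)
    return F, df1, df2


# Assumption: 116 cases in 3 conditions x 2 genders leave 116 - 6 = 110 error df.
ERROR_DF = 110

# Main effect of pedagogical condition: two dependent variables, hypothesis df = 2.
f_cond, df1_cond, df2_cond = pillai_f_approx(0.05, p=2, q=2, v=ERROR_DF)
# Main effect of gender: two dependent variables, hypothesis df = 1.
f_gender, df1_gender, df2_gender = pillai_f_approx(0.001, p=2, q=1, v=ERROR_DF)

print(df1_cond, df2_cond, df1_gender, df2_gender)
```

Under these assumptions the approximation yields the same degrees of freedom as reported, namely (4, 220) for pedagogical condition and (2, 109) for gender. Because the traces are reported to two or three decimals, F values recomputed from them only approximate the reported ones.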

Discussion

Guided and open-ended inquiry has been perennially advocated as an effective teaching strategy (Crawford, 2014). Indeed, a recent meta-analysis reported a moderate effect size of inquiry on students’ attitudes (Aguilera & Perales-Palacios, 2020). Similarly, Weisgram and Bigler (2006) found that hands-on science activities performed during out-of-school workshops improved students’ achievement motivations. Likewise, Fielding-Wells et al. (2017) identified features of inquiry as critical to increasing students’ success expectations (i.e., exploration, open-endedness, and iterative trial and error) and values (i.e., increased autonomy and control).

However, given major constraints, science educators rely on lecture-based teaching strategies, and when inquiry is enacted, confirmation and structured units are adopted (Gil-Flores, 2014; Romero-Ariza et al., 2019). Yet, the value of such approaches has been neglected in the literature. Therefore, this study explored the impact of short-term, confirmation, and structured inquiry instruction on elementary students’ achievement motivations. Based on expectancy-value theory (Eccles & Wigfield, 2020; Wigfield & Eccles, 2020), there were no differences in achievement motivation between students who participated in the confirmation or structured inquiry pedagogical condition compared to the control, lecture-based groups. Hence, the findings of this investigation provide no empirical support to the contention that such approaches could be beneficial for students’ expectancies of success and intrinsic values in school science. Results of this study should be interpreted with the understanding that pedagogical conditions differed in several aspects, thus representing each type of inquiry as a whole product. As a result, conclusions about the (in)effectiveness of each component of inquiry teaching on its own should be avoided. Future research that isolates instructional features of inquiry-based teaching is desired, as it has the ability to reveal specifics about what is effective and what is not (Zhang, 2019).

This study stands at odds with previous research on inquiry (Bunterm et al., 2014; Conradty & Bogner, 2019; Kuo et al., 2020; Schmid & Bogner, 2017). However, any such comparisons should be made with caution. First, previous studies measured different constructs as foci or drives of motivations, and no reviewed investigation measured achievement motivations as depicted by the expectancy-value theory; thus, a comparison of the present findings with previous studies is rather difficult. The one that comes closest to measures of achievement motivations was the study by Schmid and Bogner (2017). Of their three motivational constructs, only self-determination was influenced by the intervention. However, career motivations and self-efficacy, which are conceptually more closely related to expectancies of success measured in this study (Eccles & Wigfield, 2020), were not affected by the intervention. Another aspect to consider is that the present study focused on sixth-grade elementary students. In contrast, Kuo et al.’s (2020) sample was composed of 8th graders; Schmid and Bogner (2017) addressed 9th graders; Bunterm et al.’s (2014) study included 7th and 10th graders; and Sadeh and Zion (2012) focused on 11th and 12th graders. Therefore, the results of the present study are not generalizable to secondary school grades.

The duration of interventions and the amount of teacher guidance also differed between studies. Kuo et al. (2020) adopted guided inquiry for six units of 90–180 min each. Conradty and Bogner’s (2019) intervention lasted only four hours, making it comparable to the one reported in the present study. Yet, it adopted open inquiry strategies. Similarly, the study by Sadeh and Zion (2012) reported on the effects of a multi-year open and guided inquiry intervention. Only the study by Schmid and Bogner (2017) could be directly comparable to the present research. In this respect, their intervention lasted three hours in total and consisted of structured inquiry lessons. In this case, as mentioned, our findings do not align with theirs. Given that existing research on inquiry teaching supports the efficacy of short-term interventions (e.g., Conradty & Bogner, 2019; Schmid & Bogner, 2017), and that the expectancy-value literature establishes that interventions as short as a single class improve task values (for a review, see Rosenzweig et al., 2022), it is possible to conclude that the findings of this study are not due to exogenous variables but to the ineffectiveness of six-hour-long confirmation and structured inquiry interventions in improving achievement motivations.

Implications, Limitations, and Avenues for Future Research

The current investigation constitutes a meaningful contribution to the body of knowledge on inquiry strategies. Although current literature supports the effectiveness of inquiry approaches, caution should be exercised as most studies focused on the secondary education stage and included guided and open inquiry. Such strategies drive teachers to experience most of the constraints related to this pedagogical strategy (Furtak et al., 2012; Lazonder & Harmsen, 2016). Therefore, this study contributes to reducing the gap in the literature on the value, if any, of short-term, confirmation, and structured inquiry approaches. Likewise, no previous studies focused on achievement motivations at elementary grades, as conceptualized by the expectancy-value theory. Thus, to the best of the author’s knowledge, this investigation is one of the first to examine the effect of confirmation and structured inquiry on expectancies of success and intrinsic values in elementary school science. This is essential since evidence indicates that favorable views of science deteriorate as students enter secondary education and that it is difficult to change their scientific aspirations after elementary education (DeWitt & Archer, 2015; Maltese & Tai, 2011; Said et al., 2016; Tytler & Osborne, 2012). Hence, the upper elementary school grades seem to be an appropriate period for the development of educational interventions, as the literature on expectancy-value stresses that achievement motivation can be enhanced in younger students (Eccles & Wigfield, 2020; Wigfield & Eccles, 2020).

Two limitations should be acknowledged. At the time of this study, there were no known expectancy-value instruments for the Spanish context that could be used with elementary school students (cf. Toma, 2021b). As a result, an ad hoc measure was developed by adapting existing expectancy-value questionnaires. This instrument was found to produce valid and reliable results based on factor analyses and Cronbach’s α index. However, while broad items like the ones used in this study are frequent in the expectancy-value literature (e.g., Andrews et al., 2017; Ball et al., 2017; Kosovich et al., 2015; Wigfield & Cambria, 2010), the expectancies of success items were likely limited in capturing changes in students’ deeply ingrained beliefs, especially given the short duration of the intervention. Items referring to the curricular content addressed in each unit could therefore have been more effective. For example, instead of “I do great in school science”, more content-specific items could have been used, such as “I can explain what determines the speed for parachute falling”. This aspect is worth investigating in future studies.

The second limitation is related to the sample. Students participating in this study were nested in six classrooms from three different schools. A hierarchical multilevel analysis would have revealed information about the possible influence of class and school membership on students’ achievement motivation. However, as stated in the “Method” section, such an analysis required a larger sample size and more clusters, which were beyond the bounds of this study’s resources.

Despite these limitations, the findings of this study have major implications. Science educators tend to make use of confirmation and structured inquiry (Lazonder & Harmsen, 2016). This is the case in Spain, where the new standards for science education embrace inquiry pedagogies (LOMCE, 2013; Romero-Ariza et al., 2019). Given the findings of this study, efforts to improve teaching practices through confirmation and structured inquiry may be futile. For this reason, educational measures should be taken to improve both pre- and in-service teachers’ pedagogical content knowledge of guided or open-ended inquiry approaches instead, for which there is more evidence of effectiveness (Aguilera & Perales-Palacios, 2020; Bunterm et al., 2014; Kuo et al., 2020; Sadeh & Zion, 2012). Likewise, future studies on confirmation and structured inquiry are warranted. In this regard, additional research is needed to determine their educational value for low-achieving students. Similarly, studies examining the implementation of short-term, confirmation, and structured inquiry units over a semester or academic year are required. In the long run, sporadic episodes of inquiry learning are more likely to have a beneficial effect on students’ motivation than just one intervention.