Introduction

Human reasoning has been a topic of widespread study since the time of Aristotle and is still a key area in psychological, theoretical and empirical research today. Reasoning plays an essential role in both educational settings and the workplace in modern society, where more information and knowledge created in a shorter time place us under increasing pressure to manage it all. Among forms of reasoning, inductive reasoning (IR) and scientific reasoning (SR) are increasingly considered in school contexts (Van Vo, 2022). Instead of learning extensive content knowledge in individual subjects, students need to be equipped with more general thinking skills to manage information. Thus, mastery of reasoning skills is a central goal of STEM (science, technology, engineering and mathematics) curricula that contribute to cultivating twenty-first century competencies (Voogt & Roblin, 2012). Additionally, students with greater motivation to reach their aims are more likely to succeed than those who are more talented but do not set their own goals and keep focusing on them (Duckworth et al., 2011). In learning science, motivation is not only the main factor in explaining science attitude, but also an essential predictor of science learning performance (Chan & Norlizah, 2018; Patrick et al., 2009) and academic achievement (Cavas, 2011; Dermitzaki et al., 2013).

Children’s activities in school and family environments have an influence on academic achievement (Pintrich & Schunk, 2002). Previous studies have shown that cognitive abilities (i.e. reasoning), motivational behaviour (i.e. motivation toward learning) and background (i.e. age and parents’ education) influence students’ STEM performance in schools. IR, one of the core components of fluid intelligence, has predicted children’s learning ability as well as intelligence (Mayer et al., 2014; Strobel et al., 2019) and has been confirmed as a significant predictor of academic performance (Chuang & She, 2013; Van Vo & Csapó, 2022). Empirical studies have indicated that students who score higher on an IR test tend to achieve better results in mathematics and science in schools (Adey & Csapó, 2012; Nunes & Csapó, 2011; Strobel et al., 2019). Similarly, SR, a type of reasoning skill integrated into domain-specific science subjects, has shown a strong positive correlation with children’s science achievement (Coletta & Phillips, 2005; Lawson, 2000).

A better understanding of cognitive development, motivation trends and the role of parental factors in students’ academic achievement is fundamental to individualised teaching in school practice. However, research on the links between IR, SR and science motivation (SM), as well as parental factors, in predicting students’ STEM achievement is rather scarce. Thus, this cross-sectional study aims to investigate children’s patterns of IR, SR and SM across grade cohorts. Furthermore, we examine the extent to which these variables interact with parental factors (i.e. parents’ education and parental involvement) in predicting students’ STEM achievement at different grade levels.

Theoretical Background

Inductive Reasoning

Inductive reasoning refers to a cognitive process in which particular facts or individual cases are gathered to establish a general conclusion (Adey & Csapó, 2012; Sternberg & Sternberg, 2012). According to Kinshuk et al. (2006), IR can be considered one of the seven primary mental skills that contribute to intelligence. Empirical researchers have demonstrated the importance of IR in learning ability, particularly in science (Adey & Csapó, 2012) and mathematics (Nunes & Csapó, 2011). It is closely aligned with scientific reasoning (Mayer et al., 2014) and problem-solving skills (e.g. Molnár et al., 2013; Schweizer et al., 2013). Generally, the tasks to assess IR ability may be grouped into four main categories: analogies, series completion, classifications and matrices (Adey & Csapó, 2012; Sternberg & Sternberg, 2012; Van Vo & Csapó, 2022). A non-verbal task is most frequently used to assess IR capacity (Van Vo & Csapó, 2022). It is common to use these types of tasks on intelligence tests, e.g. Raven’s Standard Progressive Matrices (McCallum, 2017).

Scientific Reasoning

With a growing emphasis on science education, a further form of reasoning has become a focal research topic; the term SR has been used with a domain-specific approach to science subjects. According to Lawson (2009), SR is one of the foundational pillars of scientific literacy, along with content knowledge in science. SR is defined as an active procedure of interrelating a series of reasoning patterns (Lawson, 2004) and metacognitive processes to generate, test, and adjust theories and hypotheses (Zimmerman, 2007), in which analogical reasoning is used to generate hypotheses and then combinatorial reasoning is applied to create a list of possible combinations of hypotheses (Lawson, 2004). Inquiry activities support the process of knowledge acquisition and knowledge application, which contributes to the development of SR abilities. In turn, integration of scientific reasoning into inquiry activities can enhance both students’ science content knowledge and their scientific reasoning (Chen & She, 2015; Lawson, 2004). Although reasoning abilities are naturally present in early childhood, these capacities can be cultivated in the long term through the stimuli and information process in educational contexts (Köksal-tuncer & Sodian, 2018; Zimmerman, 2000, 2007). Therefore, the main goals of an interdisciplinary approach among subjects tend to focus on developing both content knowledge and SR ability in school-age children.

Several types of tasks can be used to assess SR ability, including conservation, control of variables, proportions and ratios, probability, correlational reasoning, combinatorial reasoning and hypothetico-deductive reasoning (e.g. Adey & Csapó, 2012; Han, 2013; Lawson, 2000). As discussed above, the term SR refers to applying cognitive abilities, including domain-specific forms of reasoning in particular science subjects, so some researchers (e.g. Korom et al., 2017) have considered IR tasks whose elements are built from science content to be a kind of scientific reasoning task. To complete these tasks, participants need to apply IR skills to specific science content knowledge.

Science Motivation

Motivation toward learning results from the relative dynamics of dispositional and contextual variables (Pintrich & Schunk, 2002). According to Garcia and Pintrich (1995), academic motivation is determined by students’ goals, their perception of their own skills and their perception of criticism. Meanwhile, in socio-cognitive theory, the interaction between environment and students’ background variables is often examined in conjunction with motivation. Numerous previous studies have demonstrated that students who reported higher motivation tended to perform better in learning science disciplines (e.g. Cavas, 2011; Chan & Norlizah, 2018; Dermitzaki et al., 2013). The 5-year panel investigation by Hwang et al. (2016) also found a longitudinal causal relationship between school achievement and self-efficacy. Several instruments have been recommended in the literature for measuring SM. For example, Glynn and his colleagues (2009) proposed the Science Motivation Questionnaire II to examine five motivational factors: intrinsic motivation, self-determination, self-efficacy, career motivation and grade motivation. Additionally, Tuan et al. (2005) developed the students’ motivation towards science learning (SMTSL) questionnaire, which combines constructivist learning and motivation and uses a 5-point Likert format, to assess science motivation in self-efficacy, active learning strategies, science learning value, performance goals, achievement goals and learning environment stimulation.

Research on the Development of Reasoning and Motivation

Children’s reasoning capacities are influenced by their physical development and social experience (Kwon & Lawson, 2000). Empirical studies have confirmed that children’s IR capacity develops grade by grade, but growth rates appear to differ between grade levels. Specifically, Muniz et al. (2012) showed that students improved their IR capacity in primary school with a steady development in the 3rd–11th-grade period, but the most rapid development was noted between the 6th and 7th grades (12–14 years) (Csapó, 1997; Díaz-Morales & Escribano, 2013; Molnár et al., 2013; Van Vo & Csapó, 2020). Likewise, SR increased gradually through the general education level (Korom et al., 2017; Tairab, 2015), but the growth rate tended to decrease after the 9th grade (14 years) (Ding, 2018; Kwon & Lawson, 2000), with only a little improvement observed across the four years of tertiary education (Ding et al., 2016; Han, 2013).

Both contextual and cognitive developmental factors have a considerable impact on children’s academic achievement (Anderman & Dawson, 2011). The initial passion for formal learning dissipates through middle school before dropping dramatically in high school (Hoffman, 2015). Furthermore, empirical investigations (e.g. Dorfman & Fortus, 2019; Józsa et al., 2017) found that students’ SM tended to drop as grade levels rose. A 2-year longitudinal study by Bouffard et al. (2001) also observed that self-efficacy and learning strategies declined in students after reaching upper secondary school. Meanwhile, a 5-year longitudinal investigation by Gottfried et al. (2001) showed that academic intrinsic motivation fell in a linear pattern from the middle elementary to the high school years, but the rate of decline varied depending on the subject matter, with mathematics suffering the largest drop, while social studies remained unchanged.

The Role of Reasoning Skills, Motivation and Parental Factors in Predicting STEM Performance

Empirical studies have indicated that IR has a close relation to SR (e.g. Greiff & Neubert, 2014; Mayer et al., 2014; Molnár et al., 2013; Rudolph et al., 2018). Specifically, IR was a main determinant that notably contributed to explaining individual differences in students’ scientific reasoning competencies in elementary schools (Mayer et al., 2014). Furthermore, a study by Molnár et al. (2013) showed that SR has a significant influence on students’ performance on IR across the 5th- and 7th-grade levels. A majority of empirical studies have confirmed the important role of both IR and SR in predicting STEM performance in schools (Boroş & Sas, 2011; Mollohan, 2015; Stender et al., 2018; Venville & Oliver, 2015). Salihu et al. (2018) reported that IR and mathematics achievement showed a significant positive correlation. SR was also found to be an essential skill for learning science in general and in separate subjects (mathematics, physics, biology and chemistry) (Zimmerman, 2000). SR was one of the significant factors in explaining children’s achievement in learning biology (Lawson, 2000) and physics (Coletta & Phillips, 2005; Van Vo & Csapó, 2021b).

Moreover, De Silva et al. (2018) demonstrated that students’ self-regulation predicted their success in learning science. A meta-analysis review by Kriegbaum et al. (2018) found that intelligence and motivation were predictive factors of academic success. Steinmayr and Spinath (2009) revealed that both intelligence and motivation were significant factors in predicting children’s school success, in which motivation explained school achievement better than intelligence. Furthermore, the recent empirical study by Van Vo and Csapó (2021a) confirmed that there was a link between IR and science motivation, but not a strong one.

As regards parents’ education, a review study by Van Bavel et al. (2018) showed that the gap in parents’ education within families has narrowed and even reversed in most Western and many non-Western countries in recent decades. Most existing studies have found that parents’ education positively affected children’s reasoning abilities, academic motivation, and school performance. For example, Kong et al. (2015) confirmed that parents’ education had a positive relation with children’s fluid intelligence ability, in which the mother’s education played a more important role than the father’s. The higher the education level of their mother (Mousa & Molnár, 2020) and father (Van Vo & Csapó, 2020), the higher the scores children achieved on the IR tests.

In addition, parental involvement in schooling was positively linked to students’ motivation toward learning science (Fan et al., 2012; Gonzalez-DeHass et al., 2005; Van Vo & Csapó, 2021a). A longitudinal study by Fan and Williams (2010) demonstrated that a number of interactive variables, including intrinsic motivation, parental involvement and engagement, had an aggregate impact on learning mathematics. Likewise, parents’ involvement in school-related activities and their interest in their children’s school activities positively correlated with students’ motivation and achievement in learning science (Organization for Economic Cooperation and Development [OECD], 2017a; Shapira-Lishchinsky & Zavelevsky, 2020). Furthermore, a study by Ganzach (2000) demonstrated that parents’ education, cognitive abilities and motivation all influence academic performance, in which children’s cognitive ability has a positive relationship with their mother’s education level but not with that of their father.

The Study Context

There are four levels in the Vietnamese national education system: early childhood education (nursery and kindergarten), general education (primary education, lower secondary education and upper secondary education), professional education (professional secondary education and vocational training) and higher education (college undergraduate, master’s and doctoral courses) (Vietnam National Assembly, 2006). All children aged 6 to 10 years must complete the primary education level. Children aged 11 begin 4 years (6th to 9th grades) at the lower secondary level, while upper secondary education is for students mostly aged 15 to 18 years. The national education programme involves the same objectives, content, curriculum, textbooks and regulations related to completion in public institutions across the country (UNESCO, 2011).

In principle, thinking and reasoning are implicitly embedded in the core curricula. However, in Vietnamese schools, teaching and learning are often criticised for being exam-based and focusing more on passing exams than using knowledge in practice (Nhat et al., 2018). This cross-sectional study attempted to provide a partial picture to estimate the effects of reasoning abilities, students’ science motivation and parental factors in learning STEM subjects at the secondary education level in the context of Vietnam.

Research Questions

The foci of our study are to explore the changing patterns of IR, SR and SM across the grade cohorts. We also examine the extent to which IR, SR, SM and parental factors (e.g. parents’ education and parental involvement) can predict STEM performance in secondary school students. Hence, our adapted instruments were tailored to address two central research questions:

  1. What are the differences between the IR, SR and SM of the students in different grades?
  2. To what extent do IR, SR and SM interact with parental factors (i.e. parents’ education and parental involvement) in predicting STEM achievement across grade levels?

Methods

Participants

The study assessed 726 students (boys: 47.9%; girls: 52.1%) in six public schools in the southern province of An Giang (Vietnam). The average age of the participants was 14.0 years. To ensure that every cohort properly represented the grade level, we attempted to involve two different schools (one from the city centre and another from the outskirts) with at least two classes per school for each grade-level cohort. We used cluster-based probability sampling: from a list of 146 potential classes provided by the principals of the participating schools, 19 intact classes were selected at random. Table 1 presents the features of the participants in each grade level and in the whole sample. The main study was conducted in the first semester of the 2019–2020 academic year. The students completed the test instruments in 45 min under the supervision of their teachers and our assistant teachers. The students took the tests and questionnaires in either paper-and-pencil or online format. In total, only 27.1% of the study participants took part in the online administration mode. With respect to the socioeconomic factors of the grade cohorts, Table 2 shows the distribution of the education levels of the parents of those in the study sample. The education levels of the parents of the lower secondary school students appeared to be somewhat higher than those of the upper secondary school students.

Table 1 The participants
Table 2 Parents’ education levels

Instruments

Inductive Reasoning Test

To measure IR, we included 20 items from the item bank developed at the University of Szeged. The items covered four subtests: figure series completion, figure analogies, number series and number analogies (Korom et al., 2017; Pásztor, 2016). The reliability and validity of the test items have been confirmed through assessments in a number of countries (Hungary, Finland, Namibia, China, Indonesia and Vietnam) (e.g. Csapó et al., 2019; Kambeyo & Wu, 2018; Korom et al., 2017; Molnár & Csapó, 2011; Wu & Molnár, 2018).

Item selection was based on the construction of each item and on empirical evidence from previous studies. Each item contains a set of elements (figures or numbers) organised according to particular rules. The diversity of items was the first concern: we wanted to avoid including many questions built on similar rules. Another consideration was the item difficulty index from previous studies, including a previous study in Vietnam (Van Vo & Csapó, 2020). Finally, 20 items were chosen for the IR test in this study (see two item samples in Fig. 1). Each incorrect answer was scored 0 and each correct answer 1.

Fig. 1 Sample items on the IR test

Scientific Reasoning Test

We compiled 18 items to measure SR, covering the main task types, such as conservation, classification, proportional reasoning and correlational reasoning. The test items contain content knowledge of basic concepts in science subjects in the secondary educational curricula in Vietnam. Most of the items were adapted and translated into Vietnamese from the original test by Korom et al. (2017). Two items each were adapted from the Lawson Classroom Test of Scientific Reasoning (Lawson, 2000) and the scientific reasoning test by Hanson (2016), and only one new item was composed by the authors. The items were modified into a multiple-choice format, with each item containing a correct answer and three distractors (Fig. 2). We used a multiple-choice format because it is the most popular type in measuring SR ability (Opitz et al., 2017) and the opportunity for precision increases when measuring a larger number of respondents with smaller effect sizes than those of other question formats (Schwichow et al., 2016). We also endeavoured to minimize the impact of the students’ reading ability levels by reducing the texts and using more visualized representations with figures, tables and graphs.

Fig. 2 Sample items on the scientific reasoning test

Students’ Motivation Toward Science Learning Questionnaire

The questionnaire was adapted from the original SMTSL developed by Tuan et al. (2005). This adapted version contains 24 items that measure five motivational factors. The self-efficacy subscale measures the extent to which students believe they are capable of learning science. A student’s ability to apply various learning strategies to develop new knowledge is measured by the active learning strategies subscale. Science learning value considers students’ perceptions of the value of science courses in their daily lives. Students’ satisfaction with their performance in science learning is measured by the achievement goals subscale. Learning environment stimulation involves the curriculum, teaching by teachers and the learning environment, all of which motivate students to learn science.

Numerous previous studies have provided evidence that the SMTSL questionnaire is reliable and valid in cross-cultural contexts (e.g. Cavas, 2011; Chan & Norlizah, 2018; Dermitzaki et al., 2013; Shaakumeni & Csapó, 2018; Tuan et al., 2005).

Background Questionnaire

We adapted the student questionnaire from PISA 2015 (OECD, 2017c) and translated it into Vietnamese. The questionnaire was designed to collect data on the students’ background, parents’ education level and parental involvement. Additionally, the students were asked to rate their perceptions of their parents’ involvement as it relates to their level of support, engagement and interest in school activities. The students also reported their grades on subject tests in the previous term. The background questionnaire is the first section of our test instrument.

Procedure and Data Analysis

Two experts and three secondary school teachers were invited to review the draft test instruments before we conducted the field study. They helped us check for any language issues and review the relevant subject content of the items on the instruments. Three secondary school teachers revised and recommended the tests, as they saw the aims of the inquiry as being consistent with those of the current programme at the secondary education level in Vietnam. A pilot study was carried out with the final version among seven students in two public schools (one lower and one upper secondary school). We observed the students while they did the mock test. The items were then discussed and slightly adjusted before conducting the main study.

The testing process was administered in either paper-based or computer-based modes, depending on the particular conditions and timetables at the participating schools. The students were also asked to prepare some information (i.e. the marks they received on their science subject tests in the previous semester) the day before we conducted the study. Paper-and-pencil mode was used in the classrooms, and each student received a test booklet along with an answer sheet. For the online test, each student had a link and an individual code to access the Electronic Diagnostic Assessment System (eDia) (Csapó & Molnár, 2019). Two teachers provided technology support and observed the students during the testing process. The online instruments were operated via the University of Szeged servers.

As regards the reliability estimates of the instruments, we used two internal consistency indicators, McDonald’s omega (ω) and Cronbach’s alpha (α), computed with the R package psych (Revelle, 2019). The omega values for the IR test, SR test, SMTSL questionnaire and parental involvement questionnaire are 0.81 (α = 0.80), 0.65 (α = 0.64), 0.90 (α = 0.88) and 0.77 (α = 0.74), respectively, implying that they are acceptable in terms of internal consistency reliability.
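As an illustration of the alpha coefficient reported above, the following minimal sketch computes Cronbach’s alpha from a respondents-by-items score matrix using only the Python standard library. It is not the procedure of the psych package used in the study; the function name, the data layout and the toy data are assumptions made for this example, and McDonald’s omega (which requires a factor model) is not covered.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    `scores` is a list of rows, one per respondent; each row holds that
    respondent's score on every item (hypothetical layout for this sketch).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Toy data (invented): four respondents, three dichotomous items that
# always agree with each other, so the scale is perfectly consistent.
toy_scores = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
toy_alpha = cronbach_alpha(toy_scores)
```

With real data, each row would hold one student’s 0/1 item scores (for the tests) or Likert ratings (for the questionnaires).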

We converted the raw scores from the tests and questionnaires into Rasch person parameters on the maximum likelihood estimation (MLE) scale (logits). We employed the Rasch model with ACER ConQuest software on dichotomous items for the tests (Adams & Wu, 2010a) and polytomous items with partial credit model (PCM) analysis for the questionnaire (Adams & Wu, 2010b). The cut-off criteria for an acceptable item infit ranged from 0.77 to 1.30 (Griffin, 2010). The unidimensional model in the Rasch measurement confirmed that all the items on the reasoning tests fitted the data quite well. The infit for individual items ranged from 0.87 to 1.12 on the IR test and from 0.91 to 1.10 (mean = 0.99, SD = 0.07) on the SR test. For the purposes of this investigation, we scaled the students’ performance on the SMTSL questionnaire with PCM in a unidimensional Rasch measurement, and its output was considered science motivation in general (Dermitzaki et al., 2013). The results showed that the items fit the data well overall, with infit values ranging from 0.82 to 1.43 and only one item exceeding 1.30. Similarly, the parental involvement questionnaire was converted into MLE in the Rasch measurement, and the infit indices ranged from 0.90 to 1.16. Additionally, we used a differential item functioning (DIF) analysis to check item-level invariance across administration modes. No DIF item was found on either of the reasoning tests (using the difR package, see Magis et al., 2010), the SMTSL questionnaire or the parental involvement questionnaire (using the R lordif package, see Choi et al., 2011).
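To make the infit criterion concrete, the sketch below implements the dichotomous Rasch response probability and the information-weighted mean square (infit) for a single item, then checks a value against the 0.77 to 1.30 cut-offs quoted above. It is a simplified illustration, not the ACER ConQuest estimation routine: person abilities and item difficulty are taken as given (in logits), and all function names are hypothetical.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a correct answer under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def infit_mean_square(responses, thetas, difficulty):
    """Information-weighted mean square (infit) for a single item.

    responses: 0/1 answers to the item, one per person
    thetas: person ability estimates on the logit scale
    """
    squared_residuals = 0.0
    information = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, difficulty)
        squared_residuals += (x - p) ** 2
        information += p * (1.0 - p)  # model variance of the response
    return squared_residuals / information


def infit_acceptable(infit, lower=0.77, upper=1.30):
    """Check an infit value against the cut-offs quoted in the text."""
    return lower <= infit <= upper
```

Under this check, the infit values of 0.87 to 1.12 reported for the IR test fall inside the band, whereas the single SMTSL item at 1.43 falls outside it.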

Path analyses were carried out to explore the effects among the variables under study. As an extension of multiple regression, path analysis allows for the analysis of more complex models with several dependent variables and “chains” of effects. For instance, in a single-mediator model (Fig. 3), the effect of X on Y is decomposed into a direct effect relating X to Y (path c’) and an indirect effect on Y through the mediator M. The indirect effect involves two paths: path a represents the effect of X on the proposed mediator, and path b is the effect of mediator M on Y, which partially results from the effect of X. The Sobel test is frequently used to test whether a mediator carries the influence of an independent variable to a dependent variable (see Preacher & Hayes, 2008). A path model may include one or more single-mediator models.
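The single-mediator logic described above can be sketched numerically. The example below computes the indirect effect a·b, the first-order Sobel z statistic with a two-sided p-value from the standard normal distribution, and the total effect c’ + a·b of a linear model. The coefficient values in the usage lines are invented for illustration and are not estimates from this study; the function names are hypothetical.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided p-value for the indirect effect a*b."""
    indirect = a * b
    # First-order standard error of the product a*b.
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return indirect, z, p_value


def total_effect(direct, a, b):
    """Total effect of X on Y in a linear single-mediator model: c' + a*b."""
    return direct + a * b


# Invented coefficients, for illustration only.
indirect, z, p_value = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
total = total_effect(direct=0.3, a=0.5, b=0.4)
```

The Sobel test assumes a normally distributed product of coefficients; bootstrapped intervals, as discussed by Preacher and Hayes (2008), are a common alternative when that assumption is doubtful.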

Fig. 3 A single-mediator model

To evaluate model fit in the path analysis, we referred to three main indices, as suggested by Hu and Bentler (1999): the comparative fit index (CFI), goodness of fit index (GFI) and standardized root mean squared residual (SRMR). According to these criteria, a correct model is unlikely to be rejected if the CFI is greater than or equal to 0.96 and the SRMR is less than 0.10. Specifically, CFIs above 0.94 and SRMRs below 0.06 indicate an excellent model fit, while CFIs between 0.90 and 0.94 and SRMRs between 0.06 and 0.10 suggest an acceptable or marginal fit. GFI is also considered a primary indicator of model fit because it measures the fit in absolute terms. An adequate fit of the data would be indicated by consistently high CFI and GFI values. The path analyses were computed with the R package lavaan (Rosseel, 2012). Other packages in R version 3.5.3 (R Core Team, 2019), such as psych (Revelle, 2019) and yarrr (Phillips, 2016), were employed to analyse the results.
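The fit bands described above can be expressed as a simple decision rule. The sketch below labels a model from its CFI and SRMR values; the function name is hypothetical, and the handling of mixed cases (for example, a CFI above 0.94 combined with an SRMR of 0.08) is an assumption of this example, since the text does not specify it.

```python
def classify_fit(cfi, srmr):
    """Label a path model's fit using the CFI and SRMR bands quoted above."""
    if cfi > 0.94 and srmr < 0.06:
        return "excellent"
    # Assumption of this sketch: any model that clears the outer bounds
    # (CFI >= 0.90, SRMR <= 0.10) but is not excellent counts as marginal.
    if cfi >= 0.90 and srmr <= 0.10:
        return "marginal"
    return "poor"
```

For instance, `classify_fit(0.97, 0.04)` yields the excellent label and `classify_fit(0.92, 0.08)` the marginal one.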

Results

Patterns of Performance in Different Grade Cohorts

Pirate plots were used to depict the performance of the participants in the different cohorts. A pirate plot, an improved version of the bar plot and box plot, contains four main elements: points in raw data, bar or line centre, bean density and band inference with a 95% confidence interval (or highest density intervals), so it provides more information than a traditional plot (see Phillips, 2016, 2017). The pirate plots in Fig. 4 illustrate the students’ achievement (MLE scale) on the (a) IR test, (b) SR test and (c) SMTSL questionnaire in each grade cohort.

Fig. 4 Students’ performance across grade levels on (a) the inductive reasoning test, (b) the scientific reasoning test and (c) the SMTSL questionnaire

For the IR test, the students’ mean scores increased remarkably from the 6th (mean = 0.73, SD = 1.25) to the 8th (mean = 1.29, SD = 1.29) grades and on to the 10th (mean = 1.81, SD = 1.17), before dropping slightly in the 11th grade (mean = 1.59, SD = 1.32) (see Fig. 4a). In the 6th grade, although the mean score was lower than that of other grades, some of the students had high proficiency in IR. Both the highest- and lowest-performing participants were in the 8th grade, and the scores were distributed roughly symmetrically around the mean. On average, the 10th graders yielded the highest scores; most of them scored above 0, and this cohort had the lowest standard deviation. The bean density in the pirate plot for the 11th grade was slightly different from that of the other grade groups. Several students achieved scores of around 3.0, but some students in the 11th grade were among the lowest scorers.

Following the same trend as for IR, the students in the older groups performed better than their younger counterparts on the SR test, except in the 11th grade (Fig. 4b). The 6th graders had an average score of 0.06 (SD = 0.84), and some of them received the lowest scores. The students in the 8th grade attained a mean score of 0.78 (SD = 0.92), a dramatic rise compared to the 6th graders. Most of the students who earned the highest scores were in the 8th and 10th grades. The 10th graders achieved the highest mean score (mean = 0.90, SD = 0.75), but this increasing trend seemed to reverse in the 11th grade (mean = 0.55, SD = 0.84).

In contrast, the students’ motivation toward learning science tended to drop slightly across grade levels (Fig. 4c). The 6th graders achieved the highest score (mean = 1.54, SD = 0.86), followed by the 8th graders (mean = 1.15, SD = 0.79). However, motivation was at a similar level in the 10th (mean = 0.87, SD = 0.61) and 11th grades (mean = 0.84, SD = 0.88). Nonetheless, no student in the 10th grade fell into the lowest-score group, while some of the 11th graders scored very high or very low on the SMTSL questionnaire.

Additionally, we employed ANOVA to examine the impact of grade levels on the students’ reasoning proficiency and motivation. The findings showed a significant difference between the grade cohorts on the IR test [F(3, 722) = 20.53, p < 0.001], the SR test [F(3, 722) = 29.52, p < 0.001] and the SMTSL questionnaire [F(3, 722) = 25.1, p < 0.001]. Tukey’s honestly significant difference (HSD) was implemented as the post hoc analysis to test significant differences in pairs of grades. As summarised in Table 3, the older groups achieved higher scores on the reasoning tests than the younger ones, except for the 11th grade. The 8th graders showed significant improvement compared to the 6th graders on both the IR and SR tests. The students in the 10th grade also achieved significantly higher scores on the IR test than those in the 8th grade, but no significant difference was found between these two grades on the SR test. Surprisingly, the 11th graders performed significantly lower on the SR test than the 10th graders. Generally, the students in all the upper grades were significantly less motivated than their juniors, except between the 10th and 11th grades, where the students showed the same level of science motivation.

Table 3 Tukey’s HSD in multiple comparisons between grade cohorts
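As a rough illustration of the one-way ANOVA reported above, the sketch below computes the F statistic and its degrees of freedom from a list of groups. With four grade cohorts and 726 students, the degrees of freedom are (3, 722), matching the values in the text. The toy scores are invented, the function names are hypothetical, and the Tukey HSD step is not reproduced because it needs the studentized range distribution, which the Python standard library does not provide.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Variation of the group mean around the grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Variation of individual scores around their group mean.
        ss_within += sum((x - group_mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def anova_degrees_of_freedom(n_groups, n_total):
    """Degrees of freedom (between, within) for a one-way ANOVA."""
    return n_groups - 1, n_total - n_groups


# Invented toy scores for three small groups.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
study_df = anova_degrees_of_freedom(n_groups=4, n_total=726)
```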

Predicting STEM Achievement Across Grade Levels

In the present study, we proposed six explanatory variables, namely inductive reasoning (IR), scientific reasoning (SR), science motivation (SM), mother’s education level (ME), father’s education level (FE) and parental involvement in schooling (PI), as predictors of STEM achievement (STEM). We employed the mean score from four subject tests in mathematics, biology, chemistry and physics in the previous semester as an index of STEM performance. Marks on these tests range from 0.0 to 10.0 on the 10-point scale officially used in secondary schools in Vietnam. In this study, the results of these tests were collected using the self-report form in the background questionnaire. As suggested in the literature, we proposed a hypothesized model as demonstrated in Fig. 5.
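The STEM index described above is a plain average of four subject marks. A minimal sketch, assuming each student’s self-reported marks are stored in a dictionary keyed by subject name (a hypothetical layout, not the study’s data format):

```python
def stem_index(marks):
    """Mean of the four subject marks on the 0.0 to 10.0 scale."""
    subjects = ("mathematics", "biology", "chemistry", "physics")
    values = [marks[s] for s in subjects]
    for v in values:
        if not 0.0 <= v <= 10.0:
            raise ValueError("marks must lie between 0.0 and 10.0")
    return sum(values) / len(values)


# Invented marks for one student.
example = stem_index(
    {"mathematics": 8.0, "biology": 7.0, "chemistry": 6.0, "physics": 9.0}
)
```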

Fig. 5 A hypothesized model for predicting STEM achievement (STEM) from the factors under study: inductive reasoning (IR), scientific reasoning (SR), science motivation (SM), parental involvement in schooling (PI), mother’s education level (ME) and father’s education level (FE)

We fitted the hypothesized model to the empirical data from each cohort to investigate the relations between the variables under study at each grade level. Figure 6 presents the results of the path analysis at each grade level based on the hypothesized model. The main fit indices were acceptable: GFI and CFI were greater than 0.94, and SRMR was lower than 0.10. Note that the explained variance was examined with models that fit the current empirical data only marginally in some cohorts. In particular, the model fit was marginal in the 6th- and 8th-grade models and excellent in the 10th- and 11th-grade models. At the 8th-grade level, the model explained up to 41% of the variance in STEM achievement, whereas it explained only 6% in the 6th-grade cohort. In the 10th- and 11th-grade groups, the explained variance was around 20% and 11%, respectively (Fig. 6).

Fig. 6 Path models in each grade cohort with standardized coefficients for predicting STEM achievement across grade cohorts via relational factors, that is, scientific reasoning (SR), science motivation (SM), inductive reasoning (IR), parental involvement in schooling (PI), mother’s education level (ME) and father’s education level (FE). *p < 0.05, **p < 0.01, ***p < 0.001

Among the six proposed predictors, IR and SR had a clearly positive direct effect on STEM achievement across the four models. IR was a strong, significant predictor of STEM achievement in the 8th grade (\(\beta_{\mathrm{std}} = .412\), \(p < .001\)) and the 11th grade (\(\beta_{\mathrm{std}} = .304\), \(p < .001\)), while SR significantly predicted the STEM achievement of the 6th graders (\(\beta_{\mathrm{std}} = .183\), \(p = .046\)), 8th graders (\(\beta_{\mathrm{std}} = .217\), \(p < .001\)) and 10th graders (\(\beta_{\mathrm{std}} = .182\), \(p = .008\)). SM also contributed to explaining students’ STEM achievement, but its effect was not consistent across the grade cohorts and was even negative in the 6th-grade path model. PI showed a similarly inconsistent pattern: its direct effect was positive in the 8th and 10th grades and negative in the 6th and 11th grades. In general, parents’ education positively influenced their children’s STEM achievement in most models in the study. FE significantly explained the students’ STEM achievement in the 8th-grade (\(\beta_{\mathrm{std}} = .253\), \(p < .001\)) and 10th-grade (\(\beta_{\mathrm{std}} = .347\), \(p < .001\)) groups, but it had only a minor direct effect on STEM achievement among the students in the 6th and 11th grades. Likewise, ME positively influenced the students’ STEM performance, but its effect did not appear to be as strong as that of FE in the four models.
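The coefficients above are standardized, meaning that each raw slope is rescaled by the ratio of the predictor's standard deviation to the outcome's standard deviation. The sketch below illustrates only this rescaling for a single predictor, where the standardized slope equals the Pearson correlation. It is not the multi-predictor path model used in the study, whose coefficients are partial effects; the function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_slope(x, y):
    """OLS slope of y on x, rescaled to standard-deviation units."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    raw_slope = sxy / sxx                      # unstandardized coefficient
    return raw_slope * stdev(x) / stdev(y)     # beta_std = b * sd_x / sd_y


def pearson_r(x, y):
    """Pearson correlation, used here as a cross-check."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data (NOT the study's data): a predictor and an outcome.
reasoning = [1, 2, 3, 4, 5]
achievement = [2, 1, 4, 3, 5]
print(standardized_slope(reasoning, achievement))
print(pearson_r(reasoning, achievement))
```

With one predictor, squaring the standardized slope gives the proportion of explained variance; with several correlated predictors, as in the path models, this shortcut no longer holds.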

Furthermore, PI was a significant predictor of SM in all four models. Therefore, we examined whether SM mediated the effect of PI on children’s STEM achievement. The results of the Sobel test suggested that PI had no indirect effect on STEM achievement via SM in any of the models. Similarly, parents’ education levels were associated with PI and SM, but the Sobel test showed that neither PI nor SM mediated the effect of parents’ education on students’ STEM performance. However, FE acted as an indirect predictor of STEM achievement via IR in the 8th-grade cohort, based on the Sobel test (Z = 1.99, p = 0.047). Additionally, SR did not mediate the effect of SM on students’ STEM achievement in the current data. The findings also showed strong relations between the two reasoning variables and between the two parental education variables across the grade cohorts. However, the relationships between IR and SR and between FE and ME are covariances in the path models, so we did not apply the Sobel test to investigate their mediated effects.
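For reference, the Sobel statistic can be sketched as follows, assuming the standard first-order formula, in which a is the path from the predictor to the mediator, b is the path from the mediator to the outcome, and se_a and se_b are their standard errors. The text does not state which variant of the test was used, and the path estimates passed to `sobel_z` below are hypothetical. Only the final line uses a value from the text: it converts the reported Z = 1.99 into a two-sided p-value.

```python
from math import erfc, sqrt


def sobel_z(a, se_a, b, se_b):
    """First-order Sobel statistic for the indirect effect a * b."""
    return (a * b) / sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


def two_sided_p(z):
    """Two-sided p-value of z under the standard normal distribution."""
    # 2 * (1 - Phi(|z|)) is equal to erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Hypothetical path estimates and standard errors (NOT from the study).
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(z, two_sided_p(z))

# Z value reported in the text for the 8th-grade cohort.
print(round(two_sided_p(1.99), 3))
```

The p-value obtained for Z = 1.99 rounds to 0.047, which agrees with the value reported for the indirect effect of FE via IR.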

Discussion and Conclusions

The students’ performance on both the IR and SR tests increased gradually across grade levels. Specifically, it improved remarkably across the lower secondary grade levels, but the growth rate appeared to drop in the upper grade cohorts. The results are fairly consistent with previous studies on IR (Csapó, 1997; Díaz-Morales & Escribano, 2013; Molnár et al., 2013; Van Vo & Csapó, 2020) and SR (Ding, 2018; Kwon & Lawson, 2000; Tairab, 2015). Nonetheless, one inconsistent finding in the current study is that performance decreased between the 10th and 11th grades, whereas previous studies recorded a slight rise between those grades. There are several possible factors behind this trend. It may stem from the socioeconomic and background factors of the 11th graders in the present study. Specifically, the education levels of the parents of the 11th-grade participants were lower than those of the other grade cohorts, as described in Table 2. These background characteristics may affect the results because parents’ education is regarded as an indicator of the educational resources available to children at home (OECD, 2017a) and is positively related to students’ cognitive abilities (Van Vo & Csapó, 2020). Another possibility is that the 10th graders have more opportunities (e.g. inquiry activities and experimental tasks) that support the development of thinking skills, whereas teaching and learning in the 11th grade tend to focus more on preparing for the National High School Graduation Examination, which assesses content knowledge (Nhat et al., 2018). This exam is very important because 12th-grade students are required to take it to enrol in university or college. However, more longitudinal studies need to be conducted in this area in the future.

Interestingly, although the children achieved lower scores on the SR test than on the IR test, the developmental trend appears to be similar on both tests. This provides evidence of a close link between the two forms of reasoning skills, in which teaching reasoning through specific science content knowledge can contribute to developing students’ general thinking skills. The strong relation between the two reasoning skills was further examined through the path models. It is consistent with studies by Kambeyo (2018) and Korom et al. (2017), which demonstrate a strong, positive correlation between IR and SR test scores. The current study investigated the relationship between IR as a general thinking skill and SR embedded in domain-specific science content in Vietnam. Thinking can be taught directly in specific courses or embedded in the regular school curricula within the framework of school disciplines (Csapó, 1999). In Vietnam, the integration of thinking skills into the teaching of individual disciplines is encouraged. The study showed that the development of thinking skills can be monitored as part of the infusion approach. In contrast, students’ motivation toward learning science gradually dropped across the grade levels. These findings are partly consistent with previous research (e.g. Bouffard et al., 2001; Dorfman & Fortus, 2019; Józsa et al., 2017), but the changing trend seemed to reverse slightly in high school students.

The results of the path analysis showed that cognitive abilities in terms of inductive reasoning and scientific reasoning were significant predictors of STEM success. The findings are in line with previous studies (Coletta & Phillips, 2005; Lawson, 2000; Van Vo & Csapó, 2021b; Zimmerman, 2000). In agreement with other studies (e.g. Fan et al., 2012; Van Vo & Csapó, 2021a), parental involvement in schooling was closely related to students’ science motivation. However, the roles of science motivation and parental involvement in predicting students’ STEM achievement were unclear in the current study. This seemed partially inconsistent with other studies, which confirmed the important role of motivation (Kriegbaum et al., 2018) and parental involvement in predicting academic achievement (OECD, 2017a; Shapira-Lishchinsky & Zavelevsky, 2020). Conversely, parents’ education contributed meaningfully to the students’ STEM success. Specifically, the father’s education level predicted the students’ STEM achievement better than the mother’s education level. The contribution of parents’ education may arise because it is an indication of the educational resources children have at home (OECD, 2017a). Furthermore, parents’ education had a positive link to parental involvement in schooling across grade levels but was not related to the students’ motivation toward learning science, particularly among the younger students.

Furthermore, path analysis also showed that parental involvement in schoolwork was not an indirect factor in predicting children’s success in learning STEM subjects through the SM variable. These results are not in line with previous studies (e.g. Ganzach, 2000; Shapira-Lishchinsky & Zavelevsky, 2020). This may be attributed to the typical context in Vietnam, where parents tend to focus strongly on their children’s school performance and guide their children to follow the career they have laid out for them (Hoang et al., 2014; Phan, 2004). In the context of the study, Vietnamese students have been found to be highly motivated to learn science (OECD, 2017b; Van Vo & Csapó, 2021a), but this finding must be approached with caution because these students are under pressure from their parents, teachers, relatives and even friends to achieve consistently good performance (OECD, 2017a, 2017b). Thus, rather than placing excessive pressure on students and instilling fear in them, schools and families should focus on how to inspire their learning. It is important for parents and teachers to advise students to follow career paths that are suited to their individual talents rather than following current trends. This should be noted by school leaders, school advisors and school psychologists to provide a more supportive environment for students.

The present study has some limitations. First, the findings were drawn from a cross-sectional investigation, which may be biased by the influence of different environments (Maxwell & Cole, 2007). In addition, socioeconomic differences and the quality of schools could influence students’ science performance and motivation to learn science (De Silva et al., 2018), but the current study does not deal with such issues in depth. Furthermore, there is a possible risk of bias because the indicators of STEM performance are based on self-reports. With a small sample size, the path models were also estimated with only a marginal fit to the current data for each cohort, which may affect the results of the path analysis. Consequently, the findings should be generalized with caution, and more large-scale investigations need to be conducted in the future. Moreover, our adapted instruments may not encompass the full scope of reasoning proficiency and science motivation. For example, we used four kinds of non-verbal tasks, but verbal tasks also play an important role in the learning process, which this study does not cover. Finally, this is the first time the SR test has been used to assess students in the Vietnamese context, so some items need to be revised in future versions.