Introduction

Within current education circles and amongst policy makers, there is increasing concern about engaging students meaningfully in mathematics learning in light of falling mathematics enrolments in the senior years of schooling. Research suggests that students’ choices are influenced by their attitudes towards and performance in mathematics, which are shaped by their school mathematics experiences (Hine, 2018; Nardi & Steward, 2003; Norton, 2017). It is argued that classroom environments in secondary school settings, particularly during the transition from primary to secondary school, generally are perceived less favourably by students because of changes in curriculum, pedagogy, assessment strategies, social interactions and student relationships (Attard, 2010, 2013, 2014). Changes in the learning environment during transition from primary to secondary school, the role of the teacher and the relative size of the school all influence student perceptions of their teachers and school experience (Ferguson & Fraser, 1998). A study of changes in learning environment during the transition from year 7 (primary school) to year 8 (secondary school) mathematics among 541 students in South Australia revealed a significant decline in involvement (Deieso & Fraser, 2019).

Thomson et al. (2019) have highlighted that the 2018 PISA studies show that Australian students performed below the international average in mathematics and that achievement stagnated over the previous 16 years. PISA data indicate that a high percentage of problems in mathematics classes are close repetitions of previous problems in primary school, which generally have low procedural complexity. Aditomo and Klieme (2020) argue that, after 30 years of research, rote learning and procedural mathematics, rather than problem-based activities, continue to hamper student progress in higher-order thinking tasks.

Research suggests that project-based mathematics can provide students with an opportunity to be active participants in their own learning process and to make meaningful connections between the content and the problem (Ahmad Tarmizi & Bayat, 2012; Chen & Kalyuga, 2020). Self-regulated learning relies on learners mastering the cognitive and motivational processes needed to steer their own learning outcomes (Velayutham et al., 2011). In higher education, the positive effects of project-based learning as a strategy have been well documented in various studies (Albanese & Mitchell, 1993; Calder, 2013; Carrabba & Farmer, 2018; Coliver, 2000; Gijbels et al., 2005; Mergendollar et al., 2006; Skilling et al., 2016), but research into the effectiveness of project-based learning in the high school context has been more limited and less conclusive (Carrabba & Farmer, 2018; De Witte & Rogge, 2012; Maxwell et al., 2001, 2005; Mergendollar et al., 2000).

The main purpose of the study reported in this article was to evaluate the comparative effectiveness of project-based and traditional mathematics instruction among students in their first year of high school. A secondary aim was to investigate whether project-based and traditional mathematics instruction were differentially effective for male and female students (Rijken, 2017).

As a backdrop to our investigation, the following sections provide a literature review in four important areas: the field of learning environments; the three student outcomes used in our study (enjoyment, academic efficacy and achievement); gender differences and the differential effectiveness of instructional methods for different genders; and project-based learning.

Literature review

Field of learning environments

The influence of the learning environment on the cognitive and affective outcomes of students in schools is well documented (Fraser, 2012, 2019, 2023a, 2023b). With most countries focusing heavily on academic testing to measure educational outcomes against international benchmarks, this focus on academic achievement needs to be complemented with attention to the classroom learning environment, teacher effectiveness and other aspects of student progress (Fraser, 2014; OECD, 2017). Classroom learning environment research was pioneered by Herbert Walberg who developed the Learning Environment Inventory (LEI) to provide criteria of effectiveness in evaluating Harvard Project Physics in terms of students’ perceptions of their classroom environments (Walberg & Anderson, 1968; Welch & Walberg, 1972). Walberg built on Lewin’s (1936) seminal work on field theory, which recognises both the environment and its interaction with characteristics of the individual as powerful determinants of human behaviour.

Walberg’s LEI proved to be the catalyst for the development and use of specific-purpose questionnaires for measuring the learning environment through the lens of participants’ perceptions. Examples include the Science Laboratory Environment Inventory (SLEI, Fraser, Giddings & McRobbie, 1995; Fraser & Lee, 2009; Lee et al., 2020) for science laboratory classes, the Constructivist Learning Environment Survey (CLES) for constructivist-oriented settings (Tadesse et al., 2022; Taylor et al., 1997), the Questionnaire on Teacher Interaction (QTI, Sivan & Cohen, 2023; Wubbels & Brekelmans, 2005; Wubbels & Levy, 1993) for assessing perceptions of student–teacher interactions in the classroom, and the Technology-Rich Outcomes-Focused Learning Environment Inventory (TROFLEI, Aldridge & Fraser, 2008) for outcomes-focused settings.

The What Is Happening In this Class? (WIHIC), which is the questionnaire that we chose for use in our study, currently is the most frequently used learning environment instrument. The WIHIC has 56 items in 7 scales (Student Cohesiveness, Teacher Support, Involvement, Investigation, Task Orientation, Cooperation and Equity) and was validated among 1081 Australian and 1879 Taiwanese students at the junior high-school level (Aldridge et al., 1999). Extensive use of the WIHIC in many other countries (e.g. USA, Singapore, Korea, China, UAE, South Africa, Indonesia, India, Greece and Canada) has supported its strong factorial validity and internal consistency reliability, and the instrument has proved useful in a wide variety of research and practical applications (Fraser, 2019, 2023a, 2023b; Koul & Fisher, 2005; MacLeod & Fraser, 2010; Zandvliet & Fraser, 2004).

Significant research has focused on modifying the WIHIC for new contexts and languages. A version of the WIHIC for use among primary-school students in the Greek language (the G-EWIHIC) was developed and rigorously validated by Charalampous and Kokkinos (2017). In China, Cai and colleagues (2022) developed the New What Is Happening In this Class? (NWIHIC) by adding the two new scales of Differentiated Instruction and Ongoing Assessment and validated it with 2280 grade 5–9 students.

Walberg’s pioneering use of learning environment criteria in evaluating Harvard Project Physics was the catalyst for many researchers around the world to include learning environment instruments in their evaluations of a wide variety of educational innovations and programs. For example, in the USA, classroom environment was included among the dependent variables in evaluating teacher professional development (Pickett & Fraser, 2009), a science field study setting (Zaragoza & Fraser, 2017) and the use of anthropometric activities (Lightburn & Fraser, 2007). The classroom environment was a major focus in the evaluation of outcomes-based education in both Australia (Aldridge & Fraser, 2008) and South Africa (Aldridge et al., 2006). When learning environment criteria have been used in evaluating flipped instruction among university students in the USA (Strayer, 2012) and Turkey (Polat & Karabatak, 2021), a more-positive learning environment was perceived in flipped classrooms.

As the COVID-19 pandemic precipitated widespread changes to online learning, researchers began to monitor the effects of these changes on students’ learning environments. Long and colleagues (2020) monitored changes to online learning among 230 preservice teacher education students in Texas, whereas McLure et al. (2022) compared students’ learning environment perceptions among Australian university students before and after lockdown. Both studies identified that the change to online learning was associated with a deterioration in the learning environment.

Learning environment constructs have been used as dependent variables in cross-national comparisons. In a comparison involving 1309 grade 7 and 8 mathematics students in the USA and Hong Kong, American students perceived their mathematics classes as having significantly more teacher support (0.56 SDs), cooperation (0.75 SDs) and equity (1.12 SDs) (Hanke & Fraser, 2022). However, Hong Kong students enjoyed their mathematics classes more than their American counterparts (0.38 SDs). A comparison of 1081 Australian with 1879 Taiwanese junior high-school science students revealed a similar pattern in which Australian students perceived a more favourable learning environment in terms of involvement and equity, but Taiwanese students enjoyed their science classes more (Aldridge et al., 1999).

Rogers and Fraser (2022) evaluated the effectiveness of different frequencies of science practical work (namely, at least once a week, once every 2 weeks, or once every 3 weeks or more) in terms of the learning environment perceptions of 431 middle-school students. Although more-frequent practical work was more effective than less-frequent practical work in terms of integration, open-endedness and material environment, further increasing the frequency beyond once every 2 weeks to at least once a week was associated with diminishing returns. Furthermore, increasing the frequency of practical work was not differentially effective for males and females.

Student outcomes: Enjoyment, academic efficacy and achievement

The three student outcomes in our evaluation were enjoyment, academic efficacy and achievement. Attitudes have been a central concept in social psychology for almost a century, with Allport (1935) noting that “this concept is probably the most distinctive and indispensable concept in contemporary American social psychology” (p. 43). The measurement and investigation of students’ attitudes have a long history and continue to be of contemporary interest internationally in both mathematics education (Aiken, 2002; Fennema & Sherman, 1976, 1978; McLeod, 1992; Tapia & Marsh, 2004) and science education (Khine, 2015; Saleh & Khine, 2011; Tytler & Osborne, 2012). McLeod (1992) defined attitude to mathematics as a construct representing an individual’s degree of affect, and Aiken (2002) noted that attitude towards mathematics is an emotional disposition that includes students’ likes and dislikes and their enjoyment during mathematics lessons.

In our study, we measured attitude to mathematics using the Enjoyment of Mathematics Lessons scale, which is a modified version of a scale from the Test of Science-Related Attitudes (TOSRA, Fraser, 1977, 1981). The wording of items was modified to measure mathematics attitudes instead of science attitudes. For example, “Science lessons are fun” was changed to “Mathematics lessons are fun”. The five frequency response alternatives range from Almost Never to Almost Always. Studies that successfully used the Enjoyment of Mathematics Lessons scale include Hanke and Fraser (2022) and Robinson and Aldridge (2022).

Our study included academic efficacy as another student attitudinal outcome. Social theorists suggest that self-efficacy or belief and confidence in completing a task is important and can influence behaviours towards learning. High levels of self-efficacy can contribute to greater academic success (Bandura, 1977, 1997).

The Morgan-Jinks Student Efficacy Scale (MJSES) assesses students’ belief in their academic success (Jinks & Morgan, 1999), whereas the SALES questionnaire (Students’ Adaptive Learning Engagement in Science) assesses students’ self-efficacy as one of its four scales (Robinson & Aldridge, 2022; Velayutham et al., 2011). In our study, we used the 8-item Academic Efficacy scale found in Aldridge and Fraser (2008). This scale has the same five frequency response alternatives ranging from Almost Never to Almost Always.

To assess mathematics Achievement, we used a 38-item multiple-choice test of knowledge and skills associated with number, space, measurement, chance and algebra based on the Australian national standards for mathematics for year 8 students and guided by tests designed by the Australian Council for Educational Research (ACER, 2005) (Fogarty, 2007; Lindsay & Stephanou, 2013).

Gender differences and differential instructional effectiveness for different genders

Although current research suggests that gender-related differences in mathematics achievement are not as prevalent as in the past, girls still express less confidence than boys in their ability to succeed at mathematics (Cheryan et al., 2017; Reilly et al., 2019; Thompson et al., 2012). Other negative attitudinal influences include girls judging mathematics as being less useful and girls’ attitudes as perceived by their teachers and parents (Fennema & Sherman, 1978). These findings are reinforced by Ma and Cartwright (2003) and McLure et al. (2022) who also argue that girls develop their attitudes towards mathematics and its utility independently of the school environment, context or climate.

For many decades, educational researchers have shown considerable interest in gender differences in students’ performance and attitudes in the subject areas of mathematics (Fennema & Sherman, 1978; Gallagher & Kaufman, 2005; Ma & Cartwright, 2003), mathematics and science (Reilly et al., 2019), science (Danielsson, Avraamidou & Gonsalves, 2023; Weinburgh, 1995; Welch et al., 2014) and STEM (Cheryan et al., 2017; Koul et al., 2021; Masters & Meltzoff, 2020; McLure et al., 2022; Pelch, 2018; Su et al., 2009).

In our present evaluation of project-based mathematics, our interest was less in gender differences per se than in whether project-based and traditional mathematics were differentially effective for male and female students. Koul et al. (2018) provide a useful example: they not only evaluated engineering and technology learning activities in terms of learning environment, attitudes and achievement, but also investigated the differential effectiveness of these materials for girls and boys. For a sample of 1095 grade 4–7 students, this study revealed that boys benefitted somewhat more than girls from the engineering and technology activities in terms of the learning environment scale of cooperation, but girls benefitted somewhat more than boys in terms of understanding.

Project-based learning

Project-based learning is a strategy which is quite different from traditional classroom teaching, but its benefits have not been clearly established in secondary mathematics education. Relative to primary schools, classroom environments in secondary-school settings are generally perceived less favourably by students because of many changes in relationships, curriculum, pedagogy, and assessment (Attard, 2010; Fraser & Aldridge, 2017).

Explicit teaching of mathematical concepts using a procedural strategy, in isolation from problem solving and reasoning, is common in traditional secondary mathematics curricula (Stockard et al., 2018; Reigle-Crumb et al., 2019). Research also indicates that low-level procedural mathematics during transition can lead to a negative learning experience. Because an integrated learning approach in primary school has generally allowed students to achieve more-sophisticated levels of mathematics, a return to low-level procedural work can lead to a loss of students’ confidence in mathematics, an attitude of disengagement and a regression in academic progress (Attard, 2010; Eppley et al., 2019; Oladayo & Oladayo, 2012).

Stacey (2010) states that research continues to show trends which are characterised by a ‘shallow teaching syndrome’ and an absence of reasoning as a proficiency in grade 8 mathematics. A high percentage of problems in mathematics classes are close repetitions of previous primary-school problems with low procedural complexity. Lithner (2007) argues that, after 20 years of research, rote learning and procedural mathematics, rather than problem-based or project-based activities, continue to hamper student progress in higher-order thinking tasks. Self-regulated learning relies on learners mastering the cognitive and motivational processes needed to steer their own learning outcomes (Velayutham et al., 2011, 2012) and is critical to the success of project-based learning.

A project-based mathematics learning strategy is defined as research-based, open-ended, student-centred, student-driven and teacher-facilitated (Bell, 2010; Calder, 2013; Carrabba & Farmer, 2018). This instructional strategy can be effective for increasing learners’ motivation and retention of information as they engage in higher-order thinking skills to achieve their tasks (Schwartz et al., 2001). An evaluation of project-based learning in science among 2371 students in 46 Michigan schools revealed higher levels of collaboration, self-reflection and achievement among students receiving the intervention (Krajcik et al., 2003).

Project-based mathematics is characterised as group-based activity during which mathematical concepts are integrated into a project which is inquiry-based and requires students to examine numerous problems, utilise mathematics and present recommendations or solutions. An example of a project-based activity is students examining the development of a new suburb and utilising statistics to justify the inclusion of a range of services such as retail, health, education, leisure and transport. Students utilise census data and examine the viability of their inquiry and demonstrate their learning through presentations which utilise graphs and statistical tools.

Aims

The two aims of our study were to (1) evaluate the relative effectiveness of project-based and traditional mathematics in terms of seven learning environment scales and the three student outcomes of enjoyment, academic efficacy and achievement and (2) investigate the differential effectiveness of project-based and traditional mathematics for male and female students.

Research methods

This study utilised a mixed-method approach which combined quantitative data from learning environment and student outcome (enjoyment, academic efficacy and achievement) measures with classroom observations and student and teacher interviews. Creswell (2008) refers to the benefit of both quantitative and qualitative data working together to provide a better understanding of a research problem. Tobin and Fraser (1998, p. 639) “cannot envision why learning environment researchers would opt for either qualitative or quantitative data, and advocate the use of both in an effort to obtain credible and authentic outcomes”.

Noteworthy learning environment studies that have successfully combined quantitative and qualitative methods include Cho et al. (2023), who used a sequential explanatory mixed-method approach to investigate how autonomy-supportive learning environments improved the academic adjustment of 356 Asian international students. Aldridge et al. (1999) combined the use of the WIHIC with qualitative data-gathering methods to illuminate differences between Taiwan and Australia in the learning environments of 2960 junior-high science students. Numerous action research studies aimed at classroom improvement used a learning environment questionnaire in combination with qualitative interviews, observations and case studies (e.g. Aldridge & Bianchet, 2022; Bell & Aldridge, 2014; Aldridge & Fraser, 2008).

Assessment of learning environment

In order to evaluate the effectiveness of the project-based mathematics strategy in terms of the classroom learning environment, we chose the What Is Happening In this Class? (WIHIC) questionnaire, described above, which has been extensively validated and used in studies for different subject areas, age levels and numerous countries (Fraser, 2023a, 2023b). The 56-item seven-scale WIHIC assesses Student Cohesiveness, Teacher Support, Involvement, Investigation, Task Orientation, Cooperation and Equity (Aldridge et al., 1999). For our sample of 284 students, the Cronbach alpha reliability for the seven different WIHIC scales was high and ranged from 0.83 to 0.97.

Assessment of student outcomes

As noted above, to assess attitudes, we drew on the Enjoyment scale from the Test of Science-Related Attitudes (TOSRA, Fraser, 1981) and modified items by replacing the term ‘science’ with ‘mathematics’. For the assessment of Academic Efficacy, we drew on research by Jinks and Morgan (1999) and adopted the 8-item efficacy scale used by Aldridge and Fraser (2008). We assessed mathematics performance with a multiple-choice progressive achievement test of knowledge and skills in number, space, measurement, chance and algebra based on ACER (2005). For our sample of 284 students, the Cronbach alpha reliability was 0.97 for Enjoyment, 0.93 for Academic Efficacy and 0.85 for Achievement.

Qualitative data collection

In this mixed-methods study, quantitative data were collected first and then, in order to support or refute patterns from the analysis of surveys, qualitative research methods (Lichtman, 2023; Merriam, 2009) were employed with project participants towards the end of the trial period. We attempted to understand the ‘lived experience’ (Sweetman et al., 2022) of participants as well as to identify ‘themes’ (Braun & Clarke, 2006), which can be defined as “abstract and subtle expressions/patterns/processes that explain a phenomenon” (Mishra & Dey, 2022, p. 187). This mixed-methods approach also enabled ‘triangulation’ of quantitative and qualitative data and the embellishment and explanation of findings from questionnaires using qualitative information (Houtz, 1995).

Observations of three randomly-chosen project-based classes were undertaken and, towards the end of our study, interviews were conducted with a random sample of 28 project students (14 male and 14 female) from the seven project-based classes, as well as three teachers of project classes. Semi-structured one-on-one interviews were conducted with the three teachers, whereas students participated in focus-group interviews (Kvale & Brinkmann, 2009) in seven groups of four students each (two male and two female). In collecting and analysing interview data, the work of Erickson (2012) was used as a guide.

Data sources

Over a period of 6 months, numerous projects were embedded into units of mathematics learning/teaching. Seven project-based classes followed the redeveloped project-based approach and assessment methods described previously. During the same period, the three comparison classes were taught the traditional mathematics curriculum, used a textbook and completed common tests and assignments. The teaching of the traditional approach followed the year 8 national curriculum framework for several topics (e.g. algebra, measurement, number, statistics and geometry). Different skills were taught individually and sequentially and assessed to ensure mastery (in contrast to the integrated project-based approach). With the traditional approach, mathematical rules and solving mathematical problems were emphasised.

The sample for the quantitative component of the study consisted of a total of 284 students in their first year of high school at a school in South Australia. The project-based group consisted of 192 students (91 males and 101 females) in seven classes, whereas the comparison group receiving non-project-based instruction consisted of 92 students (47 males and 45 females) in three classes. Students’ ages ranged from 13 to 14 years.

All aspects of our study were approved by the relevant university human ethics research committee.

Data analysis

Given that the WIHIC questionnaire has had limited use previously in South Australia, we undertook a preliminary check of its validity with this population in terms of factorial validity and scale internal consistency reliability (alpha reliability coefficient).

We used MANOVA to test simultaneously our two main research questions concerning (1) the relative effectiveness of the two instructional methods (project versus traditional) and (2) the differential effectiveness of the instructional methods for males and females. The two independent variables were instructional method and gender, whereas the 10 dependent variables were the seven WIHIC learning environment scales and the three student outcomes (Enjoyment, Academic Efficacy and Achievement). We detected the presence of differential effectiveness for different genders by examining the statistical significance of the interaction between instructional method and gender for each dependent variable.
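The logic behind the interaction test can be illustrated descriptively. The following Python sketch is not the MANOVA itself: it simply computes the project-minus-traditional difference in cell means separately for males and females, and then the difference between those two differences. A contrast far from zero is the kind of pattern that a statistically significant instruction–by–gender interaction would flag. The function name and all scores are hypothetical and chosen only for illustration.

```python
from statistics import mean

def interaction_contrast(cells):
    """Difference-of-differences for a 2 x 2 (method x gender) design.

    `cells` maps (method, gender) to a list of scores on one dependent variable.
    Returns (method effect for males, method effect for females, contrast).
    """
    effect_m = mean(cells[("project", "male")]) - mean(cells[("traditional", "male")])
    effect_f = mean(cells[("project", "female")]) - mean(cells[("traditional", "female")])
    return effect_m, effect_f, effect_m - effect_f

# Hypothetical average item scores on a 1-5 scale (not data from the study).
scores = {
    ("project", "male"): [4.0, 3.5, 4.5],
    ("project", "female"): [3.0, 3.5, 2.5],
    ("traditional", "male"): [3.0, 3.5, 2.5],
    ("traditional", "female"): [3.5, 4.0, 4.5],
}
effect_m, effect_f, contrast = interaction_contrast(scores)
print(effect_m, effect_f, contrast)
```

Whether such a contrast is statistically significant is what the interaction term in the two-way analysis tests; the sketch only shows the quantity being examined.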

Results

Preliminary validation of WIHIC and attitude scales

To provide a preliminary check of the structure of the 7-scale 56-item WIHIC learning environment questionnaire, principal axis factoring with oblimin rotation and Kaiser normalization was conducted. Prior to factor analysis, we conducted the Kaiser–Meyer–Olkin (KMO) test of sampling adequacy and Bartlett’s test of sphericity. Because the KMO test yielded a value greater than the cut-off of 0.5 recommended by Cerny and Kaiser (1977) and the Bartlett test revealed a statistically significant \(\chi^{2}\) value (Tobias & Carlson, 2010), the suitability of our data for factor analysis was established.

Table 4 in the Appendix shows the factor loadings for all items of the WIHIC. The criteria for the retention of any item were that it must have a factor loading of at least 0.40 with its own scale and less than 0.40 with every other scale in the WIHIC questionnaire. The application of these criteria led to the removal of one item from Student Cohesiveness and two items from Involvement. After removal of these three items, the a priori seven-scale structure of the WIHIC was replicated perfectly. The total proportion of variance accounted for by the seven WIHIC scales was 71.58%. Eigenvalues ranged from 1.43 to 21.55 for different scales, which satisfies Kaiser’s (1960) minimum criterion of 1 for meaningfulness. Overall, Table 4 supports the strong factorial validity of the final 53-item, 7-scale version of the WIHIC when used with first-year high school students in South Australia.

The internal consistency reliability of each WIHIC scale was calculated using Cronbach’s alpha coefficient. Alpha coefficients, which are recorded at the bottom of Table 4 for the seven WIHIC scales, ranged from 0.85 to 0.97 for different scales.
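For readers unfamiliar with the coefficient, Cronbach’s alpha for a scale of k items is k/(k − 1) multiplied by one minus the ratio of the sum of the item variances to the variance of the total scale score. The Python sketch below applies this standard formula to a small set of hypothetical responses; it is an illustration only and does not reproduce the software or the data used in our analysis.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: one list of k item scores per respondent.
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of respondents' total scale scores.
    total_var = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses of five students to a three-item scale (1-5 options).
responses = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```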

Instructional method and gender as determinants of learning environment and outcomes

Instructional method and student gender were investigated simultaneously by conducting a two-way MANOVA for the whole sample of 284 students with the seven WIHIC learning environment scales and three student outcomes scales as the set of 10 dependent variables. Instructional method and student gender were the two independent variables. The presence or absence of a statistically-significant interaction between instructional method and gender was used to signify whether instructional-method differences were different or similar for males and females. Initially conducting MANOVA for the entire set of 10 dependent variables reduced the Type I error rate associated with conducting separate univariate tests for individual dependent variables.

Prior to conducting MANOVA, the suitability of the data for analysis was checked using Levene’s Test of Equality of Error Variances to confirm that the error variance for each of the 10 dependent variables (classroom environment, Enjoyment, Academic Efficacy and Achievement) was equal across groups (instructional methods and genders). The F value from Levene’s test for each of the 10 dependent variables was statistically nonsignificant (p > 0.05), indicating that error variances were equal across the comparison groups for all scales and therefore that the data were suitable for analysis via MANOVA. Box’s Test of Equality of Covariance Matrices also was conducted separately for the set of 10 dependent variables to check whether the observed covariance matrices of the dependent variables were equal across the comparison groups. Statistically-nonsignificant results (p > 0.05) for the 10 scales suggest that, according to Huberty and Petoskey (2000), our data’s covariance matrices were adequate for conducting MANOVA.
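As an illustration of what Levene’s test examines, the Python sketch below computes the Levene W statistic in its mean-centred form: each score is replaced by its absolute deviation from its group mean, and a one-way F ratio is formed on those deviations. The choice of the mean-centred variant, the function name and the scores are assumptions made for illustration; the sketch returns only the statistic, which would then be referred to an F distribution with k − 1 and N − k degrees of freedom to obtain a p value.

```python
from statistics import mean

def levene_w(groups):
    """Levene's W using absolute deviations from each group's mean."""
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = mean([v for zi in z for v in zi])
    between = sum(
        len(zi) * (zm - z_grand_mean) ** 2 for zi, zm in zip(z, z_group_means)
    )
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_group_means) for v in zi)
    return ((n_total - k) / (k - 1)) * between / within

# Hypothetical scores: the two groups differ in mean but have identical spread.
equal_spread = levene_w([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])
# Hypothetical scores: the second group is far more variable than the first.
unequal_spread = levene_w([[1, 2, 3], [0, 5, 10]])
print(equal_spread, round(unequal_spread, 2))
```

A small W, as in the first call, is consistent with equal error variances, which is the condition that supports proceeding with MANOVA.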

For our sample of 284 students, the multivariate test using Wilks’ lambda criterion yielded a statistically-significant overall difference, F(10, 273) = 9.22, p < 0.01, with significant results for instructional method (F = 2.46, p < 0.01), gender (F = 2.16, p < 0.05) and the instruction–by–gender interaction (F = 2.10, p < 0.05). Therefore, we were justified in interpreting the two-way ANOVA results separately for each of the 10 dependent variables.

Table 1 provides the ANOVA results for instructional method, gender and the instruction–by–gender interaction separately for each learning environment and student outcome scale. The F value is provided for each dependent variable, in addition to the eta² statistic (which represents the proportion of variance accounted for).
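The interpretation of eta² as a proportion of variance can be seen in a simplified example. The Python sketch below computes eta² for a single factor as the between-groups sum of squares divided by the total sum of squares. Using one factor rather than the two-way design of our study is a simplifying assumption, and the function name and scores are hypothetical.

```python
from statistics import mean

def eta_squared(groups):
    """Proportion of total variance accounted for by group membership."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

# Hypothetical scores for two groups (not data from the study).
eta2 = eta_squared([[1, 2, 3], [3, 4, 5]])
print(eta2)
```

In a two-way analysis, the same ratio is formed for each effect in turn, with that effect’s sum of squares in the numerator.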

Table 1 Two-way MANOVA/ANOVA results (F and eta²) for instructional-method and gender differences for each WIHIC, Enjoyment, Efficacy and Achievement scale

Table 1 shows that statistically-significant results emerged for: instructional method for one learning environment scale (Equity) and for Achievement; and for gender for one learning environment scale (Cooperation) and for Enjoyment of Mathematics. The instruction–by–gender interaction was statistically significant for one learning environment scale (Teacher Support) and for both attitude scales of Enjoyment of Mathematics and Academic Efficacy.

Instructional-method differences in learning environment and outcomes

Table 2 provides for each learning environment, attitude and achievement scale the average item mean, the average item standard deviation, and the ANOVA results repeated from Table 1. The average item mean is simply the scale mean divided by the number of items in a scale. It is useful for comparing the means of scales containing different numbers of items.
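A brief Python sketch shows why the average item mean makes scales of different lengths comparable. The function name and the responses are hypothetical: two scales with different numbers of items yield different scale means but can yield the same average item mean.

```python
from statistics import mean

def average_item_mean(responses):
    """Scale mean divided by the number of items in the scale.

    responses: one list of item scores per student, all for the same scale.
    """
    n_items = len(responses[0])
    scale_mean = mean(sum(student) for student in responses)
    return scale_mean / n_items

# Hypothetical responses of two students on a 1-5 frequency scale.
eight_item_scale = [[4, 4, 3, 5, 4, 4, 3, 5], [3, 3, 3, 3, 3, 3, 3, 3]]
three_item_scale = [[4, 5, 3], [3, 3, 3]]
long_scale_mean = average_item_mean(eight_item_scale)
short_scale_mean = average_item_mean(three_item_scale)
print(long_scale_mean, short_scale_mean)
```

Here the scale means are 28 and 10.5, which are not directly comparable, whereas both average item means fall on the same 1–5 response metric.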

Table 2 Average item mean, average item standard deviation and difference between instructional methods (Cohen’s d effect size and ANOVA results) for each learning environment, Enjoyment, Efficacy and Achievement scale

As well, Table 2 provides an effect size for the instructional-methods difference for each scale. Cohen’s d is the difference between the means for the two instructional methods divided by the pooled standard deviation for each learning environment, Enjoyment, Efficacy and Achievement scale. The effect size conveniently expresses a difference between two groups in standard deviation units. According to Cohen (1988), effect sizes range from small (0.2) to medium (0.5) to large (0.8).
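The calculation can be sketched in Python as follows. The pooled standard deviation is computed here by weighting each group’s variance by its degrees of freedom, which is one common convention and is assumed for illustration; the function names, the label for values below 0.2 and the scores are likewise hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference between two group means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def describe(d):
    """Label an effect size using Cohen's (1988) conventional benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"  # label chosen here; Cohen gives no name for this range

# Hypothetical scores for two groups (not data from the study).
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d, describe(d))
```

The sign of d simply reflects which group is entered first, so effect sizes are usually reported and interpreted by magnitude.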

Table 2 shows that, for the 10 scales, mean scores were somewhat less favourable for the project group for five of the seven learning environment scales, more favourable for the project group for both Enjoyment and Efficacy and less favourable for the project group for Achievement. However, for most scales, these differences between project and non-project students were both small and statistically nonsignificant.

Relative to non-project students, project students perceived a significantly less positive classroom environment for Equity and had significantly lower achievement scores. For these two scales, effect sizes were 0.26 and 0.44 standard deviations, respectively, which are in the small to medium range according to Cohen’s (1988) criteria.

Gender differences and instruction–by–gender interactions

Table 1 provides ANOVA results for gender differences in the seven learning environment and three student outcome scales. These gender differences were statistically significant only for one learning environment scale (Cooperation) and for Enjoyment of Mathematics. For these two scales, effect sizes were 0.37 and 0.15 standard deviations, respectively, which would be classified as relatively small according to Cohen’s (1988) criteria. Interestingly, for the two scales for which gender differences were statistically significant, females held somewhat more favourable perceptions and attitudes than males.

Table 1 shows that the instruction–by–gender interaction was statistically significant for one learning environment scale (Teacher Support) and for the two outcomes Enjoyment and Academic Efficacy. Table 3 provides the mean and standard deviation for each scale for four subsamples: project-based males, project-based females, non-project-based males and non-project-based females. Also, for each scale, the effect size (number of standard deviations) is shown for gender differences in scores separately for the project and non-project groups.

Table 3 Average item mean, average item standard deviation and gender difference (effect size) for two instructional methods for learning environment, Enjoyment, Efficacy and Achievement scales

The last column of Table 3 shows an interesting pattern in the sign/direction of gender differences for the project group relative to the non-project group. For most scales, males scored somewhat more highly than females in the project group, but females scored somewhat more highly than males in the non-project group. Although these differences typically were small in magnitude and the instruction–by–gender interaction was nonsignificant for most scales, the pattern of results in Table 3 suggests that the project method was differentially effective for males and females, with males benefitting more from the project approach and females benefitting more from traditional (non-project) methods.

In order to interpret the statistically-significant instruction–by–gender interactions for Teacher Support, Enjoyment and Academic Efficacy, graphs are shown in Fig. 1. The interpretation of all three significant interactions is similar: the project method was differentially effective for male and female students, with males benefitting more from the project method and females benefitting more from the traditional non-project method.

Fig. 1 Statistically-significant interactions between instructional method and gender

By simultaneously considering Table 1 and Fig. 1, the following interpretations emerge from the three scales (Teacher Support, Enjoyment and Academic Efficacy) for which a significant instruction–by–gender interaction was found:

  • Although overall Teacher Support scores were not significantly different either for different instructional methods or for different genders, males benefitted more from the project approach whereas females benefitted more from traditional methods.

  • Overall, Enjoyment scores were not significantly different for different instructional methods, but females overall reported significantly-higher Enjoyment than males. Nevertheless, females enjoyed mathematics more under the traditional non-project method, whereas males’ enjoyment of mathematics was similar under either instructional method.

  • Although overall Academic Efficacy scores were not significantly different either for different instructional methods or for different genders, males benefitted more from the project approach whereas females benefitted more from traditional methods.
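The logic of interpreting an instruction–by–gender interaction can be illustrated with a short sketch that compares the effect of instructional method separately for each gender. The cell means below are hypothetical and are not the values in Table 3, and the function names are introduced only for illustration. The sketch describes the direction of an interaction; it does not test its statistical significance, which requires the ANOVA reported in Table 1.

```python
# Hypothetical average item means for the four subsamples (not study data).
cell_means = {
    ("project", "male"): 3.9,
    ("project", "female"): 3.6,
    ("non-project", "male"): 3.7,
    ("non-project", "female"): 4.0,
}

def method_effect(means, gender):
    """Project mean minus non-project mean for one gender (a simple effect)."""
    return means[("project", gender)] - means[("non-project", gender)]

def interaction_contrast(means):
    """Difference between the male and female simple effects of method."""
    return method_effect(means, "male") - method_effect(means, "female")

male_effect = method_effect(cell_means, "male")      # positive here
female_effect = method_effect(cell_means, "female")  # negative here
print(round(male_effect, 2), round(female_effect, 2))
print(round(interaction_contrast(cell_means), 2))
```

In this hypothetical case, the simple effects have opposite signs, so the lines in an interaction graph would cross. A contrast of zero would instead correspond to parallel lines and no interaction.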

Qualitative analysis

Following quantitative data collection, qualitative data were gathered to provide further insights into student perceptions and a deeper understanding of the effect of project-based activities on student outcomes. Perhaps the most noteworthy finding based on quantitative data was that project-based mathematics was considered more enjoyable than traditional approaches to mathematics. Although Academic Efficacy also was somewhat higher in project-based classes, Achievement was higher in non-project-based classes than in project-based classes. In addition, for three scales (Teacher Support, Enjoyment of Mathematics and Academic Efficacy), males benefitted more from the project-based approach whereas females benefitted more from the traditional approach. In explaining these findings, we analysed student and teacher interviews.

Our analyses focussed on the following questions: (a) What factors contributed to females enjoying the traditional approach more and males enjoying the project-based approach more? and (b) How might these differences in Enjoyment explain why project-based mathematics was differentially effective for males and females in terms of Teacher Support and Academic Efficacy?

Qualitative analyses included focus-group interviews with a sample of 4 students (2 males and 2 females) randomly selected from each of the project-based classes (total of 28 students), as well as individual interviews with 3 randomly-selected teachers who taught the project-based curriculum. The interviews were semi-structured and explored the complete experience of project-based activities and student perceptions and opinions. The process was guided by open-ended questions (e.g. Can you describe an example of the project-based activity? Did you enjoy the activity and why? What was most challenging in this activity? Do you feel you have made progress in mathematics?). The interviewer solicited more-detailed explanations and perspectives about the learning environment and project-based learning.

Analysis of the qualitative data (Braun & Clarke, 2006) revealed three important themes: the nature of the learning environment in first-year high school; project-based mathematics as a strategy for successful learning; and student perceptions of the strategy and their experience.

Interviews and observations confirmed that the traditional learning environment in most classes at this high school was regulated, with students waiting outside their classroom before the teacher arrived and students moving into a space where desks were traditionally in rows. Students waited for the teacher to greet them before getting ready for the lesson. Class rules were enforced and there was a degree of formality. Students had access to their own personal laptop computers and utilised school software to access resources, communicate and collaborate securely online. Traditional classrooms were teacher directed and the structure of the lessons followed a template with a balance of teacher input, group work and individual work.

However, this regular routine was changed to accommodate the new strategy of project-based group work. Tables were re-arranged and the classroom became more dynamic and noisier. Groups were established using student choice based on friendship and all classes were primarily single sex. Students had access to a broad range of resources and equipment. Whilst students engaged with mathematics during five 45-min lessons each week, online collaboration meant that many students accessed the teacher outside formal class time. Because students were more at ease and familiar with teacher-directed projects, some students struggled with student-directed projects.

Interviews with three teachers from the seven project-based classes revealed that, whilst they were enthusiastic about and had high aspirations for this strategy, initially they struggled with the detailed preparation required and needed support with integrating projects with the requirements of the curriculum. Teachers felt that the use of technology-rich environments engaged and motivated students and provided opportunities for collaboration and reflection on the learning process. Teachers had to scaffold project-based activities initially, but were able to allow more student freedom later. The biggest challenge for teachers was to ensure that activities were well differentiated for different ability groups. Throughout the trial, teachers felt the need to be successful with the project strategy, but also felt pressure from the school, students and parents to ensure that students were successful overall in their mathematics classes. The limited time allocated for completing project-based tasks added to the pressure to meet the requirements of the curriculum, and teachers reported that some topics could not be covered. In terms of the final products from students, teachers were surprised by the quality of work produced by both male and female students. Many projects were highly innovative and showed evidence of higher-order learning.

During the interviews, students generally reported satisfaction with the project-based strategy. Many reported that it was a welcome change from the highly-regulated learning that they were encountering for most subjects in their first year of high school. In primary school, they had enjoyed the informal nature of their classrooms, the smaller class size, and lower emphasis on homework and academic success. The project strategy provided an opportunity for students to work with friends and choose a topic. They reported a range of positive and negative issues in terms of the success or otherwise of their project. Issues were commonly identified around productivity, equity and work quality. Higher-achieving groups were heavily involved, highly focussed and enjoyed the tasks; however, students who usually struggle with mathematics struggled even more with project-based tasks. Students reported favourably that their teachers supported them and were interested in their work. They felt that the teacher was able to listen and valued out-of-class collaboration using technology.

More-detailed analysis of the student interviews revealed that female students reported higher expectations than males about their mathematics: they valued organisation and perfection. Females struggled with the open-ended nature of the projects and found them difficult to connect to the application of mathematics. They sought meaning and tended to over-analyse problems. Most importantly, females voiced concern about their final grade and achievement. A number of female students perceived that they had not progressed in mathematics; in fact, one student felt that she had gone backwards. Boys, on the other hand, generally relished the freedom and opportunities to work with their friends. They enjoyed the connection to real-world mathematics and enjoyed the purpose. Boys reported issues with quality, effort and equity during group work, but they were able to overcome these issues. They were not overly concerned about their final grade, but the process and opportunity to do something different was highly valued.

Conclusion and significance

Within the field of learning environment, this mixed-methods study is unique because past research into the effectiveness of project-based learning in mathematics in the first year of high school in terms of learning environment, enjoyment, academic efficacy and achievement has been quite limited. Therefore, this research potentially could contribute both to the field of learning environments and to our understanding about project-based mathematics. Furthermore, our investigation of the differential effectiveness of project-based mathematics for boys and girls is novel and noteworthy and hopefully could provide new insights into effective pedagogy in mathematics for both boys and girls.

The main finding from the quantitative component of the study involving 284 students was that, relative to students in traditional mathematics classes, project students perceived a significantly less-positive learning environment for the Equity scale and had significantly lower Achievement. However, project-based mathematics proved to be differentially effective for male and female students, with males benefitting more from the project method and females benefitting more from the traditional approach for the learning environment scale of Teacher Support and for the student outcomes of Enjoyment and Academic Efficacy.

Qualitative methods revealed that students had difficulty with the unstructured and open-ended nature of project-based activities, but they enjoyed the change in activity from more-traditional procedural mathematics and textbook learning. Because project-based mathematics was new to students, working in groups provided significant challenges for many students. Qualitative data generally supported the quantitative findings. Regarding the differential effectiveness of project-based mathematics for boys and girls, many girls questioned the purpose of project-based learning, found it confusing and difficult to make mathematical connections, and perceived it to be detrimental to their high expectations for academic progress. On the other hand, girls enjoyed this approach as a break from their traditional learning approach. Boys often reported that project-based learning was beneficial and enjoyable and provided involvement, challenge and choice of activity. Boys enjoyed the practical tasks and connections to new learning, and they were less concerned about academic progress.

Our study had some limitations that lead to suggestions for future research. In the light of challenges in gaining access to schools, our sample was limited to 284 students in 10 classes in the first year at one private school. Also, the qualitative component involved only 28 students and three teachers. In future studies, a larger sample would enhance the statistical power of analyses and a broader composition (different schools, grade levels and socioeconomic situations) would improve the generalizability of results. Because the teachers involved had limited prior experience with project-based instruction, somewhat different findings might emerge in future research with teachers who are more experienced with project methods. In future studies of project-based methods, a greater variety of student outcomes (in addition to Enjoyment, Efficacy and Achievement as in our study) could broaden understanding of the impact of project-based mathematics. Because our study revealed noteworthy differences in the effectiveness of project-based mathematics for males and females, we encourage the future investigation of the differential effect of educational programs on the learning environment of students of different genders as in our study or of different ethnicities as in Long, Sinclair and Fraser’s (2020) study of alternative science sequences.