1 Introduction

A constructivist view of learning supports the use of clear goal statements and success criteria, targeted feedback and student self-assessment (Muijs & Reynolds, 2017, p. 1; Sadler, 1989). This idea is in line with effective teaching research (Maulana et al., 2021). However, little contemporary evidence exists to support the view that students are genuinely involved in decision-making about their assessment tasks (Dorman et al., 2008). That is, forms of assessment and specific assessment tasks employed in schools are usually decided by teachers and administrators. Furthermore, even though reports like The Status and Quality of Teaching and Learning in Australia (Goodrum et al., 2001) have asserted that assessment is a key component of the teaching and learning process, teachers tend to utilize a very narrow range of assessment strategies on which to base feedback to parents and students. In practice, there is little evidence that teachers actually use diagnostic or formative assessment strategies to inform planning and teaching (Radnor, 1996).

There are conflicting views about the role and nature of assessment practices in education. Harlen (1998) advocates that teachers should use both oral and written questions in assessing students’ learning, whereas other experts (Dorr-Bremme & Herman, 1986; Stiggins, 1994) encourage alternative assessment strategies, such as teacher observation, personal communication, and student performances, demonstrations, and portfolios, as more useful for evaluating students and informing classroom instruction. Tobin (1998) asserted that assessment can be used to provide opportunities for students to show what they know. Reynolds et al. (1995) argued that for effective learning to occur, congruence must exist between instruction, assessment and outcomes. This paper represents a context-specific investigation of this congruence.

An effective assessment process should involve a two-way communication system between teachers and their students (Black & Wiliam, 1998). Historically, teachers have used testing instruments to transmit to students and their parents what is really important for the student to know and do. While this reporting tends to be in the form of a grade, the form and design of the assessment can send subtle messages about what is important. There has been a substantial amount of research into types of assessment but very little research into students’ perceptions of assessment (Black & Wiliam, 1998; Crooks, 1988; Plake, 1993; Popham, 1997) and how these perceptions relate to classroom learning environments.

2 Aim

The overall aim of the study was to investigate relationships among students’ perceptions of their assessment tasks, classroom learning environments, academic efficacy and attitude to science in years eight, nine and ten in Western Australia.

The objectives of this study were:

  1. to provide further validation data on the instrument for assessing students’ perceptions of assessment tasks;

  2. to investigate differences between students’ perceptions in terms of gender and year levels;

  3. to investigate associations between students’ perceptions of their assessment tasks and their attitude to science and academic efficacy outcomes; and

  4. to describe the form and design of assessment tasks used by exemplary science teachers.

3 Theoretical Framing

3.1 Use of Student Perceptual Data

Until the late 1960s, a very strong tradition of trained observers coding teacher and student behaviours dominated classroom research. Indeed, it was a key recommendation of Dunkin and Biddle (1974) that instruments for research on teaching processes, where possible, should deal with the objective characteristics of classroom events. Clearly, this approach to research, which often involved trained observers coding teacher and student behaviours, was consistent with the behaviourist approach of the 1960s. The study of classroom psychosocial environments in the late 1960s broke this tradition and used student perceptual data. Since then, the strong trend in classroom environment research has been towards this high-inference approach, with data collected from teachers and students. Walberg (1976) supported this methodological approach, in which student perceptions act as mediators in the learning process. Walberg (1976) also advocated the use of student perceptions to assess learning environments because students seemed quite able to perceive and weigh stimuli and to render predictively valid judgments of the social environments of their classes.

3.2 Classroom Learning Environment

The notion that a learning environment exists which mediates aspects of educational development began as early as 1936, when Lewin (1936) recognised that the environment and the personality of the individual were powerful determinants of behaviour and introduced the formula B = f(P, E). Since Lewin’s time, international research efforts involving the conceptualisation, assessment, and investigation of perceptions of aspects of the classroom environment have firmly established classroom environments as a thriving field of study (Fraser, 1994, 1998; Fraser & Walberg, 1991). For example, classroom environment research has focused on constructivist classroom environments (Taylor et al., 1997), cross-national constructivist classroom environments (Aldridge et al., 1999), science laboratory classroom environments (McRobbie & Fraser, 1993), computer laboratory classroom environments (Newby & Fisher, 1997), computer-assisted instruction classrooms (Stolarchuk & Fisher, 1999) and classroom environments and teachers’ cultural backgrounds (Koul & Fisher, 2006).

A great deal of classroom learning environment research has been carried out over the past 40 years. Evidence from these studies reveals that classroom learning environment dimensions are good indicators of teaching and learning processes and have predictive power for a number of learning outcomes, pointing towards the possibility of improving students’ outcomes through changing classroom environments (Fraser, 1994, 1998; Fraser & Walberg, 1991; Wubbels & Levy, 1993). The present interpretive study used a multi-method approach to explore factors associated with students’ perceptions of assessment.

3.2.1 Attitude to Science Classrooms

Students’ attitudes towards their science assessments are regarded as an important focus of the present study. Attitude towards science has been defined as “a learned disposition to evaluate in certain ways objects, people, actions, situations or propositions involved in learning science” (Gardner, 1975, p. 2). This learned disposition refers to the way students regard science, such as interesting, boring, dull or exciting. Positive student attitudes are then measured by the degree of motivation and interest reported by the students. Klopfer (1971, 1976) went further and developed a structure for evaluating attitudes related to science education. He included four categories in his structure: events in the natural world; activities; science; and inquiry. Klopfer’s (1976) second category, relating to students’ attitudes towards their science assessments, was a focus of the present study.

3.2.2 Academic Efficacy

Over the past two decades, the broad psychological concept of self-efficacy has been a subject of interest (Bandura, 1997; Schunk, 1995). Within this field, one particularly strong area of interest is that of academic efficacy, which refers to personal judgments of one’s capabilities to organize and execute courses of action to attain designated types of educational performances (Zimmerman, 1995). Research studies have provided consistent, convincing evidence that academic efficacy is positively related to academic motivation (e.g., Schunk & Hanson, 1985), persistence (Lyman et al., 1984), memory performance (Berry, 1987), and academic performance (Schunk, 1989).

3.2.3 Gender and Year Level

It is well documented in reviews of literature that women are under-represented in science and technology courses and careers (Commonwealth of Australia, 2019; Greenfield, 1996; Kahle & Meece, 1994) and that boys have outperformed girls in science, especially physical science (Casad et al., 2018; Beller & Gafni, 1996; Kahle & Meece, 1994; Murphy, 1996). Among the sources that may cause these differences are individual, cognitive, attitudinal, socio-cultural, home and family, and educational variables (Farenga & Joyce, 1997; Kahle & Meece, 1994). In the classroom context, boys and girls may not have equal opportunities in science activities, and this could cause gender differences in science achievement (Fraser et al., 1992; Harding, 1996; Warrington & Younger, 1996). Because educational variables are one of the important sources accounting for gender differences in students’ achievement in science, and in participation in science activities, the perspective of gender differences needs to be understood. Previous studies have reported gender-related differences in students’ perceptions of the learning environment (Fraser et al., 1996; Koul & Fisher, 2006). Therefore, in keeping with these lines of research, gender-related differences in students’ perceptions of their assessment were explored in this study.

In addition to gender differences, other learning environment research in science classrooms has indicated differences between the perceptions of students in different years of school (Kim et al., 2000). In this study, differences between the perceptions of students in different years of lower secondary school were examined for trends.

4 Instruments and Procedure Used

The study was carried out in phases over a period of three years using a multi-method research approach:

  1. In the first phase, a pre-existing and validated questionnaire, Perceptions of Assessment Tasks (PAT), a six-scale instrument of 55 items developed by Schaffner et al. (2000), was administered to 470 students from grades eight, nine and ten in 20 science classrooms in three Western Australian schools. Students in this study were between the ages of 12 and 15 years. Closed-ended interviews were conducted with 40 randomly selected students to examine student perceptions of their assessment tasks.

  2. In the second phase, refinement decisions for the PAT, based on internal consistency reliability data and exploratory factor analysis, resulted in a five-scale instrument that was named the Student Perceptions of Assessment Questionnaire (SPAQ). This study was part of a larger study carried out in three states of Australia. The SPAQ was used with an attitude scale and a self-efficacy scale. This survey was administered to a larger sample of 960 students from 41 science classes in the same grades as in the first phase.

  3. In the final phase of the study, five teachers, identified on the basis of their students showing the most positive perceptions on the scales of the SPAQ, were interviewed and their teaching observed. Informal interviews were also conducted with students from the classes identified.

Students’ Perceptions of Assessment Questionnaire (SPAQ)

Students’ perceptions of assessment were assessed with the 30-item SPAQ. These items are assigned to five internally consistent scales, namely Congruence with Planned Learning, Authenticity, Student Consultation, Transparency and Diversity. Table 15.1 shows the scales, descriptions and sample items from the SPAQ. Validation statistics performed on the data collected are presented in the results section. Responses to each SPAQ item were recorded on a four-point Likert-type response format (Almost Never, Sometimes, Often, and Almost Always).
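To make the scoring concrete, the following is a minimal sketch of how a scale item mean, the statistic reported in the results tables, could be computed from the four response options. The numeric coding of 1 to 4 is an assumption (the chapter states it only for the two outcome scales), and the names `LIKERT` and `scale_item_mean` and the six sample answers are hypothetical.

```python
# Assumed numeric coding of the four SPAQ response options.
LIKERT = {"Almost Never": 1, "Sometimes": 2, "Often": 3, "Almost Always": 4}


def scale_item_mean(answers):
    """Average the coded responses of one student to the items of one scale."""
    scores = [LIKERT[answer] for answer in answers]
    return sum(scores) / len(scores)


# Hypothetical answers of one student to the items of a single scale.
answers = ["Often", "Often", "Almost Always", "Sometimes", "Often", "Almost Never"]
print(round(scale_item_mean(answers), 2))
```

Dividing by the number of items keeps every scale on the same 1 to 4 range, so means can be compared across scales with different numbers of items.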

Table 15.1 Description and example of items for each scale of the Students’ Perceptions of Assessment Questionnaire (SPAQ), attitude scale and academic efficacy

Two outcome scales, namely Attitude to Science and Academic Efficacy, were also employed in the present study. A review of literature revealed a large pool of science-related attitude scales. Of particular interest to this study is the Test of Science Related Attitudes (TOSRA) developed by Fraser (1978) to measure students’ attitudes towards their science classes. Fraser based the subscales of this instrument on Klopfer’s (1976) taxonomy of the affective domain related to science education. Attitude to Science was assessed on an 8-item scale adopted from the Test of Science-Related Attitudes (TOSRA: Fraser, 1981). Responses were recorded on a four-point format ranging from 1 (Disagree) to 4 (Agree).

Perceived Academic Efficacy refers to students’ judgments of their ability to master academic tasks that they are given in their classrooms. A 6-item scale using items developed by Midgley and Urdan (1995) was used to assess perceived academic competence at science class work. Items were modified to elicit a response on academic efficacy in science. All items in the academic efficacy scale had a four-point response format with anchors of 1 (Disagree) and 4 (Agree).

5 Results

Results of the study are presented for each of the research objectives in turn:

5.1 Objective 1: Validation Data on the Instrument for Assessing Students’ Perceptions of Assessment Tasks

A principal components factor analysis followed by varimax rotation confirmed a refined structure of the SPAQ instrument comprising 30 items in 5 scales and 14 items in two outcome scales. All 44 items had a loading of at least 0.40 on their a priori scales (see Table 15.2). The percentage of the total variance extracted with each factor is also recorded at the bottom of Table 15.2. The percentage of variance varies from 3.55% to 26.03% for different scales, with the total variance accounted for being around 50%.

Table 15.2 Factor loadings for the questionnaire used in the study

The validity and reliability information of the instrument developed in this study are presented in Table 15.3.

Table 15.3 Scale Mean, Standard Deviation, Internal Consistency (Cronbach Alpha Reliability) and ability to differentiate between classrooms (ANOVA Results) for the SPAQ, attitude to science and academic efficacy

To determine the degree to which items in the same scale measure the same aspect of students’ perceptions of assessment tasks, attitude to science and academic self-efficacy, a measure of internal consistency, the Cronbach alpha reliability coefficient (Cronbach, 1951), was used. For the scales of the SPAQ, the highest alpha reliability was 0.83 for the Authenticity scale and the lowest was 0.63 for the Diversity scale. The Attitude to Science scale had an alpha reliability of 0.85 and the Academic Efficacy scale 0.90. Since all the reliabilities for the scales of the SPAQ were at least 0.63, the instrument was considered reliable for use (DeVellis, 1991).
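As an illustration of the internal consistency measure, the sketch below computes the Cronbach alpha coefficient from its standard definition, alpha = (k / (k − 1)) × (1 − sum of item variances / variance of total scores), where k is the number of items in the scale. It is not the analysis code used in the study; the function name `cronbach_alpha` and the sample responses are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of rows, one per student; each row holds that
    student's coded scores (e.g. 1 to 4) for the k items of the scale.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across students.
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of the students' total scores on the scale.
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five students to a four-item scale.
demo = [
    [4, 3, 4, 4],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
]
alpha = cronbach_alpha(demo)
print(round(alpha, 2))
```

Alpha approaches 1 when students who score high on one item also score high on the others, which is why it is read as the degree to which the items of a scale measure the same construct.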

Mean scores ranged from 2.16 for the Student Consultation scale to 3.17 for the Congruence with Planned Learning scale on a four-point Likert-type scale, suggesting that students generally have a positive perception of their assessment tasks. The Student Consultation scale had the lowest mean score, suggesting that students generally do not have a say in their assessment tasks.

The overall culture of each class is different, and the ability of the SPAQ to differentiate between the classes in the study was considered important. The instrument’s ability to differentiate in this way was measured using one-way analysis of variance (ANOVA). The η² statistic was calculated to provide an estimate of the strength of the association between class membership and the dependent variables, as shown in Table 15.3. The η² statistic for the SPAQ indicates that the proportion of variance in scores accounted for by class membership ranged from 0.12 to 0.28 and was statistically significant (p < 0.001) for all scales. It appears that the instrument is able to differentiate clearly between the perceptions of students in different classrooms.
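The η² statistic can be illustrated with a short sketch: it is the between-class sum of squares divided by the total sum of squares from a one-way ANOVA with class membership as the factor. The function name `eta_squared` and the two small classes below are hypothetical and serve only to show the calculation.

```python
def eta_squared(groups):
    """Proportion of score variance accounted for by group membership.

    groups: dict mapping a class identifier to the list of scale scores
    of the students in that class.
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    # Total variability of all students around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Variability of the class means around the grand mean,
    # weighted by class size.
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return ss_between / ss_total


# Hypothetical scale scores for two classes.
classes = {"A": [1, 2, 3], "B": [3, 4, 5]}
print(eta_squared(classes))
```

A value of 0.12, the lowest reported for the SPAQ scales, would mean that 12% of the variation in students' scores on that scale lies between classes rather than within them.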

5.2 Objective 2: Differences Between Students’ Perceptions in Terms of Gender and Year Levels

5.2.1 Gender Differences

Differences in students’ perceptions on the scales of the SPAQ according to gender were analysed. The gender differences in students’ perceptions of classroom learning environment were examined by splitting the total sample into the female (388) and male (572) students involved in the study.

To examine the gender differences in students’ perceptions of the classes, the within-class gender subgroup mean was chosen as the unit of analysis, as this aims to eliminate the effect of class differences arising from males and females being unevenly distributed in the sample. In the data analysis, male and female students’ mean scores for each class were computed, and the significance of gender differences was analysed using an independent t-test. Table 15.4 shows the scale item means, male and female differences, standard deviations, t-values and Cohen’s d effect sizes. The purpose of this analysis was to establish whether there are significant differences in the perceptions of students according to their gender.
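The procedure described above can be sketched as follows, under stated assumptions: the t statistic uses the pooled-variance (equal variances) form of the independent t-test and Cohen's d uses the pooled standard deviation, neither of which is specified in the chapter. The function names and the sample records are hypothetical, and the sketch stops at the t statistic rather than computing a p-value.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def subgroup_means(records):
    """Within-class gender subgroup means.

    records: iterable of (class_id, gender, score) tuples, one per student.
    Returns a dict mapping each gender to its list of class-level means.
    """
    cells = defaultdict(list)
    for class_id, gender, score in records:
        cells[(class_id, gender)].append(score)
    by_gender = defaultdict(list)
    for (class_id, gender), scores in sorted(cells.items()):
        by_gender[gender].append(mean(scores))
    return dict(by_gender)


def pooled_t_and_d(a, b):
    """Pooled-variance t statistic and Cohen's d for two independent groups."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    diff = mean(a) - mean(b)
    t = diff / sqrt(pooled_var * (1 / n1 + 1 / n2))
    d = diff / sqrt(pooled_var)
    return t, d


# Hypothetical student scores on one scale in three classes.
records = [
    ("c1", "F", 2), ("c1", "F", 4), ("c1", "M", 3),
    ("c2", "F", 1), ("c2", "M", 3), ("c2", "M", 5),
    ("c3", "F", 2), ("c3", "M", 4),
]
means = subgroup_means(records)
t, d = pooled_t_and_d(means["M"], means["F"])
print(means, round(t, 2), round(d, 2))
```

Because each class contributes one male mean and one female mean, a class with many students of one gender does not dominate the comparison.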

Table 15.4 Item mean and standard deviation for gender differences in students’ perceptions on the scales of SPAQ

As can be seen in Table 15.4, of the five scales of the SPAQ and the two outcome scales, the difference between the perceptions of males and females was statistically significant only on the Authenticity scale. The result indicates that male students reported higher Authenticity than female students.

5.2.2 Year Level Differences

One of the aims of the study was to investigate differences between students from different year levels in their perceptions on the scales of the SPAQ and the two scales of attitude and efficacy. This was explored by splitting the students into their year groups (year 8 = 347, year 9 = 328, year 10 = 285).

The results of the analyses are shown in Table 15.5. In the data analysis, mean scores for each of the three year groups were computed. Table 15.5 shows the scale item means and F values for the scales of the SPAQ for students from the three year groups in the study. The purpose of this analysis was to establish whether there are significant differences in the perceptions of students according to their year groups.

Table 15.5 Item Mean, Item Standard Deviation and ability to differentiate between levels (ANOVA results) for year level differences in students’ perceptions measured by the SPAQ

As can be seen in Table 15.5, the differences between year levels were statistically significant on five of the seven scales (the SPAQ scales and the two outcome scales), indicating that year level is significantly related to students’ perceptions of their assessment. Tukey’s post hoc test (p < 0.05) revealed that Year Eight students had statistically significantly higher means on the Congruence with Planned Learning scale, while Year Ten students had the highest means on the Diversity scale.

Table 15.6 Associations between scales of SPAQ and attitude to science in terms of simple correlations (r), multiple correlations (R) and standardized regression coefficient (β)

5.3 Objective 3: Associations Between SPAQ and Attitude to Science and Academic Efficacy

One of the aims of the study was to investigate associations between students’ perceptions of assessment tasks and their attitude to science classes. These associations were explored using simple and multiple correlation analyses. The results of the analyses are shown in Table 15.6. For all the scales of the SPAQ, the simple correlations with attitude to science are positive and statistically significant.

The multiple correlation (R) between the set of SPAQ scales and attitude to science class was 0.55. The R² value, which indicates the proportion of variance in attitude to science class that can be attributed to students’ perceptions of the assessment tasks given by their teachers, was 0.30 (30%). To determine which SPAQ scales contributed most to this association, the standardized regression coefficient (β) was examined for each scale. It was found that the scales of Congruence with Planned Learning, Authenticity, Transparency and Diversity were positively and significantly associated with attitude to science, whereas the scale of Student Consultation was negatively and significantly associated with it.
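The relationship between simple correlations, standardized regression coefficients and the multiple correlation can be sketched as below. With standardized variables, the β values solve the system formed by the predictor intercorrelation matrix and the predictor-outcome correlations, and R² equals the sum of each β multiplied by its simple correlation; as a check on the reported figures, 0.55² ≈ 0.30. This is an illustrative sketch, not the study's analysis code; all function names and data are hypothetical, and no significance tests are computed.

```python
from math import sqrt


def pearson(x, y):
    """Simple (Pearson) correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_regression(predictors, outcome):
    """Return simple correlations, betas, multiple R and R squared.

    predictors: list of columns, one list of scores per scale.
    outcome: list of outcome scores (e.g. attitude to science).
    """
    r_xy = [pearson(x, outcome) for x in predictors]
    r_xx = [[pearson(a, b) for b in predictors] for a in predictors]
    betas = solve(r_xx, r_xy)
    r_squared = sum(beta * r for beta, r in zip(betas, r_xy))
    return r_xy, betas, sqrt(max(r_squared, 0.0)), r_squared


# Hypothetical data: two uncorrelated predictors and an outcome
# constructed as 2 * x1 + x2, so the fit is exact.
x1 = [1, -1, 1, -1]
x2 = [1, 1, -1, -1]
y = [3, -1, 1, -3]
r_xy, betas, multiple_r, r_squared = standardized_regression([x1, x2], y)
print([round(b, 3) for b in betas], round(r_squared, 3))
```

When predictors are correlated with one another, as questionnaire scales usually are, a scale can have a positive simple correlation with the outcome and still receive a negative β, because β describes its contribution once the other scales are held constant.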

5.4 Objective 4: Describe the Form and Design of Assessment Tasks Used by Exemplary Science Teachers

Based on the findings of the quantitative data, five exemplary teachers (three male and two female) were identified from the total sample of 40; their teaching was observed and informal interviews were conducted with them. These five teachers represented private, public and rural schools in Western Australia. The selected teachers had been rated by their students more than one standard deviation above the mean on at least three of the five scales. This process has been described previously by Waldrip et al. (2009).
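The selection rule can be sketched as follows. The chapter does not state whether the mean and standard deviation were taken across class means or across individual students, so this sketch assumes class means and the sample standard deviation; the function name `exemplary_classes` and the five classes are hypothetical.

```python
from statistics import mean, stdev


def exemplary_classes(class_means, min_scales=3, threshold_sd=1.0):
    """Identify classes rated well above average on several scales.

    class_means: dict mapping a class identifier to a list holding the
    class mean on each scale (same scale order for every class).
    Returns the classes whose mean exceeds the overall mean by more
    than threshold_sd standard deviations on at least min_scales scales.
    """
    ids = list(class_means)
    n_scales = len(class_means[ids[0]])
    counts = {cid: 0 for cid in ids}
    for s in range(n_scales):
        column = [class_means[cid][s] for cid in ids]
        # Assumption: cutoff computed across class means for this scale.
        cutoff = mean(column) + threshold_sd * stdev(column)
        for cid in ids:
            if class_means[cid][s] > cutoff:
                counts[cid] += 1
    return [cid for cid in ids if counts[cid] >= min_scales]


# Hypothetical class means on the five SPAQ scales.
demo_classes = {
    "A": [2, 2, 2, 2, 2],
    "B": [2, 2, 2, 2, 2],
    "C": [2, 2, 2, 2, 2],
    "D": [2, 2, 2, 2, 2],
    "E": [4, 4, 4, 2, 2],
}
print(exemplary_classes(demo_classes))
```

Requiring the threshold on several scales rather than one guards against selecting a teacher on the strength of a single unusually high rating.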

Furthermore, four students from the classes of each of the five selected teachers also were interviewed. The students’ interviews were structured and conducted in three phases on the same day. The interview phases occurred before, during and after an activity in the classroom. Similar questions regarding the activity were asked to assess students’ initial perceptions about the task, during the task and when the task was completed.

The students were asked a few general questions followed by questions relating to each of the five scales of the SPAQ. This approach enabled the researcher to draw on a variety of paradigms to inform their interpretation in a bid to explain the positive student perceptions of assessment tasks. The interview schedule, along with stages and scales, is presented in Table 15.7.

Table 15.7 Student interview schedule

The results which emerged from the interviews with teachers and students are presented in the next section.

Learning and Assessment

Interviews and observations indicated that the exemplary teachers were using constructivist ways of teaching that underpin formulations of formative assessment (Sadler, 1989). As supported by the quantitative results, students of these teachers had very positive perceptions of the assessment practices employed by their teachers, and it was observed that social interactions within these classes were generally very strong. Assessment practices employed by these teachers looked not only at what students know, but also at developing student identities as capable and competent learners. These teachers took into consideration what, why, and how students were learning, and showed a shift in their views of assessment in science by keeping themselves informed about the changing nature of the outcomes of science education. Some of the comments supporting these claims are:

Teacher:

I formulate assessments very early in the year keeping science intended outcomes in view. My assessments are designed to let me know what students know, not what they do not know. Thus, assessment becomes a part of learning.

Student 1:

Beginning of each term he gave us details of all the assessment and what is expected of us. This approach gives us clear guidelines for learning. After the assessment is evaluated, often, he runs a session on our misconceptions.

Student 5:

We knew in the beginning of the term that we are required to make an information poster or pamphlet or flyer regarding the infectious diseases. I kept on collecting the related information and stuff you know… It was easy to compile all the information close to the date of submission

Student 8:

Since we know what is required and even expected from us…we learn accordingly.

Student 11:

If I decide that I want good marks, we have to work for it. I cannot say what the mark should be but, If I have worked according to teachers guidelines I am sure that my work will get a high mark.

Curriculum and Assessment

When interviewed, the teachers commented on the way they considered assessment and curriculum to be related and to interact in complex ways. They believed that a well-perceived curriculum that incorporates assessment also narrows the gap between the intended and implemented curriculum, resulting in an achieved curriculum. Exemplary teachers also researched and used the available relevant assessment resources. Typical of their comments were:

Teacher:

I do not separate assessment from the curriculum. Both are different but lead to same object-student/teacher learning. It is complex but once understood can be practiced successfully. These days there are lot of ideas and materials available.

Student 4:

She tells us what will be asked to do. So we prepare accordingly. She also gives us an evaluation criterion for each assessment.

Student 9:

You know this was different. I exactly knew what is required in this project. It turned out to be the biggest project I had ever done.

Student 14:

The last work sheet he gave us was confusing to start with. Lot of application mixing topics in machines, light and heat. I thought about it and did well.

Classroom and Assessment

The exemplary teachers believed that there is a need to recognise the roles and responsibilities of both teachers and students. This view resonates with Sadler’s (1989) view that formative assessment is based on the principle that students need to become consumers as well as the objects of assessment activities. This sociocultural view of learning enhances positive classroom interactions. Assessments also reflect a power relationship in the classroom: the teacher questions and students respond. However, in an exemplary teacher’s class, the teacher provides enough resources for students to respond to the questions and create knowledge. These resources could be books, the World Wide Web, peers or other resource persons.

Teacher:

I provide many resources to students so that they can research and find answers to the investigations we do. It is interesting to see how many resources students find on their own and enter classroom with different world views.

Student 3:

Teacher directs us to the reading material. We also do lot of web surfing. I find many useful links on YouTube.

Student 7:

Last night when I was chatting with my friends on Facebook [internet interaction site] we looked at viruses, bacteria, protozoa, worms and fungi. That was cool. We all learnt a lot about the lesson we are doing in class.

Student 10:

First, I thought we are not going to learn much in this year’s science unit. It seemed he was boring. I had not done much research. Now that we have started researching and we find the importance of substances like the mining in up north. We get the crude material and useful things come out of that.

Teachers and Assessment

Although these selected teachers had emancipatory views about assessment and generally stood apart from their counterparts, they were concerned about the external influences on them. They felt answerable to various stakeholders, namely students, parents, administrators and the community at large. To establish their accountability, their students had to perform well in national and international science tests, and the teachers could use these test results as evidence of their own effectiveness. The teachers also believed that knowledge of and expertise in various assessment activities is mandatory for all science teachers, who need to have an in-depth understanding of the topic being taught and of students’ existing knowledge. The exemplary teachers recommended that this can be achieved through planning of the course content, which should include teaching, learning, assessment and curriculum and their interrelationship.

Teacher:

I feel responsible for student learning. I am answerable for the student learning and on top of that we have science Olympiads, national testing and international testing. It is complex.

Student 2:

He knows his stuff well and also how to teach. For example last topic on renewable energy he talked about many ways, how energy can be renewed and also conserved. It was great. I enjoyed the lesson and writing the project. With the result I got good grade.

Student 16:

She is through with the content of all the lessons.

Student 15:

Later this year we will be writing the international science test and she wants us to do well in that. This is a science extension class and many of us also participate in science Olympiads.

Students and Assessment

The final section of this study identified the students as active and intentional participants in classroom assessment practices. Cowie (2005) highlights the multiple consequences of classroom assessment for students: the importance of trust and respect, the influence of their goals and learning motivations, and equity issues. Our study found parallels with each of these factors. Continued teacher support and a positive classroom learning environment contribute towards what students consider important to learn. Mutual trust and respect among teachers and students are central to student learning. Students should believe that assessments are designed to help them and should view assessment as a joint teacher-pupil responsibility.

Teacher:

I have to be very careful about what I speak in classroom. I try to look at students positive points and build on that. I tend to add plurality in the assessments we (students and I) design. This gives all students from different cultural backgrounds and ability levels to demonstrate their learning. It also keeps them interested in science.

Student 6:

What I love about our teacher is the respect and belief she has for us. She designs assessments which she is confident that we have learnt and can do well. Last assignment when she thought that I could improve upon it, she talked privately and respectfully to me. I am learning, and that is her job.

Student 18:

During the question/answer session every student has equal chance of being asked for a response. He will only ask those students who have raised their hands. In the class (while teaching) he never shows individual preference.

Student 12:

We are free to do our assignments the way we want. We don’t get a choice on the things where teacher has already planned an activity and if we change it would affect our learning.

6 Discussion and Conclusion

This study further validated an instrument, the Students’ Perceptions of Assessment Questionnaire (SPAQ), for use in educational settings. The three-stage data collection facilitated in-depth insights into students’ perceptions of assessment and into how students saw assessment as an integral part of learning that plays a significant role in teacher and student behaviours in the classroom (Cowie, 2005). The questionnaire, which uses student perceptual data (Walberg, 1976), showed acceptable factor loadings with 30 items in five scales, and Cronbach alpha reliability scores ranged from 0.63 to 0.83 (DeVellis, 1991), thus making these scales acceptable for future use. By making use of students’ perceptions of assessment tasks, the study added to the limited body of research in this area (Black & Wiliam, 1998; Crooks, 1988; Plake, 1993; Popham, 1997).

Of the five scales of the questionnaire, the lowest mean score was recorded for the scale of Student Consultation, which suggests that students generally are not consulted when decisions are made about the types of assessment and are not involved in a two-way communication between teachers and students (Black & Wiliam, 1998). The SPAQ’s ability to distinguish between classes was also established, which was an important contribution of the study. Additionally, the scales of attitude to science and academic efficacy were further validated. High mean scores for the scale of Attitude to Science describe students’ positive attitude towards science assessments and are in tune with Klopfer’s (1976) second category of the structure for evaluating attitudes. Students also demonstrated a very high perception of academic efficacy, suggesting that these students are likely to have high academic motivation (Schunk & Hanson, 1985), persistence (Lyman et al., 1984), memory performance (Berry, 1987), and academic performance (Schunk, 1989).

For gender, a statistically significant difference was found only on the Authenticity scale at p < 0.05; for the other four scales of the SPAQ and the two attitudinal scales, no statistically significant differences were recorded. These findings are in conflict with earlier research claims that boys outperformed girls in science, especially physical science (Casad et al., 2018; Beller & Gafni, 1996; Kahle & Meece, 1994; Murphy, 1996). This could be place specific, wherein equal opportunities were being provided to all students in the classroom irrespective of their gender (Fraser et al., 1992; Harding, 1996; Warrington & Younger, 1996). In contrast to the results for gender, statistically significant year level differences were found on five of the seven scales, with Year 8 students tending to record higher mean scores and Year 10 students lower ones, although Year 10 students had the highest means on the Diversity scale. The trends in year level differences are consistent with the findings from similar studies (Kim et al., 2000; Koul & Fisher, 2006).

It was found that student perceptual data can be used to identify exemplary teachers and that the SPAQ was a valid instrument to use for this purpose. The exemplary teachers were identified as those who scored more than one standard deviation above the mean on at least three of the five scales of the SPAQ. This resonates with the constructivist view of learning wherein goals are clearly stated, students are provided with focused feedback and they are also involved in self- and peer assessment (Maulana et al., 2021; Muijs & Reynolds, 2017, p. 1; Sadler, 1989).

Qualitative data added a rich new layer of understanding to the knowledge gained through the quantitative data. While developing the SPAQ, different dimensions of assessment were identified, namely Congruence with Planned Learning, Authenticity, Student Consultation, Transparency and Diversity. Observations and interview data identified the same dimensions within different sections of the assessment process. The identified sections, namely learning, curriculum, classroom, teachers, and students, are integral parts of assessment. The identified exemplary teachers were using constructivist ways of teaching that underpin formulations of formative assessment (Sadler, 1989). The qualitative data identified the importance and role of involving students in assessment tasks in a way that leads to their learning.

Assessment for learning has emerged as a central theme in this study. The identified exemplary teachers were found to be very thorough in their teaching, gave students enough time to prepare for an assessment, allowed students freedom to choose from a variety of assessments and were flexible in teaching and assessment. They also demonstrated an in-depth understanding of the science topics they were teaching.

This study demonstrates that scales of learning environment can be used in complex studies where many interrelated variables are assessed. By identifying good science teachers and describing what they do in their classrooms, we have an opportunity to use this information in professional development of other interested teachers. This is one of the ways to bring about desired changes in the educational system.