Background

Alternative assessment asks students to show what they can do; that is, students are evaluated on what they integrate and produce rather than on what they are able to recall (Macias, 1995, cited in Coombe et al., 2007). As one of the main forms of alternative assessment, peer assessment has gained much importance in education and educational research. It has been defined as "an arrangement in which individuals consider the amount, level, value, worth, quality, or success of the products or outcomes of learning of peers of similar status" (Topping, 1998, p. 250). It is "the process of having the readers critically reflect upon, and perhaps suggest grades for the learning of their peers" (Roberts, 2006, p. 80), and it may also involve assessors being judged on the quality of the appraisals they make (Davies, 2006).

Assessment is critical in any instructional operation; both teachers and learners need to be involved in and have control over the assessment methods, outcomes, and their underlying rationale (Cheng and Warren, 2005). When it comes to assessing students’ writing in EFL contexts, especially in traditional teacher-centered classrooms, incorporating peer assessment as a learning tool (Lindblom-Ylänne et al., 2006) alongside the usual teacher assessment can not only change learners’ perspectives on various types of assessment but may also lead to outcomes at least as good as teacher assessment and sometimes better (Topping, 1998). Students’ attitudes toward peer assessment vary: some find it helpful, beneficial, enjoyable and challenging, whereas others feel threatened or unnerved by the subjectivity of assessment or fail to develop confidence in acting fairly as an assessor (Sambell et al., 1997), which indicates that levels of acceptability differ (Topping, 1998). Because "few student evaluations of peer assessment are reported" (Falchikov, 1995, p. 177), findings on students’ attitudes toward this practice remain mixed and inconclusive.

The impact of peer assessment on language learning is promising, but its efficacy seems to depend on many factors, including students’ attitudes, language levels, familiarity with the assessment criteria, the type of skill being assessed, and the possible presence of biases such as gender and friendship. In line with previous studies, although not aiming to review or replicate the extensive literature on peer assessment, this study was conducted to shed light on the status of peer assessment in an EFL context where teacher-centered classes are the norm. The differences between teacher and peer ratings are considered, as is the existence of any friendship bias, which has received little attention in previous research. Moreover, the study examines how this type of assessment may influence the perspective of learners at the tertiary level.

Alternative assessment

The importance of assessment as an integral part of the teaching-learning cycle is apparent to many educationalists. Assessment has changed along with the theories and models of learning. Constructivist teaching and learning have brought assessment to the center, and it no longer serves merely as a form of measurement tied to the traditional curriculum. The teacher is no longer the center of assessment; instead, students work hand in hand with teachers in an interactive type of assessment (Wikstorm, 2007). According to modern theories, which treat assessment as part of the learning and teaching process rather than as an end-of-course evaluation of student achievement, assessment is becoming the process of describing students' performance. Modern views of curriculum and constructivist learning theories have called for a new type of assessment that can be used as part of instruction to help learners acquire knowledge and thereby promote their understanding. In light of these developments in learning theories, teachers open up discussion of assessment with students; this presents a major challenge for assessment in the 21st century because it requires teachers to acquire the specific skills needed for this new, additional role. The process of learning should be assessed by more intense, interactive methods, and that work should be undertaken in collaboration, either between teacher and student or among a group of peers (Wikstorm, 2007).

Coombe et al. (2007) propose several types of alternative assessment that can be used in today's language classrooms with great success: self-assessment, portfolio assessment, student-designed tests, learner-centered assessment, projects, and presentations. Similarly, Cheng and Warren (2005) believe that there are several approaches to classroom assessment, such as performance assessment, portfolio assessment, and self- and peer assessment. They specify that teachers play a major role in traditional pen-and-paper and performance assessment, whereas self- and peer assessments are more student-centered. The latter allow students to participate in the evaluation and provide opportunities for observation and modeling, which help them scrutinize themselves and adjust their performance.

Peer assessment in EFL contexts

Surveying the literature in the EFL context, Cheng and Warren (2005) found that peer assessment has been more commonly incorporated into English language writing instruction, where peers respond to and edit each other's written work with the aim of helping with revision. Some of the examples they cite include Hogan (1984), Birdsong and Sharplin (1986), Lynch (1988), Devenney (1989), Jacobs (1989), Rothschild and Klingenberg (1990), Rainey (1990), Bell (1991), Mangelsdorf (1992), Murau (1993), Caulk (1994), Mendonca and Johnson (1994) and Jones (1995).

Findings suggest that student writers selectively take account of peer comments when they revise, preferring to depend more on their own knowledge. Student writers may not always trust their peers, but the same comment from a teacher will be taken into account when they revise (Mendonca and Johnson, 1994). Reviewing the literature on the outcomes of peer assessment of writing, Topping (1998) found that it "appears capable of yielding outcomes at least as good as teacher assessment and sometimes better" (p. 262). Mangelsdorf (1992) reports that peer reviews were always rated negatively by Asian students, and raises the question of the effect of teacher-centered cultures on the way students regard peer comments. However, the merits attributed to peer assessment cannot be ignored. Its reported benefits include being an effective tool in both group and individual projects (Matsuno, 2009), encouraging reflective learning through observing others' performances and awareness of performance criteria (Saito, 2008), and offering immediate support in the classroom, gains for both the assessor and the assessed, and an individualized and interactive form of assessment (Black and William, 1998).

Peer assessment in writing

Peer evaluation plays an important role in both first (L1) and second language (L2) writing classrooms. It allows writing teachers to help their students receive more feedback on their papers and gives students practice with a range of skills important in the development of language and writing ability, such as meaningful interaction with peers, greater exposure to ideas, and new perspectives on the writing process. Peer involvement creates opportunities for interaction and may increase objectivity in assessment. If learners are put in a situation where they can access information about the quality and level of their peers' performances as well as their own, they may be able to clarify their own understanding of the assessment criteria (either set by students themselves or by the teacher) and, more importantly, of what is required of them (Patri, 2002). What seems to be important is that students use clearly defined guidelines to evaluate each other's work, so checklists listing the points to be assessed are very useful. Although the grades may be generated by students, "the teacher should … reserve the right to make adjustments if necessary" (Kearsky, 2000, cited in Roberts, 2006, p. 91). When students are trained in how to give and use feedback (Min, 2006), peer evaluation can be extremely effective. Teachers can incorporate it as a way to present writing skills to students, ideally creating a student-centered classroom with learners capable of critically evaluating their own written work. Peer review sessions can teach students important writing skills, such as writing to a real audience, seeing ideas and points of view other than their own (Paulus, 1999), and discussing how to revise writing effectively.

Methods

The study

Given the importance of peer assessment and its impact on language skills, and considering students' attitudes towards it, the main research questions were formally stated as follows:

  1. How similar are teacher and peer ratings of students' English compositions?

  2. Do students favor peer assessment?

  3. Does friendship affect peer rating?

Participants

The 26 homogeneous subjects of the present study were selected from an initial group of 38 Iranian university students of English literature during the second educational semester of 2009. They were 24 females and 2 males, ranging in age from 19 to 27 with a mean age of 21, who were in their sixth term of study and were taking their essay writing course with the researcher.

Instruments

The following sets of tasks and tests were employed in this study:

The intermediate Nelson Language Proficiency Test (1977). A modified version of the original Intermediate Nelson Proficiency Test with the reliability of 0.83 in piloting was used. It consisted of two parts: a cloze passage and 40 discrete-point items.

The writing checklist. To score the subjects' compositions, Jacobs et al.'s (1981, cited in Hughes, 2003) writing scale, which follows an analytical (objective) procedure, was used. According to this scale, five factors were considered in every composition: 1. Content, 2. Organization, 3. Vocabulary, 4. Language Use, and 5. Mechanics. All the subjects received 5 sub-scores (at most 4 for each part), and the maximum total grade was 20.

Pre- and post-questionnaires. Two questionnaires were constructed and, before being distributed among the participants, validated by consulting several experts in the field about the items. One questionnaire was administered at the beginning and the other at the end of the writing course. These questionnaires were used to evaluate the subjects' attitudes toward peer assessment. They included 9 questions on a 3-point Likert scale with responses of yes, no, and not sure in the pre-questionnaire and yes, no, and to some extent in the post-questionnaire. The reliability of each questionnaire was calculated and turned out to be 0.7 and 0.76 for the pre- and post-questionnaires, respectively.

The writing tasks. In three successive sessions, 3 general topics were offered for participants to write about and to submit in the following sessions to be evaluated analytically by their peers, the class teacher (the researcher), and two other teachers.

Procedure

In order to achieve the desired results, the researcher undertook the following procedure. At the beginning of the semester, after students had consented to take part in the study, the modified intermediate Nelson Proficiency Test was administered to the whole population. The descriptive report taken from SPSS on the mean and standard deviation of the scores was used to decide on the final homogeneous group. In the following session, students filled in the first questionnaire about their attitudes toward peer assessment; they were also asked to write the names of three of their most intimate friends in the same class. The names were used to draw a sociogram and to analyze and display sets of relationships in order to discover mutual intimate friendships among students. Prior to the assessment program, this procedure led to the identification of the friend and non-friend peers who had to mark other students' compositions in the following sessions.
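
The friendship-identification step described above can be illustrated with a minimal sketch. It assumes that a "mutual intimate friendship" means two students each named the other among their three friends; the study does not spell out this rule, and the student labels and nominations below are invented for illustration only.

```python
def mutual_friends(nominations):
    """Return the set of unordered pairs in which both students named each other.

    `nominations` maps each student to the set of classmates they named.
    (Hypothetical data structure; the study drew a sociogram by hand or tool.)
    """
    pairs = set()
    for student, named in nominations.items():
        for friend in named:
            # A pair counts only if the nomination is reciprocated.
            if friend != student and student in nominations.get(friend, set()):
                pairs.add(frozenset((student, friend)))
    return pairs


# Invented nominations for five students, three names each.
nominations = {
    "S1": {"S2", "S3", "S4"},
    "S2": {"S1", "S3", "S5"},
    "S3": {"S2", "S4", "S5"},
    "S4": {"S1", "S2", "S5"},
    "S5": {"S1", "S3", "S4"},
}

for pair in sorted(sorted(p) for p in mutual_friends(nominations)):
    print(pair)
```

Under this assumption, any classmate outside a student's reciprocated pairs could serve as a non-friend rater for that student.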

To mark the compositions, the ESL composition profile by Jacobs et al. (1981, cited in Hughes, 2003) was first introduced. This profile consists of five traits which tap into different features of a written text by a set of descriptors corresponding to different quality levels. The five traits are content, organization, language use, vocabulary and mechanics, and the maximum score for each was 4 points. In the following session, students were taught how to use the profile in assessing their classmates' compositions, and for 3 sessions they practiced assessing their peers' writings.
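
As a minimal sketch of how a rater's sheet adds up under this scheme, the snippet below sums five sub-scores capped at 4 points each, giving a maximum of 20. The function and variable names are hypothetical, and the example sub-scores are invented; only the five traits and the 4-point cap come from the text.

```python
# Trait names and the per-trait maximum follow the description in the text.
TRAITS = ("content", "organization", "vocabulary", "language_use", "mechanics")
MAX_PER_TRAIT = 4


def total_score(sub_scores):
    """Validate one rater's five sub-scores and return their sum (0 to 20)."""
    missing = set(TRAITS) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for trait in TRAITS:
        if not 0 <= sub_scores[trait] <= MAX_PER_TRAIT:
            raise ValueError(f"{trait} must be between 0 and {MAX_PER_TRAIT}")
    return sum(sub_scores[trait] for trait in TRAITS)


# Invented sub-scores for a single composition.
example = {
    "content": 3,
    "organization": 3.5,
    "vocabulary": 2.5,
    "language_use": 3,
    "mechanics": 4,
}
print(total_score(example))  # 16.0
```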

These practice sessions were followed by the actual peer assessment experience. Three general topics were assigned for the next three successive sessions, one at a time. For each meeting, students were required to hand in their composition and five copies of it to be marked by the teacher, two peers and two other raters. As mentioned before, the researcher had already identified the friend and non-friend peers of each student. Accordingly, the names of peers (without mentioning their friendship relation) were read out so that students knew whose papers they had to mark. After using the checklist to score the writing performances, the peer raters signed the papers and handed in the compositions and scoring tables to be recorded by the teacher. These papers were returned to the writers in the following sessions for discussion, during which the teacher and peer corrections were reviewed and the subjects were given feedback on the errors in their writing and on parts which needed revision.

In addition to the peer raters, the teacher (researcher) and two other EFL instructors assessed the writings. The researcher briefed the two raters on how to score the writings. The writing scores then underwent statistical analysis. In order to calculate the inter-rater reliability of the sets of scores given by the three raters, the Pearson product-moment correlation coefficient was used. The coefficient alpha was computed and turned out to be 0.89, 0.82, and 0.90 for the first, second and third writings, respectively.
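
For readers unfamiliar with the statistic, a minimal sketch of the Pearson product-moment correlation between pairs of raters follows. It covers only the pairwise correlation step, not the coefficient alpha that is also reported; the rater labels and scores are invented, and the study itself used SPSS rather than this code.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a rater gives constant scores")
    return sxy / sqrt(sxx * syy)


def pairwise_agreement(ratings):
    """Correlate every pair of raters; `ratings` maps rater name to score list."""
    return {
        (a, b): pearson_r(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }


# Invented total scores (out of 20) for five compositions.
ratings = {
    "teacher": [15, 12, 18, 10, 14],
    "rater_2": [14, 13, 17, 11, 14],
    "rater_3": [16, 12, 18, 9, 13],
}
for pair, r in pairwise_agreement(ratings).items():
    print(pair, round(r, 2))
```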

Finally, at the end of the course, a questionnaire similar to the one completed at the beginning of the semester was administered to investigate any possible change in students' attitudes towards peer assessment.

Results and discussion

In order to answer the first question about the similarity of teacher and peer ratings of students’ English compositions, a paired-samples t-test was applied, once for the peer raters (friends and non-friends) as a whole, then separately for friends and non-friends. For the peers in general, the results, t = .827, P = .416 > .05, indicated that there was no significant difference between the teacher's and the friend and non-friend peers' corrections/ratings, and the mean scores for corrections were quite close to each other (Table 1). Similarly, for the separate groups of peers, the results of the paired-samples t-tests did not reveal any significant differences between the teacher’s corrections and those of each peer group. As displayed in Table 2, for friends' corrections, t = .048 and P = .962 > .05, and for non-friends, t = 1.685 and P = .104 > .05. The descriptive statistics for the three comparisons and the t-test results are presented in Tables 1 and 2, respectively.

Table 1 Descriptive statistics for teacher and peer corrections
Table 2 Paired-samples t-test for teacher and peer corrections
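
To make the test concrete, here is a minimal sketch of how a paired-samples t statistic is computed from two sets of scores on the same compositions. The scores are invented and the function name is hypothetical; the study's own values came from SPSS. If all 26 participants contributed a pair of scores, the test would have 25 degrees of freedom.

```python
from math import sqrt


def paired_t(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired-samples t-test."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(scores_a)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("t is undefined when all differences are identical")
    t = mean_diff / sqrt(var_diff / n)
    return t, n - 1


# Invented teacher and peer totals for four compositions.
teacher = [15, 14, 16, 13]
peer = [14, 14, 15, 11]
t, df = paired_t(teacher, peer)
print(round(t, 3), df)
```

A t value close to zero, as reported for the friends' ratings, means the average teacher-peer difference is small relative to its variability.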

To investigate whether students favored peer assessment, a chi-square analysis was run to compare the students’ attitudes as measured through the pre- and post-questionnaires. The chi-square value of 7.65 (P = .022 < .05) indicates that there were significant differences between students’ attitudes toward peer assessment before and after the study. As displayed in Table 3, the students showed more agreement on the post-questionnaire (52.9%) than on the pre-questionnaire (44.4%).

Table 3 Frequencies and percentages of learners’ attitude toward peer correction
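
The reported p-value can be checked with a short sketch, under the assumption (not stated in the text) that the analysis compared two questionnaires across three response options, which gives (2 - 1) × (3 - 1) = 2 degrees of freedom. For 2 degrees of freedom the chi-square tail probability has the closed form exp(-x / 2). The helper names are hypothetical, and the small table in the usage note is invented purely to show the statistic being computed.

```python
from math import exp


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_p_value_df2(statistic):
    """Upper-tail probability of a chi-square value; valid for 2 df only."""
    return exp(-statistic / 2)


# Check the value reported in the text, assuming 2 degrees of freedom.
print(round(chi_square_p_value_df2(7.65), 3))  # 0.022
```

As a usage example, `chi_square_statistic([[10, 20], [20, 10]])` returns about 6.67 for an invented 2 × 2 table. Under the 2-df assumption, the closed form reproduces the reported P = .022, which suggests the analysis pooled responses into a 2 × 3 table.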

In addition to the above-mentioned findings, Table 4 shows what learners thought about peer assessment and how they found it after experiencing it. It includes the frequencies and percentages of each response to 5 of the questions, which asked whether peer assessment was difficult, useful, interesting, motivating, and boring. The details in this table indicate how learners’ views changed in each case: the expectation that the practice would be difficult was reversed, uncertainty about whether it would be useful, motivating and interesting gave way to certainty that it was, and the practice was found not to be boring at the end of the term.

Table 4 Frequencies and percentages of peer assessment features

Another question in this study concerned the effect of friendship on peer rating. In order to investigate any possible bias, the averages of the scores given by peer friends and non-friends for the 3 writings were first calculated separately. The results of the paired-samples t-test, with a t-value of 1.55 and a p-value of .132 > .05, show that there was no significant difference between the friend and non-friend corrections (Table 5).

Table 5 Descriptive statistics for friend and non-friend corrections

The findings of this study concerning peer and teacher assessment are in line with the studies of Jafarpur (1991), Hughs and Large (1993), Miller and Ng (1996), Topping (1998), Falchikov and Goldfinch (2000), Patri (2002), and Saito and Fujita (2004), who have noted high agreement between teacher and peer assessments, indicating an overall similarity in scoring between peers and teachers. The reason behind this agreement may lie in the use of clear scoring criteria, as well as in the training and practice sessions held prior to the actual peer assessment experience.

Concerning friendship bias, this study revealed no significant difference between the ratings of friend and non-friend peers, whereas Falchikov (1995) and Morahan-Martin (1996) identified such a bias in peer assessment. The probable reason for this difference in findings may lie in the general familiarity and friendship of all the students in the class with one another. Although students named their intimate friends, they did not deny their overall friendship with others who had been their classmates for at least 2 years, so this might have affected their ratings unconsciously. Another point is the possible fear of facing friends in class the following week after giving someone a bad grade (Buchanan, 2004, cited in Roberts, 2006). This problem might be overcome by monitoring and anonymous marking (Alfallay, 2004), which was unfortunately not possible in this study.

The present study also investigated the attitudes of learners towards the use of peer assessment. The change of perception and the positive viewpoints of learners at the end of the course toward the use of peer assessment are similar to the users' acceptance and positive attitudes found in Patri's (2002) and Saito and Fujita's (2004) studies. Although some students expressed discomfort and uneasiness about acting like a teacher and were not sure about the benefits and the degree of difficulty of peer assessment, the post-questionnaire revealed a change in their perception of this practice. Saito and Fujita (2004) cited a number of studies, in some of which learners expressed negative feelings, dissatisfaction and uneasiness with this experience, while in others students considered it useful, preferred it, and found value in it. These mixed feelings are to be expected, since learners usually carry mixed feelings and attitudes toward any type of classroom activity.

Conclusion

This study was a tripartite investigation of peer assessment. First, it focused on the differences between peer and teacher assessment in a teacher-centered foreign language learning context. Next, it examined the presence of any friendship bias; and finally, it evaluated learners’ attitudes toward this practice. The results of this study revealed no significant difference between the learners’ peer assessment and the teachers’ assessment. Moreover, no friendship bias was found in peer assessment. In addition, the practice led to a positive change in students’ attitudes toward peer assessment. While they had expected the practice to be difficult, they found it not to be so; learners became assured that peer assessment was useful, motivating and interesting, and they found it not to be boring.

Making peer assessment an integral part of evaluation procedures not only encourages learners and teachers to regard assessment as a shared responsibility but can also help turn traditional one-way teacher-centered classes into more learner-centered ones. Peer involvement creates opportunities for interaction and may increase objectivity in assessment. Saito (2008) believes peer assessment encourages reflective learning through observing others' performances and becoming aware of performance criteria. In general, peer assessment seems to generate positive reactions in students. Although some students have concerns and worries, it leads to the development of self-awareness, to noticing the gap between one's own and others' perceptions, and to further learning and responsibility for it. In addition, focusing on peers' strengths and weaknesses can enhance students' learning, raise their level of critical thinking, and lead them to autonomy.

The results of this study, although statistically significant, are limited by a number of factors such as the design, the instruments, and the chosen skill. In addition, the fact that the subjects were third-year undergraduate EFL students, their familiarity with one another, their proficiency level, and the impossibility of performing blind assessment could have had limiting effects on the results. Therefore, suggestions for further studies include taking a broader range and more types of participants into account, considering the link between peer assessment and skills other than writing, examining the effects of gender and various cognitive and personality factors on peer assessment, and applying other types of instruments.

Author’s information

Maryam Azarnoosh has a PhD in TEFL and is a faculty member and head of the Department of English at Islamic Azad University-Semnan Branch. She has taught at different universities for over 13 years, has presented papers at international conferences, and has published some in journals. Her research interests include motivation, English language skills, learning strategies, and language teaching, testing and assessment.