Abstract
Tests of English oral proficiency play an important role in high-stakes national examinations in which large numbers of teachers are involved as examiners. Although the literature shows that the reliability of oral assessments is often threatened by rater variability, the role of teacher beliefs in teacher-rater judgements has to date received little attention. This exploratory qualitative study, conducted in Singapore, identified teachers’ beliefs about the construct of oral proficiency for their assessment of secondary school candidates and examined the extent to which these beliefs were enacted in real-time assessment. Seven experienced national-level examiners participated in the study. They listened to audio-recordings of four students performing an oral interview (conversation) task in a simulated examination and assessed each student’s performance individually. Data on the teachers’ thinking, which revealed their underlying beliefs while assessing, were elicited through Concurrent Verbal Protocol (CVP) sessions. In addition, a questionnaire was administered a month later to elicit their explicit beliefs. Findings showed that the teachers held a range of beliefs about the construct of oral proficiency, but only some of these formed the core of the criteria they expressed when assessing student performance in real time. Implications for oral assessments and further research are discussed.
Appendix: Teacher Beliefs About Oral Proficiency (TeBOP)
Instruction
Use the following scale (numbers 1 through 4) to describe what you think about each of the statements below. For each statement, circle the number that gives the best description of what you believe to be true.
I consider the features listed below to be important when I assess a candidate’s oral proficiency in oral interview/conversation tasks.
Phonology
1. Stress | 1 | 2 | 3 | 4 |
2. Rhythm | 1 | 2 | 3 | 4 |
3. Intonation | 1 | 2 | 3 | 4 |
4. Pronunciation | 1 | 2 | 3 | 4 |
Language
5. Grammar | 1 | 2 | 3 | 4 |
6. Vocabulary | 1 | 2 | 3 | 4 |
7. Use of standard English | 1 | 2 | 3 | 4 |
8. Uses a range of sentence structures correctly | 1 | 2 | 3 | 4 |
Fluency
9. Hesitation | 1 | 2 | 3 | 4 |
10. Repetition | 1 | 2 | 3 | 4 |
11. Restructuring sentences | 1 | 2 | 3 | 4 |
12. Reselecting vocabulary | 1 | 2 | 3 | 4 |
Communication strategies
13. Achievement strategies (paraphrase, circumlocution, etc.) | 1 | 2 | 3 | 4 |
14. Interaction strategies (asking for clarification, repetition, etc.) | 1 | 2 | 3 | 4 |
15. Avoidance strategy (avoiding unfamiliar topics) | 1 | 2 | 3 | 4 |
Topical knowledge
16. Has interesting ideas | 1 | 2 | 3 | 4 |
17. Elaborates ideas | 1 | 2 | 3 | 4 |
18. Expresses ideas clearly | 1 | 2 | 3 | 4 |
19. Gives a relevant personal response | 1 | 2 | 3 | 4 |
20. Displays maturity in ideas | 1 | 2 | 3 | 4 |
21. Displays breadth of knowledge | 1 | 2 | 3 | 4 |
22. Displays depth of knowledge | 1 | 2 | 3 | 4 |
23. Uses a range of relevant vocabulary | 1 | 2 | 3 | 4 |
Discourse
24. Expresses ideas cohesively and coherently | 1 | 2 | 3 | 4 |
25. Initiates discussion/conversation with the examiner | 1 | 2 | 3 | 4 |
26. Concludes discussion/conversation | 1 | 2 | 3 | 4 |
Personal characteristics
27. Interacts easily with the examiner | 1 | 2 | 3 | 4 |
28. Enthusiastic about what he/she says | 1 | 2 | 3 | 4 |
29. Responds enthusiastically to prompts | 1 | 2 | 3 | 4 |
30. Shows effort | 1 | 2 | 3 | 4 |
31. Good grooming | 1 | 2 | 3 | 4 |
32. Confident | 1 | 2 | 3 | 4 |
33. A pleasant voice | 1 | 2 | 3 | 4 |
If you can, please explain the reasons for your choices in each of the categories:
Phonology
Accuracy
Fluency
Communication strategies
Topical knowledge
Discourse management
Personal characteristics
One or two other features, if any, you would like to add.
34. | 1 | 2 | 3 | 4 |
35. | 1 | 2 | 3 | 4 |
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this chapter
Goh, C.C.M., Ang-Aw, H.T. (2018). Teacher-Examiners’ Explicit and Enacted Beliefs About Proficiency Indicators in National Oral Assessments. In: Xerri, D., Vella Briffa, P. (eds) Teacher Involvement in High-Stakes Language Testing. Springer, Cham. https://doi.org/10.1007/978-3-319-77177-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77175-5
Online ISBN: 978-3-319-77177-9