
Teacher-Examiners’ Explicit and Enacted Beliefs About Proficiency Indicators in National Oral Assessments

Chapter in Teacher Involvement in High-Stakes Language Testing

Abstract

The testing of English oral proficiency is an important part of high-stakes national examinations, in which large numbers of teachers are involved as examiners. Although the literature shows that the reliability of oral assessments is often threatened by rater variability, the role of teacher beliefs in teacher-rater judgements has to date received little attention. This exploratory qualitative study, conducted in Singapore, identified teachers’ beliefs about the construct of oral proficiency in their assessment of secondary school candidates and examined the extent to which these beliefs were enacted in real-time assessment. Seven experienced national-level examiners participated in the study. They listened to audio-recordings of four students performing an oral interview (conversation) task in a simulated examination and assessed each performance individually. Data on the teachers’ thinking, which revealed their underlying beliefs while assessing, were elicited through concurrent verbal protocol (CVP) sessions. In addition, a questionnaire was administered a month later to elicit their explicit beliefs. Findings showed that the teachers held a range of beliefs about the construct of oral proficiency, but only some of these formed the core of the criteria they expressed when assessing student performance in real time. Implications for oral assessments and further research are discussed.



Author information

Correspondence to Christine C. M. Goh.

Appendix: Teacher Beliefs About Oral Proficiency (TeBOP)

Instruction

Use the following scale (numbers 1 through 4) to describe what you think about each of the statements below. For each statement, circle the number that gives the best description of what you believe to be true.

[Figure a: descriptors for the rating scale 1 to 4]

I consider the features listed below to be important when I assess a candidate’s oral proficiency in oral interview/conversation tasks.

(Each feature below is rated on the scale 1 to 4.)

Phonology

1. Stress
2. Rhythm
3. Intonation
4. Pronunciation

Language

5. Grammar
6. Vocabulary
7. Use of standard English
8. Uses a range of sentence structures correctly

Fluency

9. Hesitation
10. Repetition
11. Restructuring sentences
12. Reselecting vocabulary

Communication strategies

13. Achievement strategies (paraphrase, circumlocution, etc.)
14. Interaction strategies (clarification, asking for repetition, etc.)
15. Avoidance strategy (avoiding unfamiliar topics)

Topical knowledge

16. Has interesting ideas
17. Elaborates ideas
18. Expresses ideas clearly
19. Gives a relevant personal response
20. Displays maturity in ideas
21. Displays breadth of knowledge
22. Displays depth of knowledge
23. Uses a range of relevant vocabulary

Discourse

24. Expresses ideas cohesively and coherently
25. Initiates discussion/conversation with the examiner
26. Concludes discussion/conversation

Personal characteristics

27. Interacts easily with the examiner
28. Enthusiastic about what he/she says
29. Responds enthusiastically to prompts
30. Shows effort
31. Good grooming
32. Confident
33. A pleasant voice

If you can, please explain the reasons for your choices in each of the categories:

Phonology

Accuracy

Fluency

Communication strategies

Topical knowledge

Discourse management

Personal characteristics

One or two other features, if any, you would like to add.

34. ______________________
35. ______________________


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter


Cite this chapter

Goh, C.C.M., Ang-Aw, H.T. (2018). Teacher-Examiners’ Explicit and Enacted Beliefs About Proficiency Indicators in National Oral Assessments. In: Xerri, D., Vella Briffa, P. (eds) Teacher Involvement in High-Stakes Language Testing. Springer, Cham. https://doi.org/10.1007/978-3-319-77177-9_11


  • DOI: https://doi.org/10.1007/978-3-319-77177-9_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77175-5

  • Online ISBN: 978-3-319-77177-9

