Abstract
Scoring reliability of constructed-response items is a key concern in high-stakes testing. Constructed-response items, often used for their authenticity, potentially allow a multitude of acceptable answers that were neither intended nor anticipated, and can therefore be problematic to score reliably. This chapter examines a specially developed marker support system for the Austrian EFL school-leaving exam, which uses such items without centralized marking and therefore risks inconsistent scoring that could affect 40,000 students annually. The study investigates the impact of three scoring guide conditions on test taker results in four constructed-response listening tasks at CEFR B2 level. The first scoring condition (A) is exact scoring based on the scoring guide developed by the item writing team before the task was field-tested. The second scoring condition (B) is based on an extended scoring guide that was improved in a centrally run scoring session after the items were piloted. The third scoring condition (C) is based on a highly comprehensive scoring guide that was further enhanced during the scoring of the national live exam through a marker support system in the form of an online helpdesk and a telephone hotline. The statistical analyses show an overall improvement in the reliability of the test from scoring condition A to scoring condition C. The findings therefore suggest that the practice of improving and refining the scoring guides through the implemented marker support system increases the comparability, reliability, and fairness of test taker scores.
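The abstract refers to comparing test reliability across scoring conditions. As a rough illustration only, the sketch below computes Cronbach's alpha, one common internal-consistency estimate, for dichotomously scored items; the function name and the invented score matrix are hypothetical, and the chapter's actual statistical analyses may use different indices.

```python
# Hypothetical sketch: estimating internal-consistency reliability
# (Cronbach's alpha) for a set of dichotomously scored items.
# The data below are invented for illustration, not from the study.

def cronbach_alpha(scores):
    """scores: list of test takers, each a list of per-item scores (0/1)."""
    k = len(scores[0])  # number of items

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# One could compute alpha once per scoring condition (A, B, C) on the
# same responses rescored under each guide and compare the estimates.
condition_a = [[1, 1, 1], [0, 0, 0], [1, 0, 1], [1, 1, 0]]
print(round(cronbach_alpha(condition_a), 3))
```

A higher alpha under condition C than under condition A would be consistent with the improvement in reliability the abstract reports, though the study's own analyses should be consulted for the indices actually used.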
© 2021 Springer Nature Singapore Pte Ltd.
Leitner, K., Kremmel, B. (2021). Avoiding Scoring Malpractice: Supporting Reliable Scoring of Constructed-Response Items in High-Stakes Exams. In: Lanteigne, B., Coombe, C., Brown, J.D. (eds) Challenges in Language Testing Around the World. Springer, Singapore. https://doi.org/10.1007/978-981-33-4232-3_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-4231-6
Online ISBN: 978-981-33-4232-3