Avoiding Scoring Malpractice: Supporting Reliable Scoring of Constructed-Response Items in High-Stakes Exams

Abstract

Scoring reliability of constructed-response items is a key concern in high-stakes testing. Constructed-response items, often used for their authenticity, potentially allow for a multitude of acceptable answers that were neither intended nor anticipated, and can therefore be problematic for reliable scoring. This chapter examines the use of a specially developed marker support system for the Austrian EFL school-leaving exam, which uses such items without centralized marking and therefore potentially suffers from inconsistent scoring that could affect 40,000 students annually. The study investigates the impact of three different scoring guide conditions on test taker results in four constructed-response tasks for listening at CEFR B2 level. The first scoring condition (A) is exact scoring based on the scoring guide developed by the item writing team before the tasks were field-tested. The second scoring condition (B) is based on an extended scoring guide that was improved in a centrally run scoring session after piloting the items. The third scoring condition (C) is based on the highly comprehensive scoring guide that was enhanced during the scoring of the national live exam through a marker support system in the form of an online helpdesk and a telephone hotline. The statistical analyses show an overall improvement in the reliability of the test from scoring condition A to scoring condition C. Consequently, the findings of the study suggest that the practice of improving and refining the scoring guides through the implemented marker support system increases the comparability, reliability, and fairness of test taker scores.

Author information

Corresponding author

Correspondence to Kristina Leitner.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Leitner, K., Kremmel, B. (2021). Avoiding Scoring Malpractice: Supporting Reliable Scoring of Constructed-Response Items in High-Stakes Exams. In: Lanteigne, B., Coombe, C., Brown, J.D. (eds) Challenges in Language Testing Around the World. Springer, Singapore. https://doi.org/10.1007/978-981-33-4232-3_10

  • DOI: https://doi.org/10.1007/978-981-33-4232-3_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-33-4231-6

  • Online ISBN: 978-981-33-4232-3

  • eBook Packages: Education, Education (R0)
