To trust or not to trust?—teacher marking versus external marking of national tests

Educational Assessment, Evaluation and Accountability

Abstract

In the Swedish educational system, teachers have the dual responsibility of assigning final grades and marking their own students’ national tests. The Government has mandated the Swedish Schools Inspectorate to re-mark samples of the national tests to see if teacher marking can be trusted. Reports from this project have concluded that inter-marker consistency is low and that teachers’ markings are generous compared with those of the external markers. These findings have been heavily publicized, leading to distrust in teachers’ assessments. In this article, we analyze and discuss the re-marking studies from methodological as well as substantive angles. We conclude that the design applied in the reanalysis does not allow inferences about bias in marking across schools or teachers. We also conclude that there are several alternative explanations for the observation that teacher marks are higher than the external marks: the external markers did not form a representative sample, they read copies with sometimes marginal legibility, and they used a different scale for marking than the teachers had used. The results are thus not as clear-cut as suggested by the reports and media releases, because a school-inspection logic rather than a research logic was applied in designing, conducting, and reporting the studies.

Corresponding author

Correspondence to Jan-Eric Gustafsson.

About this article

Cite this article

Gustafsson, JE., Erickson, G. To trust or not to trust?—teacher marking versus external marking of national tests. Educ Asse Eval Acc 25, 69–87 (2013). https://doi.org/10.1007/s11092-013-9158-x

  • DOI: https://doi.org/10.1007/s11092-013-9158-x
