Comparative judgement for assessment

Pollitt, Alastair

doi:10.1007/s10798-011-9189-x

Abstract

Historically speaking, students were judged long before they were marked. The tradition of marking, or scoring, pieces of work students offer for assessment is little more than two centuries old, and was introduced mainly to cope with specific problems arising from the growth in the numbers graduating from universities as the industrial revolution progressed. This paper describes the principles behind the method of Comparative Judgement, and in particular Adaptive Comparative Judgement, a technique borrowed from psychophysics which is able to generate extremely reliable results for educational assessment, and which is based on the kind of holistic evaluation that we assume was the basis for judgement in pre-marking days, and that the users of assessment results expect our assessment schemes to capture.

Notes

This description is based mostly on Miyazaki's (1963) account of the late nineteenth-century Qing system.

References

Adams, R. M. (1995). Analysing the results of cross-moderation studies. Paper presented at a seminar on comparability, held jointly by the SRAC of the GCE boards and the IGRC of the GCSE groups, London, October.
Adams, R. (2007). Cross-moderation methods. In P. Newton, J. Baird, H. Patrick, H. Goldstein, P. Timms, & A. Wood (Eds.), Techniques for monitoring the comparability of examination standards. London, QCA. Available (26/09/2011) at: http://www.ofqual.gov.uk/files/2007-comparability-exam-standards-h-chapter6.pdf.
Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37, 1–15.
Andrich, D. (1978). Relationships between the Thurstone and Rasch approaches to item scaling. Applied Psychological Measurement, 2, 451–462.
Baker, E. L., Ayers, P., O’Neill, H. F., Choi, K., Sawyer, W., Sylvester, R. M., & Carroll, B. (2008). KS3 English test marker study in Australia. Final report to the National Assessment Agency of England, London, QCA.
Bramley, T. (2007). Paired comparison methods. In P. Newton, J. Baird, H. Patrick, H. Goldstein, P. Timms, & A. Wood (Eds.), Techniques for monitoring the comparability of examination standards. London, QCA. Available (26/09/2011) at: http://www.ofqual.gov.uk/files/2007-comparability-exam-standards-i-chapter7.pdf.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley Lawrence Erlbaum Associates.
D’Arcy, J. (Ed.). (1997). Comparability studies between modular and non-modular syllabuses in GCE Advanced level biology, English literature and mathematics in the 1996 summer examinations. Standing Committee on Research on behalf of the Joint Forum for the GCSE and GCE.
Haley, C., & Wothers, P. (2005). In M. D. Archer, & C. D. Haley (Eds.), The 1702 chair of chemistry at Cambridge. Cambridge: CUP.
Kelly, G. A. (1955). The psychology of personal constructs (Vol. I and II). New York: Norton.
Kimbell, R., Wheeler, T., Stables, K., Sheppard, T., Martin, F., Davies, D., Pollitt, A., & Whitehouse, G. (2009). e-scape portfolio assessment: phase 3 report. London: Technology Education Research Unit, Goldsmiths, UL. http://www.gold.ac.uk/teru/projectinfo/projecttitle,5882,en.php.
Linacre, J. M. (1994). Many-facet Rasch measurement (2nd ed.). Chicago: MESA Press. http://www.rasch.org/books.htm.
Miyazaki, I. (1963). China’s examination hell: the civil service examinations of imperial China (C. Schirokauer (1976), Trans). New York: Weatherhill.
Pollitt, A., & Elliott, G. (2003). Monitoring and investigating comparability: A proper role for human judgement. Invited paper, QCA comparability seminar, Newport Pagnell. Qualifications and Curriculum Authority, London. Available at: http://www.camexam.co.uk/.
Pollitt, A., & Murray, N. L. (1993). What raters really pay attention to. Language Testing Research Colloquium, Cambridge (Reprinted from M. Milanovic, & N. Saville (Eds.), 1996, Studies in language testing 3: Performance testing, cognition and assessment. Cambridge: Cambridge University Press).
QCDA. (2011). Importance of design and technology key stage 3. http://www.education.gov.uk/schools/teachingandlearning/curriculum/secondary/b00199489/dt/programme. Accessed: 9 Dec 2011.
Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. Reprinted as 2nd ed., 1980, Chicago: University of Chicago Press.
Shavelson, R., & Webb, N. (2000). Generalizability theory. In J. L. Green, G. Camilli, & P. B. Elmore (Eds.), Handbook of complementary methods in education research, Chapter 18. London: Lawrence Erlbaum Associates.
Stray, C. (2001). The shift from oral to written examinations: Cambridge and Oxford 1700–1900. Assessment in Education, 8, 33–50.
Thurstone, L. L. (1927a). Psychophysical analysis. The American Journal of Psychology, 38, 368–389.
Thurstone, L. L. (1927b). A law of comparative judgment. Psychological Review, 34, 273–286 (Reprinted as Chapter 3 from Thurstone, L. L. (1959). The measurement of values. Chicago, IL: University of Chicago Press).
Wainer, H. (Ed.). (2000). Computerized adaptive testing: A primer (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Watson, R. (1818). Anecdotes of the life of Richard Watson … written by himself at different intervals, and revised in 1814. Published by his son, Richard Watson, L. L. B., prebendary of Landaff and Wells. London: T. Cadell and W. Davies.
Weiss, D. J., & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21, 361–375.
Wordsworth, C. (1877). Scholae academicae. London: Frank Cass.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press. http://www.rasch.org/books.htm.

Author information

Authors and Affiliations

Cambridge Exam Research, Cambridge, UK
Alastair Pollitt

Corresponding author

Correspondence to Alastair Pollitt.

About this article

Cite this article

Pollitt, A. Comparative judgement for assessment. Int J Technol Des Educ 22, 157–170 (2012). https://doi.org/10.1007/s10798-011-9189-x

Published: 14 December 2011
Issue Date: May 2012
DOI: https://doi.org/10.1007/s10798-011-9189-x
