Abstract
Historically speaking, students were judged long before they were marked. The tradition of marking, or scoring, pieces of work that students offer for assessment is little more than two centuries old. It was introduced mainly to cope with specific problems arising from the growth in the numbers graduating from universities as the industrial revolution progressed. This paper describes the principles behind the method of Comparative Judgement, and in particular Adaptive Comparative Judgement, a technique borrowed from psychophysics that is able to generate extremely reliable results for educational assessment. The method is based on the kind of holistic evaluation that we assume was the basis for judgement in pre-marking days, and that the users of assessment results expect our assessment schemes to capture.
Notes
This description is based mostly on the description of the late nineteenth century Qing system by Miyazaki (1963).
References
Adams, R. M. (1995). Analysing the results of cross-moderation studies. Paper presented at a seminar on comparability, held jointly by the SRAC of the GCE boards and the IGRC of the GCSE groups, London, October.
Adams, R. (2007). Cross-moderation methods. In P. Newton, J. Baird, H. Patrick, H. Goldstein, P. Timms, & A. Wood (Eds.), Techniques for monitoring the comparability of examination standards. London: QCA. Available (26/09/2011) at: http://www.ofqual.gov.uk/files/2007-comparability-exam-standards-h-chapter6.pdf.
Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37, 1–15.
Andrich, D. (1978). Relationships between the Thurstone and Rasch approaches to item scaling. Applied Psychological Measurement, 2, 451–462.
Baker, E. L., Ayers, P., O’Neill, H. F., Choi, K., Sawyer, W., Sylvester, R. M., & Carroll, B. (2008). KS3 English test marker study in Australia. Final report to the National Assessment Agency of England. London: QCA.
Bramley, T. (2007). Paired comparison methods. In P. Newton, J. Baird, H. Patrick, H. Goldstein, P. Timms, & A. Wood (Eds.), Techniques for monitoring the comparability of examination standards. London: QCA. Available (26/09/2011) at: http://www.ofqual.gov.uk/files/2007-comparability-exam-standards-i-chapter7.pdf.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley.
D’Arcy, J. (Ed.). (1997). Comparability studies between modular and non-modular syllabuses in GCE Advanced level biology, English literature and mathematics in the 1996 summer examinations. Standing Committee on Research on behalf of the Joint Forum for the GCSE and GCE.
Haley, C., & Wothers, P. (2005). In M. D. Archer, & C. D. Haley (Eds.), The 1702 chair of chemistry at Cambridge. Cambridge: CUP.
Kelly, G. A. (1955). The psychology of personal constructs (Vol. I and II). New York: Norton.
Kimbell, R., Wheeler, T., Stables, K., Sheppard, T., Martin, F., Davies, D., Pollitt, A., & Whitehouse, G. (2009). e-scape portfolio assessment: phase 3 report. London: Technology Education Research Unit, Goldsmiths, UL. http://www.gold.ac.uk/teru/projectinfo/projecttitle,5882,en.php.
Linacre, J. M. (1994). Many-facet Rasch measurement (2nd ed.). Chicago: MESA Press. http://www.rasch.org/books.htm.
Miyazaki, I. (1963). China’s examination hell: The civil service examinations of imperial China (C. Schirokauer, Trans., 1976). New York: Weatherhill.
Pollitt, A., & Elliott, G. (2003). Monitoring and investigating comparability: A proper role for human judgement. Invited paper, QCA comparability seminar, Newport Pagnell. London: Qualifications and Curriculum Authority. Available at: http://www.camexam.co.uk/.
Pollitt, A., & Murray, N. L. (1993). What raters really pay attention to. Paper presented at the Language Testing Research Colloquium, Cambridge (Reprinted from M. Milanovic, & N. Saville (Eds.), 1996, Studies in language testing 3: Performance testing, cognition and assessment. Cambridge: Cambridge University Press).
QCDA. (2011). Importance of design and technology key stage 3. http://www.education.gov.uk/schools/teachingandlearning/curriculum/secondary/b00199489/dt/programme. Accessed: 9 Dec 2011.
Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. Reprinted as 2nd ed., 1980, Chicago: University of Chicago Press.
Shavelson, R., & Webb, N. (2000). Generalizability theory. In J. L. Green, G. Camilli, & P. B. Elmore (Eds.), Handbook of complementary methods in education research, Chapter 18. London: Lawrence Erlbaum Associates.
Stray, C. (2001). The shift from oral to written examinations: Cambridge and Oxford 1700–1900. Assessment in Education, 8, 33–50.
Thurstone, L. L. (1927a). Psychophysical analysis. The American Journal of Psychology, 38, 368–389.
Thurstone, L. L. (1927b). A law of comparative judgment. Psychological Review, 34, 273–286 (Reprinted as Chapter 3 from Thurstone, L. L. (1959). The measurement of values. Chicago, IL: University of Chicago Press).
Wainer, H. (Ed.). (2000). Computerized adaptive testing: A primer (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Watson, R. (1818). Anecdotes of the life of Richard Watson … written by himself at different intervals, and revised in 1814. Published by his son, Richard Watson, L. L. B., prebendary of Landaff and Wells. London: T. Cadell and W. Davies.
Weiss, D. J., & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21, 361–375.
Wordsworth, C. (1877). Scholae academicae. London: Frank Cass.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press. http://www.rasch.org/books.htm.
Cite this article
Pollitt, A. Comparative judgement for assessment. Int J Technol Des Educ 22, 157–170 (2012). https://doi.org/10.1007/s10798-011-9189-x