Extending participation in standard setting: an online judging proposal
For standard setting to retain public confidence, it is argued, two requirements must be met. First, the judges’ allocation of students to performance bands should yield results broadly consistent with the expectations of the wider educational community. Second, in the absence of any change in educational performance, the percentages in the corresponding bands should be stable over time. It is argued that the use of a small team of judges makes these conditions harder to satisfy. However, the cost and logistics of organizing a larger number of judges in the time-pressured atmosphere of public examining can lead to sub-optimal standard setting. Two parallel systems of awarding performance bands are empirically compared: one based on teams of six judges, the other on a population of teachers. The latter system is shown to give more stable results over time for the same large student population. A proposal is outlined for extending participation in standard setting through the web-based presentation of materials and the capture of cutscores from a population of teachers.
Keywords: Standard setting · Cutscores · Angoff method · Bookmark method · Standard error · Online judging
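The stability argument sketched in the abstract rests on a familiar statistical fact: if each judge's recommended cutscore is the "true" cutscore plus independent judge error, the standard error of the panel mean shrinks roughly as 1/√n. The simulation below is a minimal illustrative sketch, not the paper's method or data; the cutscore of 60, judge standard deviation of 4, and panel sizes are assumed values chosen only to show why a six-judge team produces more year-to-year drift than a large teacher population.

```python
import random
import statistics

def panel_cutscore(n_judges, true_cut=60.0, judge_sd=4.0, rng=None):
    """Mean cutscore recommended by a panel of n_judges.

    Each judge's recommendation is modelled (an assumption, for
    illustration only) as the true cutscore plus Gaussian error.
    """
    rng = rng or random.Random()
    return statistics.mean(
        rng.gauss(true_cut, judge_sd) for _ in range(n_judges)
    )

def cutscore_sd_across_occasions(n_judges, n_occasions=2000, seed=1):
    """Spread of the awarded cutscore across repeated standard settings."""
    rng = random.Random(seed)
    cuts = [panel_cutscore(n_judges, rng=rng) for _ in range(n_occasions)]
    return statistics.stdev(cuts)

sd_small = cutscore_sd_across_occasions(6)     # small judging team
sd_large = cutscore_sd_across_occasions(600)   # population of teachers
print(f"SD of cutscore, 6 judges:   {sd_small:.2f}")
print(f"SD of cutscore, 600 judges: {sd_large:.2f}")
```

Under these assumptions the six-judge cutscore wanders by several marks from occasion to occasion, while the population-based cutscore barely moves, which is the behaviour the paper reports for the teacher-population system.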