Advances in Health Sciences Education, Volume 16, Issue 2, pp 211–221

Pick-N multiple choice-exams: a comparison of scoring algorithms

  • Daniel Bauer
  • Matthias Holzer
  • Veronika Kopp
  • Martin R. Fischer


This study compares different scoring algorithms for Pick-N multiple-correct-answer multiple-choice (MC) exams with regard to test reliability, student performance, total item discrimination and item difficulty. Data from six end-of-term exams in internal medicine, taken by 3rd-year medical students at Munich University from 2005 to 2008, were analysed (1,255 students, 180 Pick-N items in total).

Scoring algorithms: Each question scored a maximum of one point. We compared:

(a) Dichotomous scoring (DS): One point if all true and no wrong answers were chosen.
(b) Partial credit algorithm 1 (PS50): One point for 100% true answers; 0.5 points for 50% or more true answers; zero points for less than 50% true answers. No point deduction for wrong choices.
(c) Partial credit algorithm 2 (PS1/m): A fraction of one point depending on the total number of true answers was given for each correct answer identified. No point deduction for wrong choices.

Application of partial crediting resulted in psychometric results superior to dichotomous scoring (DS). The two partial credit algorithms yielded similar psychometric data, with PS50 showing only slightly higher coefficients of reliability than PS1/m. The Pick-N MC format and its scoring using the PS50 and PS1/m algorithms are suited for undergraduate medical examinations. Partial knowledge should be rewarded in Pick-N MC exams.
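
The three scoring rules described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the representation of answers as sets of option labels are hypothetical, and PS1/m is read here as awarding 1/m of a point per true answer identified, where m is the number of true answers for the item. The sketch also assumes that the exam format itself limits how many options a student may mark; the code does not enforce this.

```python
def score_ds(chosen, key):
    """Dichotomous scoring: 1 point only if exactly the true answers were chosen."""
    return 1.0 if set(chosen) == set(key) else 0.0


def score_ps50(chosen, key):
    """PS50: 1 point for all true answers, 0.5 for at least half, else 0.

    Wrong choices are not penalised (assumed reading of the abstract).
    """
    key = set(key)
    hits = len(set(chosen) & key)   # true answers the student identified
    m = len(key)                    # total number of true answers
    if hits == m:
        return 1.0
    if 2 * hits >= m:               # integer test for "50% or more"
        return 0.5
    return 0.0


def score_ps1m(chosen, key):
    """PS1/m: 1/m of a point per true answer identified, no deduction."""
    key = set(key)
    hits = len(set(chosen) & key)
    return hits / len(key)


# Hypothetical Pick-3 item: options A, C and E are true.
key = {"A", "C", "E"}
chosen = {"A", "C", "F"}            # two of three true answers, one wrong

print(score_ds(chosen, key))        # 0.0
print(score_ps50(chosen, key))      # 0.5
print(round(score_ps1m(chosen, key), 2))  # 0.67
```

In this example the student receives no credit under DS but partial credit under both PS50 and PS1/m, which is the difference in treatment of partial knowledge that the study examines.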


Keywords: Undergraduate assessment; Multiple choice; Scoring; Test psychometrics; Pick-N; Undergraduate medical education



The authors thank René Krebs of Berne University, Switzerland, and Andreas Möltner of the University of Heidelberg, Germany, for their helpful suggestions.



Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  • Daniel Bauer (1), corresponding author
  • Matthias Holzer (2)
  • Veronika Kopp (1)
  • Martin R. Fischer (1, 2)

  1. Faculty of Health, Institute for Teaching and Educational Research in Health Sciences, Witten/Herdecke University, Witten, Germany
  2. Medizinische Klinik – Innenstadt, Medical Education Unit, Munich University Hospital, Munich, Germany
