Abstract
Medical schools strive to administer high-quality exams. These exams are often authored and graded by basic science and clinical faculty with content expertise; however, these faculty may lack expertise in question writing as well as familiarity with the psychometrics involved in item analysis. This short communication reviews the background on multiple-choice questions, proposes how to interpret an item analysis report, and makes recommendations for evidence-based grading decisions. The guidelines described here have helped ensure that faculty at one Midwestern medical school appropriately address psychometrically flawed items and make defensible grading decisions.
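The item analysis reports discussed above typically center on two classical-test-theory statistics per question: difficulty (the proportion of examinees answering correctly) and discrimination (how well the item separates high from low scorers). As a minimal illustrative sketch, not taken from the article itself, the following Python computes both from a 0/1 scored response matrix, using the point-biserial correlation for discrimination; the function name and input format are assumptions for illustration.

```python
# Illustrative sketch: classical item-analysis statistics from a matrix
# of scored responses (one row per examinee, one 0/1 entry per item).
# Uses the uncorrected point-biserial (item score included in the total);
# commercial tools often report a corrected version as well.

def item_analysis(responses):
    """Return a list of (difficulty, point_biserial) tuples, one per item."""
    n = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]          # each examinee's total score
    mean_total = sum(totals) / n
    sd_total = (sum((t - mean_total) ** 2 for t in totals) / n) ** 0.5

    stats = []
    for i in range(n_items):
        scores = [row[i] for row in responses]
        p = sum(scores) / n                           # difficulty index
        if sd_total == 0 or p in (0.0, 1.0):
            rpb = 0.0                                 # undefined cases: no variance
        else:
            # mean total score among examinees who answered this item correctly
            mean_correct = sum(t for t, s in zip(totals, scores) if s) / sum(scores)
            # point-biserial correlation of item score with total score
            rpb = (mean_correct - mean_total) / sd_total * (p / (1 - p)) ** 0.5
        stats.append((round(p, 3), round(rpb, 3)))
    return stats
```

A difficulty near 0.5 and a clearly positive point-biserial are the conventional marks of a well-functioning item; a near-zero or negative discrimination is the kind of psychometric flag that should prompt the review and grading decisions the article describes.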
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Bibler Zaidi, N.L., Grob, K.L., Monrad, S.U., et al. Item Quality Improvement: What Determines a Good Question? Guidelines for Interpreting Item Analysis Reports. Med Sci Educ. 2018;28:13–17. https://doi.org/10.1007/s40670-017-0506-1