Abstract
The practice of medicine depends on qualified doctors who strive to achieve and maintain appropriate knowledge, skills, and attitudes. Competent doctors are the need of the hour, and hence tests of clinical competence, which allow a decision to be made about whether or not a doctor is fit to practice, are in demand. This demand poses a challenge for all involved in medical education. Therefore, the assessment and evaluation of medical trainees play an important role in choosing good doctors.
Examinations are formidable even to the best prepared, for the greatest fool may ask more than the wisest man can answer.
—Charles Caleb Colton, English, author, philosopher and eccentric (1780–1832)
1 Why Must We Develop Assessment Methods?
The practice of medicine depends on qualified doctors who strive to achieve and maintain appropriate knowledge, skills, and attitudes. Competent doctors are the need of the hour, and hence tests of clinical competence, which allow a decision to be made about whether or not a doctor is fit to practice, are in demand. This demand poses a challenge for all involved in medical education. Therefore, the assessment and evaluation of medical trainees play important roles in choosing good doctors.
Assessment promotes learning; to do so, it needs to be educational and formative as well as summative. Students learn from the assessment process and receive feedback on which they build their knowledge and skills. Wass et al. pragmatically describe assessment as the engine on which the curriculum is harnessed. They argue that assessment should not only aim at certification and exclusion but also influence the learning process [1].
Neufeld and Norman have listed key measurement issues that should be addressed when designing assessments of clinical competencies [2].
![figure a](http://media.springernature.com/lw685/springer-static/image/chp%3A10.1007%2F978-981-16-5248-6_41/MediaObjects/500844_1_En_41_Figa_HTML.png)
To assess skill, knowledge, and attitude in medicine, a combination of assessment techniques is always required. Selecting an assessment technique depends not only on measuring students’ performance but also on issues like cost, suitability, and safety. These factors account for much of the inter-institutional variation in the choice and success of assessment methods [3].
2 Are Objective Assessment Methods Reliable and Valid?
Reliability measures the reproducibility or consistency of a test. It is affected by examiner judgements, the types of cases, candidates’ nervousness, and test conditions. Two important aspects of reliability are inter-rater reliability and inter-case (candidate) reliability. Inter-rater reliability measures how consistently different examiners rate candidates’ performance; the use of multiple examiners improves it [4]. Inter-case reliability, which measures the consistency of candidate performance across different cases, is the most important aspect of testing clinical competence. Multiple sampling across many cases improves inter-case reliability compared with assessing candidates on a single case (Fig. 41.1). Clinical skill testing has therefore moved to the multi-case format, with increasing use of assessment techniques like the objective structured clinical examination (OSCE). The OSCE consists of multiple tasks in multiple stations with sufficient testing time, which helps achieve adequate inter-case reliability. Test length also plays a critical role in determining reliability [5].
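The inter-rater agreement discussed above can be quantified. As a minimal illustrative sketch (the chapter does not prescribe a particular statistic, and the ratings below are invented), Cohen’s kappa gives a chance-corrected measure of agreement between two examiners rating the same candidates:

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over paired ratings."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of candidates rated identically
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if both raters assigned categories independently
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical pass/fail judgements by two examiners on six candidates
a = ["pass", "pass", "fail", "pass", "fail", "pass"]
b = ["pass", "fail", "fail", "pass", "fail", "pass"]
print(round(cohens_kappa(a, b), 2))  # 0.67
```

A kappa near 1 indicates strong agreement; pooling ratings from multiple examiners across multiple stations is one way the designs cited above raise reliability.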
Validity determines whether a test actually succeeds in testing the competencies it is designed to test. In the absence of a single valid measure of clinical competence, Miller introduced the concept of a pyramid of competence (Fig. 41.2), a conceptual model outlining the issues involved in analyzing validity.
The pyramid covers all the facets essential for clinical competence. Its base represents the knowledge components of competence: ‘knows’ (basic facts) followed by ‘knows how’ (applied knowledge). These are easily assessed by written tests of clinical knowledge such as multiple-choice questions. Assessing the competency of a qualifying doctor requires evaluation of a more important facet, ‘shows how’, which concerns behavioural function and involves hands-on demonstration. The ultimate valid assessment of clinical competence is to test a doctor’s actual performance, which involves assessing the summit of the pyramid using modalities like the OSCE [6].
3 Is Traditional Assessment Inferior to Objective Assessment?
Medical education facilitates learning and encourages acquiring factual knowledge, improving professional skills, and developing skills of application such as critical reflection, problem solving, and reasoning. Until recently, the assessment of medical students depended on traditional methods like essay-type questions and long cases/viva voces, which typically required students to memorize large amounts of content without needing to apply it.
Unfortunately, what and how a student learns depends on how he/she thinks it will be assessed. Traditional assessment methods lead a student to memorize and reproduce factual information in order to get a good grade, and much of this information is forgotten within a week. They also rely on examiners with differing teaching experience, which increases subjectivity and reduces the reliability of the examination [7].
The merits and demerits of traditional assessment methods can be summarized as follows [8]:
Merits
-
Global judgement of the skills of the student.
-
No compartmentalization of the clinical skills to be judged.
-
Less time consuming.
-
Less effort in organization and conduction of the examination.
-
More interaction between examiners and examinees.
Demerits
-
Biased system hence less valid and reliable.
-
Lacks the structure and uniformity to be used as an assessment tool.
-
Affective skills like communication, history taking are not judged.
-
Requires experienced faculty for the judgement of student’s performance.
These limitations have led to a search for an objective, structured, and unbiased assessment tool that is reliable and valid. Objective assessment methods like multiple-choice questions (MCQs), the objective structured clinical examination (OSCE), and the objective structured practical examination (OSPE) have helped address these issues and have now largely replaced traditional assessment methods.
4 Multiple-Choice Questions (MCQs): What, Why, and When?
Multiple-choice questions (MCQs) have become the most widely applicable, useful, and accepted type of objective assessment. They help assess all the important facets of educational outcomes: knowledge, understanding, judgement, and problem solving. Introduced into medical education in the 1950s, MCQs are now well established as a reliable examination tool for assessing both undergraduate and postgraduate students. The MCQ has become synonymous with objective evaluation and consists of questions for which there is prior agreement on what constitutes the correct answer [9].
MCQs are reliable and easy to score. They also help in wide sampling of knowledge in a limited time. Through a short and time-efficient examination, the length and breadth of any topic are assessed. The beauty of this assessment method is that, apart from the recall of isolated facts, it also helps to assess taxonomically higher-order cognitive processing such as interpretation, synthesis, and application of knowledge [10]. Apart from reliability MCQs are also discriminatory, reproducible, and cost effective. There is a general consensus that rather than using MCQs as a sole method of examination it can be used alongside other evaluation methods to broaden the range of skills to be assessed in medical education [11].
Even though considerable effort goes into framing MCQs, their high objectivity allows the immediate release of results, as they can be marked by any person or machine. They also allow easy collection and analysis of raw data and comparison with past performances. Another advantage is their ability to assess a large number of candidates easily, making use of computers [12, 13].
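Machine marking of MCQs amounts to comparing each answer sheet against the agreed key. A minimal sketch (the question numbers and options here are made up for illustration):

```python
# Agreed key: question number -> keyed option (hypothetical example)
answer_key = {1: "C", 2: "A", 3: "B"}

def mark(sheet, key):
    """Count how many of a candidate's responses match the key."""
    return sum(sheet.get(q) == ans for q, ans in key.items())

# One candidate's responses: question 2 is wrong, the rest correct
print(mark({1: "C", 2: "D", 3: "B"}, answer_key))  # 2
```

Because the key is fixed in advance, the same routine marks every sheet identically, which is the source of the objectivity and the immediate results described above.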
Medical teachers are often faced with the challenging task of constructing good MCQs that test higher-order thinking skills. Unfortunately, they often have little or no experience or training in constructing MCQs. Preparing a good MCQ is difficult and time consuming. Most institutes now emphasize on faculty development programmes that concentrate on MCQ construction and implementation.
![figure b](http://media.springernature.com/lw685/springer-static/image/chp%3A10.1007%2F978-981-16-5248-6_41/MediaObjects/500844_1_En_41_Figb_HTML.png)
5 How to Design a Good MCQ?
MCQs can be prepared in different patterns. Commonly used formats are ‘one correct answer’, ‘single (one) best response’, ‘true or false’, ‘multiple true or false’, ‘matching’, and ‘extended matching’ question types [14]. The single-best-option format is the most widely used and accepted.
Before preparing an MCQ, one must consider the objectives that need to be sampled and the areas to be tested. Learning outcomes also need to be determined before sampling to ensure the high validity of the test. Another important issue is the learning objectives that learners are expected to achieve. Learning objectives can be formulated using SMART, an acronym for goals that are specific, measurable, attainable, realistic, and time bound [15,16,17].
Benjamin Bloom was an educational psychologist who divided what and how we learn into three separate domains of learning [18]:
-
1.
Cognitive domain—related to thinking/knowledge (K).
-
2.
Affective domain—related to feeling/attitudes (A).
-
3.
Psychomotor domain—related to doing/skills or practice (P).
In 1956 he also published a taxonomy of cognitive learning, described as a hierarchy of (i) knowledge, (ii) comprehension, (iii) application, (iv) analysis, (v) synthesis, and (vi) evaluation. After nearly four decades, in 2001, the top two levels were revised to (v) evaluation and (vi) creation (Fig. 41.3) [19].
MCQs designed to test knowledge (lower-level learning) are not appropriate for testing competence in objectives that reflect analysis (higher-level learning). Educational programmes should state the relative importance of skill and knowledge objectives, and those objectives should be measurable so that their achievement can be assessed [20].
6 What Needs to Be Done to Construct MCQs?
The first step in constructing an MCQ examination is to have a blueprint, also known as a test specification table. This is a guide that helps create a balanced examination and consists of a list of the competencies and topics that need to be tested. The three important contents of a good blueprint are:
-
1.
Content/objectives to be tested.
-
2.
Questions designed to test the content/objective.
-
3.
Learning domain and levels of testing.
It is a three-dimensional chart in which the placement of each question and its content area are represented. It provides a solid foundation on which the test activity is developed, offers evidence of content validity, and makes assessment more meaningful [21, 22].
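The blueprint’s content-by-level grid can be represented directly in code. The following sketch (the content areas and Bloom levels are hypothetical examples, not taken from the chapter) flags cells of the grid not yet covered by any planned question:

```python
# Each planned question maps a content area to a Bloom level (invented data)
blueprint = [
    {"content": "cardiology", "level": "knowledge"},
    {"content": "cardiology", "level": "application"},
    {"content": "respiratory", "level": "comprehension"},
]

def coverage(items, contents, levels):
    """Return a grid cell -> bool map showing which (content, level) pairs are covered."""
    planned = {(q["content"], q["level"]) for q in items}
    return {(c, l): (c, l) in planned for c in contents for l in levels}

grid = coverage(
    blueprint,
    ["cardiology", "respiratory"],
    ["knowledge", "comprehension", "application"],
)
missing = [cell for cell, covered in grid.items() if not covered]
print(missing)  # cells still needing questions
```

Empty cells tell the paper-setters where the balance of the examination is weak, which is precisely the evidence for content validity the blueprint is meant to provide.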
MCQs need good grammar and appropriate punctuation, and must avoid spelling errors. One must also minimize the time required to read each item. Basically, an MCQ consists of a stem or a lead-in question followed by 4–5 answers or options. The option that matches the key is called the ‘correct answer’ and the other options are called ‘distracters’. An ideal question is one that can be answered by 60–65% of the tested population. One must avoid unintended cues, such as making correct answers longer than the distracters. The instructions for answering these questions should be clear and uniform [23, 24].
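The 60–65% figure corresponds to an item difficulty index: the fraction of candidates who answer an item correctly. A small illustrative computation (the responses below are invented):

```python
def difficulty_index(responses, key):
    """Fraction of candidates choosing the keyed answer (higher = easier item)."""
    return sum(r == key for r in responses) / len(responses)

# Ten candidates' responses to one item whose keyed answer is "B"
responses = ["B", "B", "A", "B", "C", "B", "B", "D", "B", "B"]
print(difficulty_index(responses, "B"))  # 0.7 — slightly easier than the ideal range
```

Post-examination item analysis of this kind lets paper-setters retire items that turn out far too easy or too hard for the tested population.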
A good distracter should be inferior to the correct answer but still plausible to a non-competent candidate. All options should be factual but vary in their degree of acceptability. Only one answer should be correct and should match the examiner’s key. Commonly asked questions concern the most appropriate, most common, least harmful, or any other feature at the uppermost or lowermost point in a range. The options need to be homogeneous in both content and length. Terms like ‘always’, ‘never’, and ‘completely’, and options like ‘all of the above’ and ‘none of the above’, should be avoided. Preparing appropriate distracters is challenging and needs a lot of effort [25].
Well-constructed MCQs aim at testing the application of medical knowledge (context-rich) rather than just the recall of information (context-free). Context-rich questions stimulate the thinking process, and represent the candidate’s problem-solving ability better than context-free questions. Practical problems encountered in clinical practice should be assessed rather than assessing knowledge of trivial facts or obscure problems rarely seen. MCQs should aim at making testing both fair and consequentially valid. MCQs should strategically evaluate important content and clinical competence [23, 26].
7 Can Objective Structured Clinical Examination (OSCE) and Objective Structured Practical Examinations (OSPE) Replace the Traditional Viva Voce?
Medical education has undergone a paradigm shift towards a more competency-based system and as a corollary, competency-based medical assessment. The Objective Structured Clinical Examination (OSCE) and its derivative the Objective Structured Practical Examination (OSPE) have been introduced as measures of competence, which avoid many biases associated with the conventional methods.
Harden et al. were the first to describe OSCE in 1975. OSCE /OSPE assesses clinical or practical competencies in a methodical, objective, and time-orientated manner with direct observation of the student’s performance during planned clinical or test stations. The third level of Miller’s pyramid ‘shows how’ is assessed. The student is evaluated on the performance of specific skill sets in a controlled setting [6, 27, 28].
The traditional examination focuses more on global performance rather than a student’s clinical competency. It mainly addresses the ‘knows’ and ‘knows how’ aspects of Miller’s pyramid of competence. Evaluation is often subjective, biased, monotonous, and inadequate in evaluating the overall performance of the student. Other attributes like attitude, communication skills, interpersonal skills, ethical issues, and professional judgements are not tested. Also, the need to understand core topics and develop problem-solving skills is not covered by traditional assessment methods. Another drawback is the variation of the examiner’s subjectivity which in turn reduces the reliability of the examination. It has been seen that subjectivity reduces the correlation coefficient between marks given to the same student to as low as 0.25. This affects scoring and results in dissatisfaction among both the examiners and examinees. Also, traditional methods lack a proper feedback process which is essential to improve one’s skills [7, 29].
8 How to Conduct OSCE and OSPE Assessments?
Conducting an OSCE or OSPE requires considerable effort and preparation. Here too, the first step of planning is designing a blueprint of the structured checklist for observed and unobserved stations based on Bloom’s taxonomy. Examiners’ and students’ instruction manuals should also be considered while designing the blueprint. Checklists of clinical procedures, manuals, and standard answers need to be checked and validated by senior faculty members and medical educators [18].
Based on requirements, the number of OSCE/OSPE stations is decided. Apart from knowledge, the stations should also focus on evaluating communication, psychomotor, and clinical skills. The stations also need to be designed with difficulty levels ranging from ‘must know’ to ‘desirable to know’ to ‘nice to know’. Stations can be either question-and-answer stations or procedure stations. Ideally, a procedure station should be followed by a question-and-answer station pertaining to the previous procedure. At procedure stations, students are expected to perform a focused history or examination on standardized patients. Other focused tasks like interpreting X-rays, electrocardiograms, and microscopic slides can also be evaluated. About 3–5 min are allotted for each station, and it is recommended to have a few rest stations in between. Adequate time is given between stations to facilitate student movement [30].
Marking in OSCE/OSPE is relatively simple. Every examiner has a previously agreed-upon checklist of items with assigned points. The student is marked based on each piece of predetermined key information obtained or physical manoeuvre performed. A Likert-like scale ranging from 1 to 5 can also be used to grade overall efficacy. The final score is based on a compilation of marks obtained at different stations and the overall score [31].
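The marking scheme described above — predetermined checklist points plus an optional 1–5 global rating — can be sketched as follows (the station names, checklist items, and point values are hypothetical, not from the chapter):

```python
def station_score(checklist, performed, global_rating=None):
    """Sum pre-agreed points for checklist items performed; add the 1-5 global rating if used."""
    score = sum(points for item, points in checklist.items() if item in performed)
    if global_rating is not None:
        score += global_rating
    return score

# Hypothetical history-taking station checklist with assigned points
history_station = {
    "greets patient": 1,
    "elicits chief complaint": 2,
    "asks duration of symptoms": 1,
}

# Candidate performed two of three items; examiner's global rating was 4/5
total = station_score(
    history_station,
    {"greets patient", "elicits chief complaint"},
    global_rating=4,
)
print(total)  # 1 + 2 + 4 = 7
```

The candidate’s final result is then the compilation of such station scores, which is what makes OSCE/OSPE marking reproducible across examiners.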
Debates regarding the reliability and validity of OSCE/OSPE have been put to rest through multiple studies [31,32,33]. van der Vleuten and Swanson have recommended a few steps to improve reliability like using checklists, having standardized patients to maximize reproducibility, increasing hands-on skill stations, and maximizing testing time to 3–4 h [34].
The merits and demerits of objective testing using MCQs or OSCE/OSPE can be listed as follows [8]:
Merits
-
With comprehensive blueprinting, the cognitive and psychomotor domains and higher-order thinking can be effectively examined.
-
OSCE/OSPE help assess affective domain skills like history taking and communication.
-
Competence-based assessment.
-
Good teaching-learning tool with appropriate feedback.
-
Less experienced faculty members can be incorporated for assessment.
-
All the students are asked similar types of questions hence assessment is less biased.
Demerits
-
Blueprinting of the syllabus, validation of the comprehensive checklist is tedious and time consuming.
-
Administration and conduction of an MCQ-based examination or OSCE/OSPE is time consuming and laborious, money and resource intensive.
-
There is less interaction between the examiner and examinee.
-
Limited scope of questions.
-
Constant need to innovate and develop MCQ, OSCE, and OSPE banks to prevent repetition.
9 Conclusion: Where are We and Where Do We Need to Go?
-
The measurement of clinical performance, the primary concern of medical education, remains elusive.
-
Even though traditional assessment methods have been replaced by more objective evaluation systems like MCQs, OSCEs, and OSPEs, with studies showing a significant correlation between the two, a gold standard for such comparisons still does not exist.
-
Creation of a competency-based curriculum and appropriate tools to evaluate that curriculum is the need of the hour.
-
The literature supports the role of these objective modalities in the evaluation of knowledge, skill, and competency. One can conclude that combining objective assessment methods with traditional methods along with direct observation in the clinical setting has the potential to become the gold standard to measure a physician’s competence.
References
Wass V, Van der Vleuten C, Shatzer J, Jones R. Assessment of clinical competence. Lancet. 2001;357(9260):945–9.
Neufeld VR, Norman GR. Assessing clinical competence, vol. 7. New York: Springer; 1985.
Mennin SP, Kalishman S. Student assessment. Acad Med. 1998;73(9 Suppl):S46–54.
Swanson DB. A measurement framework for performance based tests. In: Hart IR, Harden RM, editors. Further developments in assessing clinical competence. Montreal: Can-Heal; 1987. p. 13–45.
Stalenhoef-Halling BF, van der Vleuten CPM, Jaspers TAM, Fiolet JFBM. The feasibility, acceptability and reliability of open-ended questions in a problem based learning curriculum. In: Bender W, Hiemstra RJ, Scherpbier AJJA, Zwiestra RP, editors. Teaching and assessing clinical competence. Groningen: Boekwerk; 1990. p. 1020–31.
Miller GE. The assessment of clinical skills/competence/performance. Acad Med. 1990;65:563–7.
Ananthakrishnan N. Objective structured clinical/practical examination (OSCE/OSPE). J Postgrad Med. 1993;39(2):82–4.
Wani P, Kini S, Dalvi V. Objective structured practical examination v/s traditional clinical examination in human physiology: faculty’s perception. IJBAP. 2012;1(1):30–5.
Al-Rukban MO. Guidelines for the construction of multiple choice questions tests. J Fam Community Med. 2006;3(3):125–33.
Case SM, Swanson DB. Constructing written test questions for the basic and clinical sciences. 3rd ed. Philadelphia: National Board of Medical Examiners; 2001.
Anderson J. Multiple-choice questions revisited. Med Teach. 2004;26(2):110–3.
Edward M. Multiple choice questions: their value as an assessment tool. 6, vol. 14. Lippincott Williams & Wilkins, Inc; 2001. p. 661–6.
Hammond EJ, Mclndoe AK, Sansome AJ, Spargo PM. Multiple-choice examinations: adopting an evidence-based approach to exam technique. Anesthesia. 1998;53(11):1105–8.
Sood R, Singh T. Assessment in medical education: evolving perspectives and contemporary trends. Natl Med J India. 2012;25(6):357–64.
Collins J. Writing multiple-choice questions for continuing medical education activities and self assessment modules. RadioGraphics. 2006;26(2):543–51.
Srinivasa DK, Adkoll BV. Multiple choice questions: how to construct and how to evaluate? Indian J Pediatr. 1989;56:69–74.
Salam A. Input, process, output: system approach in education to assure the quality and excellence in performance. Bangladesh J Med Sci. 2015;14(1):1–2.
Bloom BS, Engelhart MD, Committee of College and University Examiners. Taxonomy of educational objectives: the classification of educational goals. London: Longman; 1956.
Anderson LW, Krathwohl DR. A taxonomy for learning, teaching, and assessing. Abridged ed. Boston, MA: Allyn & Bacon; 2001.
Jamaludin R, Jaafar R, Kaur S. Training module series: student-centered learning (SCL), approaches for innovative teaching. Module 3: Learning Taxonomies. Centre for Development of Academic Excellence (CDAE). Universiti Sains Malaysia; 2012.
Patil SY, Hashilkar NK, Hungund BR. Blueprinting in assessment: how much is imprinted in our practice? J Educ Res Med Teach. 2014;2(1):4–6.
Adkoli B. Attributes of a good question paper. In: Sood R, editor. Assessment in medical education: trends and tools. New Delhi: KL Wig Center for Medical Education and Technology. AIIMS; 1995.
Mccoubrie P. Improving the fairness of multiple-choice questions: a literature review. Med Teach. 2004;26(8):709–12.
Chaudhary N, Bhatia BD, Mahato SK, Agrawal KK. Multiple choice questions-part II (classification, item preparation, analysis and banking). J Univers Coll Med Sci. 2014;2(3):54–9.
Wood T, Cole G. Developing multiple choice questions for the RCPSC certification examinations. The Royal College of Physicians and Surgeons of Canada Office of Education; 2001 Sep.
Schuwirth LW, Verheggen MM, van der Vleuten CP, Boshuizen HP, Dinant GJ. Do short cases elicit different thinking processes than factual knowledge questions do? Med Educ. 2001;35(4):348–56.
Harden RM, Stevenson M, Wilson DW, Wilson GM. Assessment of clinical competencies using objective structured clinical examination. Br Med J. 1975;5955(1):447–51.
Harden RM. What is an OSCE? Med Teach. 1988;10:19–22.
Verma M, Singh T. Experiences with objective structure clinical examination (OSCE) as a tool for formative evaluation in pediatrics. Indian Pediatr. 1993;30:699–702.
Carraccio C, Englander R. The objective structured clinical examination: a step in the direction of competency-based evaluation. Arch Pediatr Adolesc Med. 2000;154(7):736–41.
Cohen R, Reznick RK, Taylor BR, Provan J, Rothman A. Reliability and validity of the objective structured clinical examination in assessing surgical residents. Am J Surg. 1990;160:302–5.
Petrusa ER, Blackwell TA, Ainsworth MA. Reliability and validity of an objective structured clinical examination for assessing clinical performance of residents. Arch Intern Med. 1990;150:573–7.
Sloan DA, Donnelly MB, Schwartz RW, Strodel WE. The objective structured clinical examination: the new gold standard for evaluating postgraduate clinical performance. Ann Surg. 1995;222:735–42.
van der Vleuten CPM, Swanson DB. Assessment of clinical skills with standardized patients: state of the art. Teach Learn Med. 1990;2:58–76.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this chapter
Cite this chapter
Nundy, S., Kakar, A., Bhutta, Z.A. (2022). Developing Learning Objectives and Evaluation: Multiple Choice Questions/Objective Structured Practical Examinations. In: How to Practice Academic Medicine and Publish from Developing Countries?. Springer, Singapore. https://doi.org/10.1007/978-981-16-5248-6_41
DOI: https://doi.org/10.1007/978-981-16-5248-6_41
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-5247-9
Online ISBN: 978-981-16-5248-6
eBook Packages: Medicine (R0)