Introduction

Worldwide, diagnostic errors are the most common, costly and severe type of error in malpractice claims (Saber Tehrani et al., 2013). They can lead to severe consequences for patients, with studies showing moderate to severe damage in up to 86.8% of cases (Singh et al., 2013), and they contribute to patient death more often than other types of errors (Zwaan et al., 2010). Recent estimates indicate that approximately 5% of all patients presenting to primary care are subject to diagnostic error (Singh et al., 2014). Improving clinical reasoning education has been identified as an important way to reduce diagnostic error (Graber et al., 2018; National Academies of Sciences, Engineering, and Medicine, 2015). The need for a structured, longitudinal implementation of clinical reasoning education in medical schools across all specialties has been widely advocated in the recent literature (Cooper et al., 2021; Kononowicz et al., 2020), as existing training programs may not provide adequate education related to diagnostic safety (Graber et al., 2018). Ideally, clinical reasoning education should be introduced in undergraduate (preclinical) medicine, starting with low-complexity cases with typical presentations for developing basic medical knowledge. Throughout medical school and further into residency it should advance to more complex, rare or atypical case presentations that require a deeper level of reasoning and understanding, adapted to the level of expertise of the learner (Cooper et al., 2021; Leppink & Duvivier, 2016; Schmidt & Mamede, 2015; Van Merriënboer & Sweller, 2010). This way, illness scripts (i.e., mental representations of diseases) will be enriched over time, allowing learners to recognize both typical and atypical disease presentations, which contributes to better performance in diagnostic reasoning (Charlin et al., 2007; Eva et al., 1998; Lubarsky et al., 2015; Schmidt et al., 1990).
Although all studies emphasize the importance of a careful choice of content for the case vignettes, research has paid little attention to how to determine the most appropriate content of the cases from which students learn in clinical reasoning education. This is surprising, since there is increasing evidence that the content of the case determines what the students learn (Sherbino & Norman, 2014). Ideally, case content should reflect information that trainees need to know but have not yet mastered because of insufficient exposure (Norman, 2012). These knowledge gaps could entail various aspects of a clinical case, such as medical-technical content (atypical presentations, rare diseases), awareness of the risk of under- or overdiagnosis, or the impact of the patient harm that may result from missing or delaying the diagnosis. In addition to these case-specific factors, there are many other contextual factors that influence diagnostic decision making, e.g., patient perspectives, the physician–patient relationship, the availability of diagnostic tests or support staff, and the system or environment where care is rendered (Durning et al., 2011, 2020; Weiner & Schwartz, 2016). However, both case-specific and contextual factors are not always reflected in the fictive clinical case vignettes that are most frequently used in current clinical reasoning education. Therefore, it would be valuable to expand clinical reasoning education by using a larger variety of sources. Malpractice claims, for example, reflect knowledge gaps as well as real-life contextual factors of a clinical case. In addition, those cases also reflect situations that impacted patients. By learning from malpractice claim cases, advanced students (i.e., residents) may learn from the mistakes of their peers (Fischer et al., 2006), thereby deriving educational benefit from diagnostic errors (Eva, 2009). However, not every claim on diagnostic error is suitable for clinical reasoning education.
For example, diagnoses that are so rare that it is unlikely a doctor will encounter them during their career, or diagnoses that have a high risk of overdiagnosis, may be less suitable for more extensive clinical reasoning education. The aim of this study was to identify and prioritize, using priority criteria, conditions with the highest expected educational value for GP residents from our national malpractice claims database on diagnostic error for clinical reasoning education. This could be a way to complement the current clinical reasoning curriculum.

Methods

Study design

This study consisted of two parts. In the first part, experts in clinical reasoning education and diagnostic error defined five criteria that were considered important in determining the educational relevance of a medical condition. In the second part, three types of stakeholders in clinical reasoning education participated in ranking fifty unique missed diagnoses from a claims database on their educational relevance, using the five pre-defined priority criteria from the first part. Based on the scores, a top fifteen of the medical conditions with the highest educational value for clinical reasoning education was determined.

Setting

The current clinical reasoning curriculum of our post-graduate, three-year GP vocational training at Erasmus MC Rotterdam consists of eight themes that are distributed over all three years of training. Clinical reasoning education is embedded in learning sessions during regular weekly one-day educational sessions supervised by our teaching staff or during daily clinical practice supervised one-on-one by senior GPs. Each theme includes various fictive case vignettes on different diagnoses for trainees to solve. Case content varies from typical presentations to atypical, but generally leans more towards typical and common disease presentations.

Participants

Part 1: setting criteria for educational priority

For the first part of the study, we created a local body of experts to formulate educational priority criteria, consisting of five senior GPs from our teaching staff involved in clinical reasoning education and one expert in the field of diagnostic error from the department of General Practice at Erasmus MC. Moreover, we asked four international experts on clinical reasoning education and diagnostic error, as well as one representative of the liability insurance company and one representative of the Dutch Patient Federation, for feedback on these criteria. The international experts were selected from our network on the basis of their expertise in clinical reasoning education and/or diagnostic error.

Part 2: ranking the claims database with educational priority criteria

For the second part of the study, we included three types of stakeholders in medical education. Firstly, we included twenty-five third-year GPs in training. Being close to graduation as a GP, these residents have an important understanding of their own needs, their knowledge and the current curriculum. Secondly, we included twenty-five daily supervisors of these trainees (senior GPs who supervise the residents one-on-one in clinical practice), who have a good impression of the practical knowledge gaps that trainees encounter in daily practice. Lastly, we included nineteen lecturers of the GP vocational training, who teach and/or supervise trainees during their weekly scientific sessions and have a good overview of the content of the current medical curriculum and the needs of the trainees.

Materials and procedure

Part 1: setting criteria for educational priority

Priority criteria and the methodology to score them do exist for research questions but not yet for educational content (Rudan et al., 2008). Based on the Child Health and Nutrition Research Initiative (CHNRI) methodology for assessing research questions, the local body of experts adapted the existing research criteria to five criteria for educational relevance during an in-person panel discussion, also considering the framework for measurement of diagnostic safety (Olson et al., 2018), to identify conditions prone to diagnostic error. Subsequently, independent feedback on these criteria was solicited from the international experts on clinical reasoning education and diagnostic error, a representative of the liability insurance company and a representative of the Dutch Patient Federation to refine the criteria. See Table 1 for the criteria.

Table 1 Criteria for setting priorities in clinical reasoning education content as formulated by a body of experts in clinical reasoning education

Part 2: ranking the claims database with educational priority criteria

Claims database

For this study, the liability insurance company covering 85% of the GP practices in the Netherlands made their claim database with cases filed between 2012 and 2017 available to us for educational purposes. The claim information was entered into a database and summarized by the insurance company, but the insurance company was not further involved in the analysis and interpretation of the study findings. The most important variables in the dataset were a description of the claim, a short summary of the response of the medical advisor, the liability outcome and the amount of indemnity paid (see Table 3 for a description of all available variables in the anonymous abstract of the claims database). We focused on the legitimate complaints (as judged by the insurance company) and on diagnostic error only. Between 2012 and 2017, eight hundred thirty-five (835) claims related to a diagnostic error against GPs were documented and closed. One hundred thirty-eight (138, 16.5%) of them were accepted as legitimate complaints for which indemnity was paid for a total amount of €4.7 million. Since it was not always immediately clear from the data which specific diagnosis was missed in a claim, two GPs (one study investigator and one independent lecturer from the department of GP) independently identified the specific missed diagnosis in each of the claims, using the description of the claim and the short summary of the medical advisor. When missed diagnoses did not become directly clear from the description or summary, or when there was no consensus between the GPs, the liability insurance company was asked to check the full record for the specific missed diagnosis. Disagreements were resolved by the GPs by discussion. Subsequently, the missed diagnoses were first ranked based on frequency of occurrence in the claims database and second on average amount of indemnity paid. This produced a list of a total of fifty (50) unique missed diagnoses (see Table 4 for the full list).
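The two-step ordering of the missed diagnoses can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: it assumes that frequency is the primary sort key and that the average indemnity serves as a tie-breaker, and the claim records, diagnosis labels and amounts are hypothetical placeholders rather than data from the claims database.

```python
from collections import defaultdict


def rank_missed_diagnoses(claims):
    """Rank diagnoses by claim frequency, then by mean indemnity paid.

    `claims` is an iterable of (diagnosis, indemnity_in_euros) pairs.
    Assumption: frequency is the primary key and the mean indemnity
    only breaks ties between equally frequent diagnoses.
    """
    paid = defaultdict(list)
    for diagnosis, indemnity in claims:
        paid[diagnosis].append(indemnity)

    # One row per unique diagnosis: (name, number of claims, mean indemnity).
    summary = [
        (diagnosis, len(amounts), sum(amounts) / len(amounts))
        for diagnosis, amounts in paid.items()
    ]
    # Sort descending on frequency, then descending on mean indemnity.
    summary.sort(key=lambda row: (-row[1], -row[2]))
    return summary


# Hypothetical example records (not taken from the study data).
example_claims = [
    ("diagnosis A", 10_000.0),
    ("diagnosis B", 50_000.0),
    ("diagnosis A", 30_000.0),
    ("diagnosis C", 20_000.0),
]

ranking = rank_missed_diagnoses(example_claims)
for position, (name, count, mean_paid) in enumerate(ranking, start=1):
    print(position, name, count, round(mean_paid, 2))

# Share of diagnostic-error claims accepted as legitimate (figures from the text).
accepted_share = round(138 / 835 * 100, 1)
```

In this toy input, "diagnosis A" ranks first because it occurs twice, and "diagnosis B" precedes "diagnosis C" because its mean indemnity is higher at equal frequency.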

Ranking

All stakeholders received a link to an online Qualtrics questionnaire, which is a web-based survey tool. In the Qualtrics questionnaire, after giving informed consent, the participants were asked to score the educational relevance of the diagnoses of the claims database with the five priority criteria (see Table 1) on a five-point scale. Each diagnosis could be rated with 1 to 5 points for each criterion, where 1 is ‘not at all’ and 5 is ‘very much’. Instructions regarding the use of the criteria for scoring educational priority were given beforehand. Since scoring fifty diagnoses would take too much time per participant, each participant was randomly assigned to scoring 25 of the 50 diagnoses (i.e. either the odd or even numbers). For each diagnosis, we gave a short description of (a typical example of) the missed diagnosis from the claims database to clarify the main cause of the missed diagnosis (see Table 5 for examples).
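The split of the fifty diagnoses into an odd-numbered and an even-numbered half, and the random allocation of participants to one of the halves, can be illustrated as follows. This is a hedged sketch only: the text does not describe how the randomization was implemented in Qualtrics, so the function names, the participant identifiers and the use of an independent random draw per participant are assumptions made for illustration.

```python
import random


def split_odd_even(n_diagnoses=50):
    """Split diagnosis numbers 1..n into the odd and the even subset."""
    numbers = range(1, n_diagnoses + 1)
    return {
        "odd": [n for n in numbers if n % 2 == 1],
        "even": [n for n in numbers if n % 2 == 0],
    }


def assign_participants(participant_ids, rng):
    """Assign each participant to the odd or even subset at random.

    Assumption: one independent draw per participant, which does not
    guarantee that the two groups end up exactly the same size.
    """
    return {pid: rng.choice(("odd", "even")) for pid in participant_ids}


subsets = split_odd_even()
# Hypothetical participant identifiers 1..37; a seeded generator keeps
# the illustration reproducible.
assignment = assign_participants(range(1, 38), random.Random(0))
```

Independent draws tend to produce groups of similar but not identical size, which is consistent with a near-even split between the two halves.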

Analysis

Part 2: ranking the claims database with educational priority criteria

Total priority score for each diagnosis could range from minimum 5 to maximum 25 points. The higher the score, the more relevant the disease was considered to be for education. Since formulation of criterion number five was paradoxical compared to the other four criteria (negatively formulated instead of positively), we reversed the score of this criterion first. We then calculated the mean score for each of the five criteria and the mean total priority score for each diagnosis using SPSS Statistics version 25 for Windows (IBM). We obtained a ranking of most relevant diseases for clinical reasoning education, of which we will present the top fifteen in the results section.
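The scoring rule described above can be made concrete with a short Python sketch. The analysis itself was run in SPSS; the code below is only an illustration of the arithmetic, and the function names, diagnosis labels and ratings are hypothetical. On a 1 to 5 scale, reversing a score x gives 6 - x, so that a high score on every criterion points towards higher educational priority.

```python
from statistics import mean

REVERSED_CRITERION = 4  # zero-based index of criterion five


def reverse_score(score, low=1, high=5):
    """Mirror a score on the scale, e.g. 1 becomes 5 and 4 becomes 2."""
    return low + high - score


def total_priority(ratings):
    """Sum one rater's five criterion scores after reversing criterion five."""
    adjusted = list(ratings)
    adjusted[REVERSED_CRITERION] = reverse_score(adjusted[REVERSED_CRITERION])
    return sum(adjusted)


def rank_by_mean_priority(all_ratings):
    """Return (diagnosis, mean total score) pairs, highest priority first.

    `all_ratings` maps a diagnosis to a list of five-score tuples,
    one tuple per rater.
    """
    means = {
        diagnosis: mean(total_priority(r) for r in raters)
        for diagnosis, raters in all_ratings.items()
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ratings from two raters per diagnosis (not study data).
example_ratings = {
    "diagnosis A": [(4, 5, 3, 4, 2), (5, 4, 4, 3, 1)],
    "diagnosis B": [(2, 3, 2, 3, 4), (3, 3, 3, 2, 5)],
}

priority_ranking = rank_by_mean_priority(example_ratings)
```

With these example ratings, the two raters give "diagnosis A" totals of 20 and 21 after reversal (mean 20.5), while "diagnosis B" receives 12 from both raters, so "diagnosis A" is ranked first. Each total stays within the stated range of 5 to 25 points.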

Results

Part 2: ranking the claims database with educational priority criteria

Participants

Fifteen of the 25 invited GPs in training (60%), 10 of the 25 invited supervisors of GPs in training (40%) and 12 of the 19 invited lecturers/teachers of the GP vocational training (63%) completed the questionnaire. In total, the ranking was based on 37 participants: 19 participants (51.4%) scored the odd-numbered cases and 18 (48.6%) the even-numbered cases.

Ranking list of most relevant diagnoses for education

In Table 2, the fifteen conditions that received the highest educational priority scores are presented (see Table 6 for the complete prioritized list). The first column presents the original position of the diagnosis (based on frequency of occurrence and average amount of indemnity paid in the claims database), which shows that diagnoses with originally lower positions in the ranked claims database were also considered educational priorities after the prioritization exercise. The overall mean total priority score for all fifty diagnoses was 17.11 (SD 1.31) and ranged from 13.89 to 19.61. The priority score of the top-fifteen missed diagnoses ranged from 18.17 to 19.61.

Table 2 Top-fifteen conditions from the claims database that received the highest score on educational priority criteria from various stakeholders in medical education (i.e., trainees, supervisors, teachers of GP vocational training). Results include mean total priority scores based on the total of five priority criteria and mean scores per criterion

Discussion

The aim of this study was to identify and prioritize relevant educational content from a claims database of diagnostic errors in order to enrich clinical reasoning education in GP training. In this article, we presented the fifteen conditions from a Dutch claims database that received the highest score on newly formulated educational priority criteria from various stakeholders in medical education (i.e., trainees, supervisors, teachers) of the department of general practice at Erasmus MC, Rotterdam, The Netherlands.

Looking closely at the top-fifteen conditions, they can be categorized into groups with similar characteristics, for example complex common conditions, complex rare conditions and straightforward common conditions. Common but complex conditions are represented in the prioritized list by cardiovascular diseases (cerebrovascular accident, cardiopulmonary instability, myocardial infarction, arterial occlusion, coronary sclerosis and deep venous thrombosis), renal insufficiency and cancer. These conditions recur regularly in international malpractice claims lists. Whereas lower-priority conditions often involve diagnostic errors due to late referral or failure to order the appropriate diagnostic test, these high-priority conditions more often involve atypical disease presentations or complex contextual factors that allow alternative diagnoses to explain the symptoms. For instance, a cerebrovascular accident was confused with alcohol intoxication in a known alcoholic, a myocardial infarction was mistaken for an exacerbation of COPD, and an arterial occlusion in a leg was preceded by trauma that provided a good explanation for the pain.

Another example of the need for a greater diversity of atypical case histories in education is the rarer diagnosis of testicular torsion. While testicular torsion is covered in the curriculum, it ranks thirteenth on the educational priority list and occupies a notable fourth place in the claims database. Although this is in line with findings in the international literature (Colaco et al., 2015; Najaf-Zadeh et al., 2011; Osman & Collins, 2011; Raine, 2011; Ryan et al., 2020), it is remarkable, since the incidence of testicular torsion in general practice is low (around 1:1800–4000 per year in children/adolescents up to 25 years of age (Zhao et al., 2011)) and the differential diagnosis of testicular complaints is not very extensive. Moreover, the diagnostic examination for excluding a torsion (ultrasonography) is relatively easy to perform and not invasive. For these reasons, one would not expect so many misdiagnoses of testicular torsion in the claims database, nor its subsequent high ranking on the educational priority list. Again for this disease, the descriptions of the claim cases commonly involve an atypical or complex presentation, e.g., non-acute abdominal pain (Pogorelić et al., 2013) or vomiting rather than the acute severe pain in the testicle that is currently usually taught to our students.

Other rare conditions that emerge from the priority list as urgent topics for clinical reasoning education are diseases that are complex by nature, to which physicians are insufficiently exposed in daily practice due to their low incidence, for example the diagnoses endocarditis, sinus thrombosis and extra-uterine pregnancy. More extensive exposure to these diseases during clinical reasoning education can facilitate their recognition and reduce diagnostic errors.

Not only complex or rare diseases, but also straightforward and common conditions appear on the shortlist, such as a tendon rupture/injury or eye infection. Here, physicians might be insufficiently aware of the risk and consequences of misdiagnosis for these relatively easy and common diagnoses. In a Swedish report on diagnostic errors, tendon rupture ranks fifth in primary care, after cancer, fractures, infections and heart disease, and even second in the emergency department (Fernholm et al., 2019). For most GPs, the consequences of missing a tendon rupture are not as clear as those of missing a myocardial infarction or cerebrovascular accident. Hence, more exposure to these conditions, to create more awareness of what can go wrong, where the risks of misdiagnosis lie and what the possible serious consequences are, is considered to have added value for clinical reasoning education.

Besides the need for more attention to rare diseases and awareness of the consequences of missing a diagnosis, this priority list particularly emphasizes the need to include more atypical presentations or cases with complex contextual factors in the clinical reasoning curriculum for general practitioners in training, so that experience can be gained with a wider range of examples. This may enrich the illness scripts (i.e., the mental representations of diseases) in physicians’ minds, which has proven to be an effective learning strategy in clinical reasoning education, especially for more advanced learners (Charlin et al., 2007; Eva et al., 1998; Lubarsky et al., 2015; Schmidt et al., 1990). In combination with other proven effective educational strategies to improve diagnostic performance, such as deliberate reflection (Mamede et al., 2012, 2014; Prakash et al., 2019) and self-explanation (Chamberland et al., 2011, 2015), malpractice claim cases might provide a unique opportunity to optimize educational content, thereby potentially reducing diagnostic error and related patient harm.

Strengths and limitations

As far as we know, this is the first study that systematically prioritized educational content from a malpractice claims database using newly formulated educational priority criteria. The claims database identifies specific conditions that are prone to diagnostic errors and that, by definition, have an impact on patients. These prioritized diseases could be used to help select conditions to complement the clinical reasoning curriculum. In addition, the database presents real-life, practical examples of atypical presentations of a disease or cases with complex contextual factors, such as doctor-patient relationship and communication, patient perspectives and the system or environment where care is rendered. Practice with these cases could be especially beneficial for more experienced, post-graduate learners. However, adding too many conditions with atypical disease presentations could lead to more diagnostic testing to rule out those atypical presentations and therefore to (possibly harmful) overtesting and overdiagnosis. It is therefore important to balance the use of malpractice claim cases in the curriculum with cases reflecting more typical and common disease presentations. This supports the argument that exposure to atypical disease presentations is more appropriate for advanced students, as for them it complements their experience in clinical practice, where they see the more common and typical disease presentations.

Although the validity of the criteria that we formulated for this study was not extensively tested, we based them on the existing criteria used by CHNRI and Olson’s framework for measurement of diagnostic safety (Olson et al., 2018; Rudan et al., 2008). Moreover, we consulted various local and international experts on clinical reasoning education and diagnostic error, as well as GPs, a liability insurer representative and the Patient Federation. The criteria were formulated after extensive discussion and represent relevant aspects that are important in assessing educational relevance of a condition. The majority of the criteria, such as incidence, complexity of a condition, the risk for overdiagnosis and impact of a missed diagnosis on the patient are independent of location or educational curricula and therefore widely applicable. The participants that scored the diagnoses in this study represent the most important stakeholders of medical education of GP training, albeit all from one university centre. In other stages of medical education, other specialties and other university centres, educational curricula and consequent scoring of knowledge gaps might differ and therefore conditions might be prioritized differently. Developers of medical education could use the priority criteria to filter out the most relevant conditions for their setting, either provided by our claim list or from their local claims databases.

Although our Dutch malpractice claims database is not a very large dataset in absolute numbers, it does reflect the majority (85%) of all claims filed against GPs in our country during a five-year period. Moreover, the diseases recorded in our database appear to be consistent with those of other, larger claims databases of high-income countries worldwide (Fernholm et al., 2019; Wallace et al., 2013). Low- or middle-income countries may have different conditions represented in their local claims databases, for example due to different incidences of diseases or less availability of staff and diagnostic tests. However, atypical case presentations remain comparable worldwide, independent of time, setting and place, as they deal primarily with medical-technical content and less with system and management factors. Therefore, we expect that the diseases recurring in our priority list can be largely generalized to other (high-income) countries, and low- and middle-income countries could also use their local (claims) databases to define their own educational priorities with the formulated priority criteria.

Conclusion and recommendations

With the methods and results from this study, we can make specific suggestions to developers of medical education on how to obtain a shortlist of conditions from (their local) malpractice claims that have high educational relevance because they are prone to error. We recommend that educational developers gain insight into the aggregated data of medical liability insurance companies and/or disciplinary boards to select relevant content for medical education. This allows not only for training with complex conditions to which practitioners are insufficiently exposed in daily practice due to their low incidence, but also for recognizing the possible serious consequences of missing a relatively common and straightforward diagnosis. Additionally, for more expert learners, it might result in developing broader illness scripts for conditions that are usually covered in the curriculum only in their typical form of presentation, but that can present atypically or with complex contextual factors. This might contribute to gaining experience with a larger set of examples, which can help in recognizing (atypical presentations of) these diseases in future patients, reducing the number of misdiagnoses and related patient harm. Furthermore, using non-fictive malpractice claim cases, encompassing all circumstantial and contextual factors that contribute to diagnostic error, as vignettes for clinical reasoning education might be a valuable method for further optimizing educational content, but this requires more research on its effectiveness.