We employed Nominal Group Technique (NGT), also called the expert panel method (Humphrey-Murto et al. 2017; Waggoner et al. 2016), and combined this with thematic analysis of expert panel discussions (Ho et al. 2011). In NGT, participants in a meeting share and discuss their perspectives on a certain concept and subsequently independently rank their ideas about this concept. NGT helps to reveal authentic expert opinion without any outside influence, since participants are knowledgeable representatives of the area of inquiry, have practical experience, and come from diverse settings. We selected NGT over other consensus methods (such as a Delphi technique) because it leads to generation of a larger number of ideas (Humphrey-Murto et al. 2017). Furthermore, as participants discuss these ideas with one another, each participant can establish their personal opinions about all introduced ideas based on interaction and discussion with colleagues with similar expertise. A strong facilitator, who should also be a recognized expert in the field, chairs the meeting, mitigating the potential for some participants to unduly dominate the group discussion. The ranking procedure in NGT ensures a democratic result, since final ranking takes place individually and privately.
Using expert panels allowed us to reach our specific aim of refining the pre-existing concept by combining the NGT procedure (leading to generating and ranking of ideas) and thematic analysis of the expert panel meetings (leading to development of a deeper understanding of the ideas). This procedure enhanced our understanding of concepts and terms used, and made it possible to interpret potential differences in contexts between the schools of the participants (Braun and Clarke 2006). Thus, we intended to triangulate quantitative and qualitative data from the expert panel meetings to describe a meaningful whole (Patton 1999).
This study was set up using a constructivist paradigm, in which knowledge is seen as actively constructed based on the lived experiences of participants and researchers alike, and co-created by them as the product of their interactions and relationships (Corbin and Strauss 2008). The implication of this choice was that our method had to allow for interaction and discourse between participants, researchers and the studied phenomenon, which led us to choose the NGT and thematic analysis methods, and combine these two (Varpio et al. 2017). Another implication of using the constructivist paradigm is that we must acknowledge that participants and researchers co-created the outcomes of this study: the final results originate from the interaction and discussion among participants and researchers about their shared knowledge and day-to-day experiences. To inform readers about the knowledge and experience that the authors themselves brought into the study, we share the following: all authors are education researchers and/or medical educators experienced in teaching and guidance of professional behaviour of medical students. MM, WvM, GC and RAK are medical doctors, AT is an education researcher and AdlC is a linguist. The research question for this study was based on findings from our earlier research and on our own teaching experience. To consider our own contribution to the interactive study process we kept an audit trail, which we regularly discussed with each other (Gilgun 2005).
Procedures and participants
Between October 2016 and January 2018, we collected quantitative and qualitative data through meetings with panels of experts from different medical schools in the Netherlands. In each school one expert panel meeting was organised with the help of a member of the national Special Interest Group on Professionalism of the Netherlands Association for Medical Education (NVMO). These members invited professional behaviour experts at their school, defined as medical educators who had been responsible for the assessment and/or remediation of students with unprofessional behaviour for at least three years. The member asked them whether they could name any other experts who would be eligible to participate, so-called snowball sampling (Berg 1998). These individuals were additionally invited to participate. The NVMO member organized a meeting based on the availability of the experts. The participants were purposively sampled for their knowledge and practical experience in preclinical and/or clinical undergraduate medical education, to include a wide range of viewpoints and expertise perspectives from different settings. These experts had been in contact with students who behaved unprofessionally much more frequently than regular frontline teachers; the experts are confronted with a selection of students who have been shown to behave unprofessionally. Thus, they had developed a specific experience in the guidance of such students. All participants agreed with the procedures, and final scheduling was based on availability. The sample size was not determined ahead of the study. We aimed for sufficiency of the data, meaning that the data should be rich enough to accomplish the aim of the study (Varpio et al. 2017). The sufficiency of the data was determined by reaching consensus in the full research team.
The expert panel meetings were facilitated by a team consisting of two of the researchers (MM, AdlC, WvM, GC and/or RAK), who performed the data collection process in four phases (Humphrey-Murto et al. 2017; Waggoner et al. 2016).
Phase 1: Each meeting started with a presentation of the three profiles of unprofessional behaviour as derived from our earlier research (see Fig. 1, and online appendix 1). Participants were not informed about the results of earlier expert panel meetings at other schools.
Phase 2: Participants were asked to independently and privately generate ideas in response to the following question: “What could we do to improve the profiles to enhance their usefulness for your work?” Each participant wrote down their individual ideas on several post-its.
Phase 3: In a Round Robin format, each individual idea for improvement was shared with the whole group by being read out. The ideas were discussed and clarified within the group, one at a time. All ideas were covered and similar ideas were clustered together into ‘group ideas’ on a flip-over chart. The facilitators ensured that all viewpoints were equally considered, all ideas were discussed and there was agreement about the clustering into group ideas.
Phase 4: The group ideas were given numbers and were written on a new flip-over sheet. Forms with five boxes were handed out so that each participant could write down the five ideas they deemed most important. The boxes were indicated by a five-point Likert type scale, where 5 points = most important and 1 point = least important. Each participant individually and independently (to ensure anonymity) ranked the group ideas into a personal top 5.
Before starting each meeting, participants were informed about the research protocol and assured of confidentiality, after which their written consent was obtained. All meetings were audio-recorded and transcribed verbatim.
The group ideas and the ranking originating from each expert panel meeting represented the group consensus about refining the pre-existing profiles (see Fig. 1 and online appendix 1). MM and AdlC synthesized the group ideas from all five groups into final ideas, which were confirmed by the full research team. The ranking of the final ideas was established by adding up the rankings from all participants for each group idea, and presented as the percentage of all points.
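As an illustration only, the aggregation described above could be sketched as follows. The function name, the idea labels, and the mapping from group ideas to final ideas are hypothetical, and the check that each form uses the points 5 to 1 exactly once is our reading of the personal top 5 procedure rather than a documented rule.

```python
from collections import defaultdict


def final_idea_shares(forms, group_to_final):
    """Sum ranking points per final idea and express them as % of all points.

    forms: one dict per participant, mapping a group-idea label to the
           points (5 = most important ... 1 = least important) given to it.
    group_to_final: hypothetical mapping from group-idea label to the
           final idea it was synthesized into.
    """
    totals = defaultdict(int)
    for form in forms:
        # Assumption: a personal top 5 uses each of the points 1..5 once.
        if sorted(form.values()) != [1, 2, 3, 4, 5]:
            raise ValueError("each form must assign the points 1-5 exactly once")
        for group_idea, points in form.items():
            totals[group_to_final[group_idea]] += points

    all_points = sum(totals.values())
    return {idea: 100 * pts / all_points for idea, pts in totals.items()}


# Made-up example data: two participants from one panel.
example_forms = [
    {"G1": 5, "G2": 4, "G3": 3, "G4": 2, "G5": 1},
    {"G1": 5, "G3": 4, "G2": 3, "G5": 2, "G4": 1},
]
example_mapping = {"G1": "F1", "G2": "F1", "G3": "F2", "G4": "F3", "G5": "F3"}
example_shares = final_idea_shares(example_forms, example_mapping)
```

Because every participant distributes the same number of points, each participant contributes equally to the percentages, regardless of how many ideas their panel generated.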
Two researchers (AdlC and MM) performed thematic analysis of the qualitative data generated from the expert panel meetings (Braun and Clarke 2006), aiming to develop a model that encompassed the attributes nominated by the participants. Using ATLAS-ti, we initially independently coded two transcripts of the group debates in expert panel meetings in an open manner. After several cycles of reading, coding, and discussion, we established a final set of codes and themes. MM coded all transcripts using this set of codes, discussing any difficulties with AdlC. We used memos, diagrams and minutes of research meetings to collect ideas that occurred to us as we moved through the analytic process. By iteratively checking our findings, we ensured that conclusions were grounded in the data. The results were finalized through discussions in the full research team.
Developing the pre-existing profiles concept into a final model
Finally, AdlC and MM implemented the ten generated ideas into the pre-existing concept, closely paying attention to the results from the qualitative analysis of the debates. The complete research team discussed the final model, and reached full agreement on the results.
As a last step the analyses were presented to all participants for a final validation of the adaptations that were made to the pre-existing concept (Varpio et al. 2017). All participants were (by e-mail) asked to give their comments on the results of the study, including the ranking results, thematic analysis and the amended model.
Data sufficiency was reached after performing five expert panel meetings. These meetings took place at five different medical schools in the Netherlands; a total of 31 faculty participated, including 21 females and 10 males. The backgrounds of the participants were as follows: 9 medical specialists, 6 psychologists, 5 educationalists, 4 general practitioners, 2 registered nurses, 1 psychotherapist, 1 ethics specialist, 1 general physician, and 1 basic medical scientist. The participants had gathered their experience by teaching and assessing students’ professionalism as a frontline teacher for at least 5 years, and furthermore by having oversight over students’ professional development, or by being involved in remediation or serving as a member of a (professionalism) progress committee for at least 3 years. Each group consisted of five to seven participants. The meetings lasted between 100 and 125 min.
Four types of primary results will be presented: (A) the NGT process ranking results, (B) the thematic analysis of the transcripts, (C) the development of the final model and (D) the validation of results by member checking.
NGT process ranking results
The five groups generated 162 individual ideas. After debating and ranking among the participants, only 37 of these ideas got at least one vote. Some of the 37 ideas were very similar, leading to a synthesis of the group ideas from different groups into ten final ideas. Combined, the three most prioritized final ideas received 60% of all points. See Fig. 2 for the idea generating process and ranking into final ideas.
The complete overview of generated individual and group ideas, ranking results, and final ideas can be found in online appendix 2.
Thematic analysis of the expert panel meetings
We found three main themes: (1) The profiles and the variable that distinguishes between profiles, (2) The dynamic nature of the profiles over time, and (3) Causal factors for the unprofessional behaviour. These three themes will be discussed below.
The profiles and the variable that distinguishes between profiles
In all expert panel meetings participants were generally content with the profiles. They recognized ‘real students’ in them. Participants described the pre-existing profile no reliability as ‘normal’ behaviour. Any student, and also any physician, can make an accidental mistake. That is normal, and not problematic if the student listens to feedback and wishes to learn from the mistake. Participants stressed that the professionalism problems that accidentally happen are not limited to reliability concerns, but can manifest as all kinds of unprofessional behaviours, including disrespectful behaviour, lack of integrity and poor self-awareness.
According to the participants, pre-existing profile no reliability, no insight can be divided into behaviour that indicates reflectiveness, but lack of improvement, and behaviour that indicates improvement, without reflectiveness. This way, participants identified an extra behavioural profile in which students seem to display improvement in professional behaviour, without having insight into the way their behaviour relates to the fundamental values of professionalism as adopted by their institution. This behaviour was described as socially desirable: being professional at the right time, the right place, towards the right people. Participants stated that it takes time to ultimately recognize this behaviour as unprofessional. They described the behaviour as faking or gaming-the-system. They expressed that this behaviour is worrisome since it is not sustainable in more challenging circumstances.
Experts recognized the pre-existing profile no reliability, no insight, no adaptability: behaviour that indicates no reflectiveness and no improvement of the student over time. Sometimes, behaviours in this profile are so severe that they might threaten patient safety, which thus warrants a punitive approach, instead of a pedagogical, remediating approach.
The distinguishing variable between the profiles, the capacity to reflect and adapt, is not seen as one combined variable but as two distinct variables. Adaptation can be seen with and without reflectiveness, and vice versa. Some students do not have the possibility to adapt, although their reflectiveness is apparent, e.g. when physical or mental health issues or family difficulties play a contributing role. Participants defined the term adaptability as the student’s willingness and ability to develop and improve over time. Reflectiveness was defined by participants not only as the ability to reflect on one’s own behaviour, but also as the willingness to do this.
The dynamic nature of the profiles
Participants stressed that students are not fixed in specific profiles, but that the profiles form a time continuum, and that student behaviour varies at different times and in changing contexts. This implies that students can move from one profile to another. It also has consequences for the process of diagnosing a profile: frontline teachers need time to observe the student and to interact with the student to discover the right profile by observing how a student responds to feedback. Based on their perception at the end of their attachment they can ascertain the profile. Remediating faculty need assessments performed by different teachers in different contexts to get the full picture over a period of time. Although they indicated that they often can ‘diagnose’ a profile at once, they always use remediating activities, and the student’s response to these remediation activities is part of their diagnostic process in confirming the profile.
Causes for unprofessional behaviour
Unprofessional behaviour was attributed to personal circumstances, factors in the educational context and cultural differences.
Participants indicated that students’ personal constraints influence their professional behaviour. These include a lack of competencies, such as communication skills or time management and organization skills. Furthermore, internal conditions, such as somatic or psychiatric illness of the student, or external circumstances, e.g. important life events or commitments outside the medical school, can contribute to unprofessional behaviour.
Circumstances from the educational context
According to the experts, institutional aspects play a role in causing unprofessional behaviour. They mentioned that expectations for professional behaviour are not always made clear to both educators and students. Furthermore, the quality of the educators and the quality of the professionalism assessment method influence students’ professional behaviour. Also, an important factor is that students often experience the educational context as stressful.
Personal and professional values that form the basis for the assessment of professional behaviour differ according to culture, which makes the pre-existing concept difficult to apply to students with non-Western backgrounds. Differences of opinion about unprofessional behaviour, based on different cultural values, can lead to friction about actual behaviours in the workplace. Participants see such differences as difficult to overcome, since a student will not easily change internalized values originating from his or her upbringing. According to the experts, the descriptions of behaviours in particular do not seem to be applicable to non-Western students.
In Table 1 the ten final ideas from the NGT-process are illustrated with quotes from the expert panel meetings.
Development of the final model
We incorporated the ten ideas to improve the profiles and the variables that distinguish between the profiles in the pre-existing concept, paying close attention to the results of the thematic analysis of the transcripts. These amendments are described in Table 2.
The highest ranked idea from the expert panel meetings was that reflectiveness and adaptability are two distinct distinguishing variables. This prompted us to devise a two-dimensional model of four profiles distinguished by two variables (see Fig. 3). The pre-existing profile no reliability is seen by our participants as normal behaviour, reflecting that unprofessional behaviour can accidentally happen. It is important that the student acknowledges the unprofessional behaviour, and demonstrates that he or she can learn from it. This profile is thus described as accidental behaviour in the final model. The pre-existing profile no reliability, no insight has been divided into two separate profiles. On the one hand, there is behaviour that indicates a student’s insight without the possibility to adapt, described in the final model as struggling behaviour. On the other hand, there is behaviour that shows improvement despite lacking insight, described in the final model as gaming-the-system behaviour. The pre-existing profile no reliability, no insight and no adaptability, describing a student displaying unprofessional behaviour without showing reflectiveness or adaptability over time, has not been changed. In the final model this profile is described as disavowing behaviour.
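The two-by-two structure of the final model can be made concrete with a small lookup, offered here purely as a reading aid. The names `Profile` and `classify` are hypothetical, and treating reflectiveness and adaptability as yes/no values is a simplifying assumption: participants stressed that profiles are dynamic and can only be ascertained by observing a student over time.

```python
from enum import Enum


class Profile(Enum):
    ACCIDENTAL = "accidental behaviour"
    STRUGGLING = "struggling behaviour"
    GAMING_THE_SYSTEM = "gaming-the-system behaviour"
    DISAVOWING = "disavowing behaviour"


# Keys are (reflectiveness, adaptability); binary values are an assumption
# made for illustration, not a feature of the model itself.
_PROFILE_BY_VARIABLES = {
    (True, True): Profile.ACCIDENTAL,          # acknowledges and learns
    (True, False): Profile.STRUGGLING,         # insight, but cannot adapt
    (False, True): Profile.GAMING_THE_SYSTEM,  # improvement without insight
    (False, False): Profile.DISAVOWING,        # neither insight nor change
}


def classify(reflectiveness: bool, adaptability: bool) -> Profile:
    """Return the profile for one observation of the two variables."""
    return _PROFILE_BY_VARIABLES[(reflectiveness, adaptability)]


# Example: a student who improves outwardly but shows no insight.
example = classify(reflectiveness=False, adaptability=True)
```

A single call describes one observation at one moment; since students can move between profiles, any practical use would need repeated observations across contexts.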
In the expert panel meetings attention has been given to causes for unprofessional behaviour. These ideas were among the lower ranked ideas to improve the pre-existing concept. The revised model does not depict these causes, as they can be equally relevant for any of the profiles.
Validation of the results by member checking
After establishing all results, a draft of the results section of the manuscript was sent to the 31 participants of the study. They were asked to individually review the draft, and to pay particular attention to Fig. 3. Three participants had left their institution and could no longer be reached. One participant was not able to review the manuscript due to time constraints. Twenty-five participants returned the email. All but one of them validated the revised model as depicted in Fig. 3. Twenty of them delivered additional remarks on the draft of the results section. These remarks concerned (1) the way of indicating that the profiles are dynamic, (2) the interdependency of reflectiveness and adaptability, and (3) the text of the results section. All remarks were discussed in the full research team. Based on this discussion we decided not to alter Fig. 3. However, we incorporated the experts’ remarks in the results and discussion sections of the manuscript.