Mental health is important for medical professionals (Brennan & Monson, 2014). However, medical residency in particular is a demanding phase that puts resident physicians’ mental health at risk. Resident physicians are exposed to many work stressors, including long working hours, excessive workload, documentation duties, great responsibility combined with restricted autonomy, scarcity of supervision, and a focus on economic aspects in medical decisions (Beerheide, 2017; Prins et al., 2007). Accordingly, diminished mental health is more prevalent in resident physicians than in the general population, including higher rates of burnout and depression (Dyrbye et al., 2014) as well as lower satisfaction with life (Tyssen et al., 2009). Consequently, there is a need to address physicians’ mental health during medical residency.

In general, there are two categories of interventions to improve resident physicians’ mental health: (1) organization-focused interventions and (2) individual-focused interventions (Panagioti et al., 2017). Regarding individual-focused interventions, approaches that promote physician self-care are especially promising (Kuhn & Flanagan, 2017). The World Health Organization emphasizes self-care as a key strategy for health promotion and disease prevention (WHO, 2009), and self-care has been described as a professional imperative for physicians (Kuhn & Flanagan, 2017). Self-care can be described as an appreciative and loving attitude towards oneself, combined with respecting one’s needs and the active contribution towards one’s own well-being (Dahl, 2019). Specific self-care interventions and behaviors have scarcely been investigated by empirical research. However, mindfulness is closely linked to self-care and has been described as an important strategy in promoting self-care (Heidenreich & Michalak, 2020). Without mindful self-awareness, self-care behavior would not be possible (Dahl, 2019). Accordingly, it has been argued that mindfulness is the basis for and a catalyst to self-care (McGarrigle & Walsh, 2011). Mindfulness can be defined as present-moment awareness that arises from intentionally directing one’s attention to the here and now with a non-judgmental and open attitude (Kabat-Zinn, 2003). The practice of mindfulness can be learned in structured programs. The literature demonstrates beneficial effects of such mindfulness-based programs (MBPs) on the mental health of physicians, including reduced perceived stress and burnout (Fendel et al., 2021b) and increased well-being (Scheepers et al., 2020). However, high-quality research on MBPs for resident physicians is scarce. As of yet, there have been only a few randomized controlled trials, and their results are inconclusive (Ireland et al., 2017; Lebares et al., 2019; Verweij et al., 2018).
There are several challenges to the effectiveness of MBPs for resident physicians. First, the culture of medicine promotes self-neglecting attitudes such as prioritizing work over personal needs and seeing needs as weakness (Irving et al., 2009; Wallace et al., 2009). At the same time, risk-increasing personality traits such as perfectionism and workaholism are prevalent in physicians (Wallace et al., 2009). These personality traits combined with the medical culture might lead resident physicians to functionalize an MBP as a means of increasing stress resistance and performance. Consequently, an MBP for resident physicians should focus on improving self-care and mental health, rather than emphasizing self-optimizing aspects such as stress reduction or efficiency since such an orientation might in turn raise pressure on the physicians. Second, it might be difficult to transfer the skills that have been learned in the quiet environment of the mindfulness course into the stressful work context of the clinic. A review of MBPs for physicians concluded that feasibility challenges of integrating mindfulness into work at the clinic should be addressed by MBPs for physicians (Scheepers et al., 2020). Third, lack of time is a major barrier to practicing mindfulness for resident physicians (Taylor et al., 2016). Accordingly, there is a need for MBPs that are tailored to resident physicians’ needs and that take these challenges into account (Lases et al., 2016; Scheepers et al., 2020).

Most previous studies investigating the effects of MBPs on resident physicians’ mental health have focused on psychopathology (i.e., mental health problems and disorders) or have included only a few aspects of positive mental health as a secondary outcome (Goldhagen et al., 2015; Ireland et al., 2017; Lebares et al., 2019; Verweij et al., 2018). Yet, good mental health is more than the absence of symptoms. For instance, Antonovsky’s salutogenic model of health can be seen as a paradigm shift from defining health as the absence of disease and a focus on factors that lead to disease to a focus on factors that promote health (Lindstrom & Eriksson, 2006). The World Health Organization (WHO) defines mental health as “a state of well-being in which an individual realizes his or her own abilities, can cope with the normal stresses of life, can work productively, and is able to make a contribution to his or her community” (WHO, 2004, p. 12). In this definition, well-being is key to mental health. It comprises both feeling well and satisfied with life (i.e., hedonic well-being) and positive functioning in life (i.e., eudaimonic well-being) (Keyes, 2005; Ryff, 2014). In the light of the literature that has evidenced resident physicians’ high degree of psychopathology (e.g., Dyrbye et al., 2014), it may sound like a contradiction to talk about their positive mental health. However, two large-scale population studies have shown that psychopathology and positive mental health can co-occur (Bergsma et al., 2011; Keyes, 2005). Accordingly, the two-continua model defines positive mental health and psychopathology as distinct dimensions (Keyes, 2005). This warrants an independent focus on positive mental health in exploring the effects of MBPs in resident physicians.

The aim of the present study was to assess the effects of an MBP that has been tailored to the needs of resident physicians in a longitudinal randomized controlled trial. Specifically, the aim was to assess and document the effects of the MBP on a wide range of dimensions of positive mental health in order to identify susceptible categories and relevant aspects.



To be included in the study, resident physicians needed to (1) be employed as a resident physician at a hospital, (2) have a work contract with a minimum employment of 40%, (3) be younger than 45 years, (4) have sufficient German language skills (because the MBP and the course book were in German), and (5) have given written informed consent. Recruitment ended as scheduled when the deadline for recruitment was reached. The final assessment took place in May 2020. Figure 1 displays the flow of participants. A total of 147 resident physicians from at least 25 hospitals were included in the final analysis. Table 1 gives an overview of the baseline characteristics of the participants. There were no differences in baseline characteristics between the two groups with the exception of implicit self-esteem, t(134) = −3.43, p < 0.001.

Fig. 1 Participant flow diagram

Table 1 Characteristics of participants at baseline

Assuming effect sizes of d = 0.45 (Khoury et al., 2015), a power of 80%, an alpha level of 0.05, and adding a customary dropout rate of 30% resulted in a planned sample of 178 participants. This initial sample size calculation was based on a simple group comparison, whereas the actual analysis was based on linear mixed modeling (LMM).
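The arithmetic behind such a calculation can be sketched as follows. This is an illustrative sketch only, not the authors' procedure: it uses a normal approximation for a two-group comparison, and it assumes a one-sided test and inflation of the sample by 1 / (1 − dropout). Neither assumption is stated in the text; they are chosen here because, under them, the calculation arrives at the reported 178 participants. The function names `n_per_group` and `planned_sample` are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80, one_sided=True):
    """Participants per group for a simple two-group comparison (normal approximation)."""
    tail = alpha if one_sided else alpha / 2
    z_alpha = NormalDist().inv_cdf(1 - tail)   # critical value of the test
    z_power = NormalDist().inv_cdf(power)      # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


def planned_sample(d, dropout=0.30, **kwargs):
    """Total sample, inflated so that the expected completers still reach the target."""
    completers = 2 * n_per_group(d, **kwargs)
    return ceil(completers / (1 - dropout))


print(n_per_group(0.45))                      # 62 per group (one-sided assumption)
print(planned_sample(0.45))                   # 178 in total
print(planned_sample(0.45, one_sided=False))  # 223 under a two-sided test
```

An exact calculation based on the t distribution can differ from this approximation by about one participant per group.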


This study was a longitudinal randomized controlled trial (RCT) that compared a tailored MBP to an active control group. Assessments took place at four time points: at baseline (t0, 0 months), post-intervention (t1, 2 months), after the maintenance phase (t2, 6 months), and at follow-up (t3, 12 months). The study was pre-registered (trial registration number DRKS00014015), and a study protocol with a more detailed description of the methods of the RCT has been published (Aeschbach et al., 2020). Resident physicians were recruited by e-mail, flyers, a study Web page, short presentations at the respective institutions, and through head physicians between March 2018 and May 2019 in the south-west part of Germany. Resident physicians enrolled by e-mail or a study Web page. In the first on-site appointment, participants received further information about the study and gave their informed consent. Thereafter, participants took part in a baseline assessment and were allocated to either the intervention or the control group using the minimization procedure described below. Participants were able to receive points for continuous medical education (CME) for participating in the MBP, but no other incentives were given.


The intervention group took part in an MBP consisting of eight weekly sessions of 2.25 h each and one full-day silent retreat. During the maintenance phase, the topics and exercises of the program were refreshed through three monthly booster sessions. The program is based on Mindfulness-Based Stress Reduction (MBSR; Kabat-Zinn, 1990) and has been tailored to resident physicians’ needs. To tailor the program, we conducted an a priori needs assessment including extensive literature search and interviews with resident physicians. A detailed description of the program and the process of tailoring has been published elsewhere (Aeschbach et al., 2020). Furthermore, the whole program including information for trainers and participants is written down in a 204-page-long German-language manual. In the following, we only address the most important adaptations: (1) To prevent mindfulness from being functionalized as a means of increasing stress resistance and instead to promote mindfulness as a practice of self-care, we supplemented mindfulness with a focus on Muße. The term Muße is established in the German language but there is no direct translation into English. Muße can be described as a state of mind in which individuals feel at ease, content, fulfilled, and free from pressure, especially time pressure and the pressure to perform (Gouda et al., 2016). As such, Muße is incompatible with performance optimization and emphasizes mindfulness as a practice of self-care. (2) The MBP includes specific informal mindfulness practices that can be integrated into the work context (e.g., feeling one’s hands while disinfecting them or taking a deep breath before entering a patient’s room). (3) Resident-specific topics were addressed such as how to mindfully deal with specific stressors of medical residency or mindful communication with patients (see Supplementary Material for an overview of the topics covered in the program). (4) To take into account the high level of education of the target group, theoretical information and scientific background to program elements were provided.

On the structural level, every course session included five elements: (1) theoretical input; (2) formal mindfulness practice (sitting meditation, body scan, walking meditation); (3) reflection on one’s practice and experience; (4) linking mindfulness to daily work-life; and (5) home assignments. The courses were organized into groups of up to 14 participants. To ensure quality of the intervention, the MBP was delivered by three physicians who were certified MBSR and MBCT trainers and who had extensive teaching experience. The three trainers were involved in the 18-month process of tailoring the MBP to resident physicians’ needs and developing a detailed curriculum guide.

The control group received a course book containing the same theoretical information that participants in the intervention group received. However, the course book did not contain practical exercises. This choice of control group allowed us to compare learning from experience as well as from description in the intervention group with learning from description only in the control group. Experience-based learning and description-based learning are two modes of learning (Hertwig et al., 2018). This distinction is relevant in the context of mindfulness since it has been suggested that personal experience is necessary to fully grasp mindfulness (Schmidt, 2014). Therefore, we assumed that the intervention group, which had a focus on experience-based learning, would benefit more than the control group, which engaged in description-based learning only.


Participants were allocated to either the intervention group or the control group using a minimization procedure that stratified for gender and level of burnout. Burnout was assessed with the Copenhagen Burnout Inventory (values 0–37.4 = low, 37.5–62.4 = medium, and 62.5–100 = high burnout); it was the primary outcome as defined in the study protocol (Aeschbach et al., 2020) and is dealt with in another article (Fendel et al., 2021a). Within the minimization, we applied a base probability of allocation of 0.8 and variance as distance measure. Minimization was carried out with the software QMinim (Saghaei & Saghaei, 2011). To ensure allocation concealment, a researcher who had no contact with participants conducted the minimization procedure and group allocation after participants had completed all baseline assessments. Thereafter, the results of the group allocation were communicated to participants by email by a different researcher who was not involved in the allocation procedure.
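The general logic of such a minimization step can be sketched as below. This is a simplified illustration and not the QMinim implementation: the equal weighting of the two factors, the tie-breaking rule, and all function names (`burnout_level`, `imbalance_if_assigned`, `allocate`) are assumptions made for the example. Only the burnout cut-offs, the base probability of 0.8, and the use of variance as the distance measure come from the text.

```python
import random
from statistics import pvariance

GROUPS = ("intervention", "control")


def burnout_level(score):
    """Map a Copenhagen Burnout Inventory score (0-100) to the strata used for allocation."""
    if score < 37.5:
        return "low"
    if score < 62.5:
        return "medium"
    return "high"


def imbalance_if_assigned(counts, factors, group):
    """Sum over factors of the variance of group counts if the newcomer joined `group`."""
    total = 0.0
    for factor, level in factors.items():
        hypothetical = [
            counts[g].get((factor, level), 0) + (1 if g == group else 0)
            for g in GROUPS
        ]
        total += pvariance(hypothetical)
    return total


def allocate(counts, factors, base_probability=0.8, rng=random):
    """Assign one participant and update `counts` in place."""
    scores = {g: imbalance_if_assigned(counts, factors, g) for g in GROUPS}
    if scores[GROUPS[0]] == scores[GROUPS[1]]:
        chosen = rng.choice(GROUPS)  # no preferred group: simple randomization
    else:
        preferred = min(scores, key=scores.get)
        other = GROUPS[1] if preferred == GROUPS[0] else GROUPS[0]
        # The group that minimizes imbalance is chosen with the base probability.
        chosen = preferred if rng.random() < base_probability else other
    for factor, level in factors.items():
        key = (factor, level)
        counts[chosen][key] = counts[chosen].get(key, 0) + 1
    return chosen


# Usage: three women already in the intervention group, one in the control group.
counts = {"intervention": {("gender", "female"): 3}, "control": {("gender", "female"): 1}}
newcomer = {"gender": "female", "burnout": burnout_level(50.0)}
print(allocate(counts, newcomer))  # "control" with probability 0.8
```

Keeping a random element (here 0.8 rather than 1.0) means the next allocation cannot be predicted with certainty from the current group composition.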


Due to the nature of the study, neither the participants nor the teachers running the mindfulness courses could be blinded. To minimize bias, the self-report questionnaires as well as the implicit assessments were administered online using Unipark ESF Survey (QuestBack, 2019) and SoSci Survey (Leiner, 2016). The on-site assessment for GAS was conducted at t0 prior to randomization and at t1 by assessors who were blinded to group allocation. This was ensured through a clear division of tasks between researchers who collected data in person and researchers who managed and analyzed the data.


Since our broad objective of assessing various aspects of positive mental health cannot be collapsed into a single measure, we chose an array of measures, none of which was defined as the primary outcome. Thus, we explored a range of positive mental health variables using self-report measures, Goal Attainment Scaling (GAS), and indirect measures. All measurements were administered at all four time points except for the GAS, which was administered only at t0 and t1. Within the RCT, we also collected data on psychopathology, which included the primary outcome of the complete trial (i.e., self-reported burnout), and on mouse and keyboard usage. The data on psychopathology are reported elsewhere (Fendel et al., 2021a).

Self-report Measures

Where available, we used the validated German versions of the questionnaires. All Cronbach’s alphas and McDonald’s omegas stem from the current sample at t0.
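For readers unfamiliar with the coefficient, Cronbach’s alpha can be computed from item-level data as sketched below. The data layout (one list of item scores per respondent) and the function name `cronbach_alpha` are assumptions for illustration; the example values are made up and are not study data. McDonald’s omega requires a fitted factor model and is not shown.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of the sum score)
    """
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up ratings of three respondents on a two-item scale.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 2))  # 0.67
```

Reverse-scored items need to be recoded before the calculation, as described for the Self-Compassion Scale below.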

Positive affect was assessed using the valence dimension of the Self-Assessment Manikin (SAM; Bradley & Lang, 1994), consisting of five abstract faces depicting a semantic differential ranging from negative (coded as − 2) to positive (coded as 2).

Satisfaction with life was assessed by the validated German version of the one-item scale L1 (L1; Beierlein et al., 2015) that asks participants to rate how satisfied they are with their life using an 11-point numeric rating scale from 0 (not satisfied at all) to 10 (absolutely satisfied).

Self-compassion was assessed by the 12-item version of the Self-Compassion Scale that has been validated in German (SCS; Hupfeld & Ruffieux, 2011; Raes et al., 2011). The items were rated on a Likert scale from 1 (almost never) to 5 (almost always). To generate the overall sum score, the reversely scored items (i.e., self-judgment, isolation, over-identification) were recoded (Cronbach’s α = 0.85 and McDonald’s ω = 0.86).

Self-esteem was assessed by the Single-Item Self-Esteem Scale (SISE; Robins et al., 2001) that was scored on a scale from 1 (not very true for me) to 5 (very true for me). The scale has been translated into German by the authors.

Flourishing was measured using the validated German version of the Flourishing Scale (FS; Diener et al., 2009; Esch et al., 2013). Participants were required to rate eight statements about positive functioning in daily life on a Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree) (Cronbach’s α = 0.88 and McDonald’s ω = 0.88).

Feeling loved by others and by oneself was assessed by the Feeling Loved Questionnaire (Barrett et al., 2019). The two dimensions consisted of a Yes/No question (i.e., each “yes” answer was scored with 100 and each “no” answer with 0) and an item consisting of a visual analogue scale with the anchors 0 (not at all) and 100 (very, very much). The scale was translated into German by the authors (Cronbach’s α = 0.72 and McDonald’s ω = 0.75).

Self-attributed mindfulness was measured using the Freiburg Mindfulness Inventory that has been validated in German (FMI; Walach et al., 2006). The FMI consists of 14 statements about aspects of mindfulness experienced in everyday life that were scored on a Likert scale from 1 (rarely) to 4 (almost always) (Cronbach’s α = 0.83 and McDonald’s ω = 0.84).

Muße was measured using a self-constructed but validated questionnaire (Heger, 2015). The questionnaire required participants to indicate how often they experienced Muße during the past 4 weeks on a Likert scale ranging from never (coded as 1) to very often (coded as 6). The questionnaire has moderate convergent validity with life satisfaction (i.e., as measured with the Questions of Life Satisfaction Modules, FLZM, r = 0.40; p < 0.001) and strong divergent validity with perceived stress (i.e., as measured with the Perceived Stress Questionnaire, PSQ, r =  − 0.55; p < 0.001).

Subjective perception of time pressure was assessed by an abridged version of the German version of the Time Perception Questionnaire (Jokic et al., 2018). To generate an abridged version, one of the authors of the original questionnaire selected five items, which he considered to be the most relevant. The statements were rated on a 5-point Likert scale from strongly disagree to strongly agree (Cronbach’s α = 0.83 and McDonald’s ω = 0.83).

Self-efficacy was assessed by the validated German version of the Self-Efficacy Questionnaire (ASKU; Beierlein et al., 2014). This scale consisted of three items that are rated on a Likert scale from 1 (not true at all) to 5 (very true) (Cronbach’s α = 0.85 and McDonald’s ω = 0.86).

Positive functioning at the workplace was assessed by the Thriving at Work Scale that has been validated in German (TS; Hildenbrand et al., 2018; Porath et al., 2012). The TS consists of ten statements that were scored on a Likert scale from 1 (not true at all) to 5 (absolutely true) (Cronbach’s α = 0.87 and McDonald’s ω = 0.87).

Job satisfaction was assessed by the Faces-Scale (Kunin, 1955). This scale consisted of seven abstract faces representing a semantic differential from negative (frowning face, coded as 1) to positive (happy face, coded as 7).

Goal Attainment Scaling

GAS was used to evaluate the intervention based on participants’ self-set goals. At baseline, participants defined the main goal they aspired to through their study participation. After the intervention at t1, participants rated to what extent they had achieved that goal using a 5-point scale from − 2 (much less than expected) to 2 (much more than expected) with 0 indicating as expected.

Indirect Measures

Indirect tests assess automatic processes that may lie outside a person’s control and stem from long-term personal experiences and associative processes (Greenwald & Banaji, 1995). As such, indirect tests provide a different level of investigating the effects of the MBP and are less susceptible to desirable responding than self-report measures (Uhlmann et al., 2012).

Attitude towards the job was assessed using the Single Category Implicit Association Test (SC-IAT; Karpinski & Steinman, 2006). In the SC-IAT, individual words appeared in the center of the screen for 1500 ms and participants in our study categorized them into “good,” “bad,” or “job as physician.” In one block, the categories “positive” and “job as physician” were paired and shared one key, and in a second block the categories “negative” and “job as physician” were paired. Reaction times were recorded. The SC-IAT consisted of two blocks of 96 trials each, of which 24 trials per block were practice trials. The data processing procedure for all implicit measures is described in the Supplementary Material. Values higher than 0 indicate a positive attitude towards the job as physician, whereas values less than 0 indicate a negative attitude. The split-third reliability was r = 0.78.

Additionally, attitude towards the job was assessed using the Affect Misattribution Procedure (AMP; Payne et al., 2005). In the AMP, pictorial primes referring to the job as physician (e.g., a picture of a doctor, a stethoscope, an interaction with a patient) or neutral primes (i.e., squares of different shades of grey) were displayed for 150 ms. Each prime was followed by a neutral Chinese pictogram, which was displayed for 350 ms. Participants were instructed to rate how visually pleasant the Chinese pictograms were in comparison to the average Chinese pictogram. Answer options included “more pleasant than the average Chinese pictogram” or “less pleasant.” The AMP consisted of eight practice trials followed by two blocks of 40 trials each. Participants were given the cover story that the test assessed response behavior in the presence of distracting images (i.e., the primes referring to the job as physician and the squares of different shades of gray) and that they should focus on the Chinese pictograms. The AMP is based on the assumption that affective states influence the evaluation of ambiguous or undefined objects such as Chinese pictograms. Thus, it is assumed that the affect that is induced by seeing each of the pictorial primes is being misattributed to the subsequent Chinese pictogram (Payne et al., 2005). The final AMP score indicates the percentage of Chinese pictograms that have been rated pleasant following a prime referring to the job at the clinic. Internal consistency determined by the split-half method was r = 0.79.

Emotional state was assessed with an abridged version of the Implicit Positive and Negative Affect Test (IPANAT; Quirin et al., 2009). Participants rated to what extent three artificial nonsense words (SAFME, TUNBA, BELNI) expressed positive adjectives (happy, cheerful, and energetic) and negative adjectives (tense, inhibited, and helpless) (Quirin et al., 2009). Ratings were given on a 4-point Likert scale. The IPANAT yields two scores, one that is calculated from the positive adjectives and represents positive affect, and one that is calculated from the negative adjectives and represents negative affect (IPANAT-PA: Cronbach’s α = 0.49 and McDonald’s ω = 0.16; IPANAT-NA: Cronbach’s α = 0.58 and McDonald’s ω = 0.40).

Additionally, emotional state was measured with the Word Fragment Test (WFT). The German WFT used in this study was created by the authors and was based on related English versions of the WFT (Johnson et al., 2010; Rusting & Larsen, 1998). In the WFT, participants completed a series of word fragments that can be completed into either positive or negative words. For instance, “ange_” can be completed into “angel” or “anger.” Participants were given 3 min to complete as many words as possible. Scores range from 0 to 1 with higher values expressing a higher ratio of positive words and therefore representing more positive affect. Reliability was assessed by computing the test–retest reliabilities between t0 and the respective time point: t1 r = 0.58, t2 r = 0.57, t3 r = 0.62.

Self-esteem was assessed using the Name Liking Task (Gebauer et al., 2008). This task consists of one item asking participants to rate how much they like their full name on a scale from 0 (not at all) to 10 (very much). It is assumed that people with high self-esteem rate their name more positively compared to people with low self-esteem. The name-liking task has been translated into German by the authors.


We monitored participants’ course attendance, and at t3 participants rated their daily mindfulness practice.

Data Analyses

All analyses were performed on an intention-to-treat principle, that is, including all resident physicians who were randomized. The outcomes were analyzed using linear mixed modeling (LMM) including fixed effects for the group × time interaction, time, and group. Several models were compared based on model fit indices. To take inter-individual differences into account, we modeled a random intercept. We did not include a random slope since it did not increase model fit. Furthermore, we modeled an autoregressive residual covariance structure to take into account correlations that arise from repeated measures. The mindfulness courses were taught by three teachers and were structured into individual courses with a maximum of 14 participants each. Including the teacher variable as an additional level in the model did not improve model fit and was thus not included. Furthermore, we tested for homoscedasticity using Levene’s test and allowed for heterogeneous variances in cases where homoscedasticity was violated (see, e.g., Pinheiro & Bates, 2000). An overall model was calculated for the interaction of group × time that included all time points. Additionally, we looked at between-group differences from baseline to individual post time points. To this end, dummy-coded contrasts were calculated using the model estimates.
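One way to write down the model described above is given below. This is a sketch under stated assumptions rather than the authors' exact specification: time is treated as a categorical factor with baseline as the reference (consistent with the dummy-coded contrasts), the autoregressive structure is taken to be first-order, and heterogeneous variances are omitted. All symbols are introduced here for illustration.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 G_i
        + \sum_{k=1}^{3} \gamma_k T_{kj}
        + \sum_{k=1}^{3} \delta_k \, G_i T_{kj}
        + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{ij'}) = \sigma^2 \rho^{|j - j'|}.
\end{align}
```

Here $y_{ij}$ is the outcome of participant $i$ at time point $t_j$ ($j = 0, \dots, 3$), $G_i = 1$ for the intervention group and $0$ for the control group, $T_{kj} = 1$ if $j = k$ and $0$ otherwise, and $u_i$ is the random intercept. Under this coding, $\delta_k$ is the between-group difference in change from baseline to $t_k$, which corresponds to the group × time contrasts reported for the individual post time points.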

After assessing the pattern of missing values graphically as well as using statistical analysis, we determined data to be missing at random. In linear mixed modeling, missing data are dealt with by using maximum likelihood estimates. Moreover, we replaced outliers (i.e., a deviation of 3 SD or more from the mean) with the highest or lowest value excluding the outliers (i.e., winsorizing). As a result of this outlier correction, less than 1% of the data was modified. We calculated Cohen’s d as the adjusted mean difference in change from baseline between the two groups (i.e., the difference of differences) and divided it by the pooled standard deviation at baseline (Morris, 2008).
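The outlier rule and the effect size can be sketched as follows. This is an illustration and not the analysis code used in the study: the function names are hypothetical, the effect size is computed from raw group means rather than from model-adjusted estimates, and the small-sample bias correction discussed by Morris (2008) is left out.

```python
from math import sqrt
from statistics import mean, stdev


def winsorize_3sd(values, n_sd=3.0):
    """Replace values 3 SD or more from the mean with the nearest non-outlying value."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return list(values)
    inliers = [v for v in values if abs(v - m) < n_sd * s]
    lowest, highest = min(inliers), max(inliers)
    return [min(max(v, lowest), highest) for v in values]


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2))


def difference_of_differences_d(pre_t, post_t, pre_c, post_c):
    """Between-group difference in change from baseline, scaled by the pooled baseline SD."""
    change_t = mean(post_t) - mean(pre_t)
    change_c = mean(post_c) - mean(pre_c)
    return (change_t - change_c) / pooled_sd(pre_t, pre_c)


# Made-up example: one extreme score among 21 values is pulled back to 11.
scores = [10] * 10 + [11] * 10 + [100]
print(winsorize_3sd(scores)[-1])  # 11
```

Note that with very small samples no single value can lie 3 SD from the mean, so the rule only takes effect in samples of roughly a dozen values or more.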

To analyze the GAS, the scores were transformed to T-scores (M = 50, SD = 10) (Kiresuk & Sherman, 1968). To compare GAS between the intervention and control groups, we conducted a t-test. All analyses were carried out using R version 4.0.4.
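The T-score transformation can be illustrated with the general formula of Kiresuk and Sherman (1968). The sketch below assumes the conventional inter-goal correlation of 0.3 and equal weights; with a single goal per participant, as in this study, the formula reduces to T = 50 + 10 × rating. The function name `gas_t_score` is hypothetical, and the snippet is in Python for illustration only.

```python
from math import sqrt


def gas_t_score(attainment, weights=None, rho=0.3):
    """Goal Attainment Scaling T-score for ratings on the -2..2 scale.

    T = 50 + 10 * sum(w * x) / sqrt((1 - rho) * sum(w^2) + rho * (sum(w))^2)
    """
    if weights is None:
        weights = [1.0] * len(attainment)
    weighted_sum = sum(w * x for w, x in zip(weights, attainment))
    denominator = sqrt(
        (1 - rho) * sum(w * w for w in weights) + rho * sum(weights) ** 2
    )
    return 50 + 10 * weighted_sum / denominator


print(round(gas_t_score([1]), 1))   # 60.0: somewhat more than expected
print(round(gas_t_score([-2]), 1))  # 30.0: much less than expected
```

A rating of 0 (goal attained as expected) maps to T = 50, so group means above or below 50 indicate attainment above or below expectation.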

Since we did not have a primary outcome for the positive mental health part of the RCT, and explored the effectiveness of this novel mindfulness program on a range of positive mental health variables, we did not adjust the level of significance for multiple testing (see, e.g., Rothman, 1990). Hence, due to the ensuing higher risk of false-positive findings, any significant effect should be regarded as tentative and does not allow for confirmatory interpretation.


Adherence and Practice Time

Seventy-one participants (93%) attended four or more course sessions and were considered completers. Six participants attended fewer than four sessions. The mean attendance was 6.64 sessions (SD = 1.75). The mean attendance of the three booster sessions was 1.33 sessions (SD = 1.02). Fifty-nine participants provided data on practice time at t3. Of these, 56% reported not practicing any formal mindfulness exercises (i.e., body scan, sitting meditation), 36% reported practicing once a week, 5% reported practicing multiple times a week, and 3% reported practicing daily. Regarding informal mindfulness practice at the clinic, 30% indicated that they had not engaged in any informal exercises, 30% reported practicing informal exercises once a week, 31% several times a week, and 9% indicated practicing on a daily basis.

Self-report Measures

Table 2 displays means and standard deviations for the self-reported positive mental health variables by group across time points. The results of the linear mixed models are displayed in Table 3. From t0 to t1, there was a significant group × time interaction in terms of self-compassion, self-attributed mindfulness, flourishing, and Muße. From t0 to t2, there was a significant group × time interaction in terms of self-compassion, self-attributed mindfulness, Muße, and thriving at work. From t0 to t3, we found a significant group × time interaction in terms of self-attributed mindfulness and flourishing. An overview of within-group effects can be found in the Supplementary Material.

Table 2 Self-report measures: means and standard deviations across the four time points by group
Table 3 Self-report measures: between-group effect estimates of the time × group interaction

Indirect Measures

Table 4 gives the means and standard deviations of the indirect measures. Conducting linear mixed modeling, we found no group × time interaction for any of the indirect measures at any time point except for the IPANAT-NA, which showed a significantly greater decrease in indirect negative affect in the intervention group as compared to the control group from t0 to t1 and from t0 to t3 (Table 5). Results of within-group comparisons can be found in the Supplementary Material.

Table 4 Indirect measures: means and standard deviations by group across time points
Table 5 Indirect measures: between-group effect estimates of the time × group interaction

Indirect and explicit measures at baseline did not significantly correlate (IPANAT-PA/SAM: r = 0.05, p = 0.5, and WFT/SAM: r = 0.12, p = 0.15). Similarly, there was no correlation between implicit and explicit measures of job satisfaction (SC-IAT/Kunin Faces Scale: r =  − 0.03, p = 0.70 and AMP/Kunin Faces Scale: r =  − 0.11, p = 0.98). There was a small correlation for implicit and explicit measures of self-esteem (r = 0.20, p = 0.02).

Goal Attainment Scaling

Categorization of participants’ self-set goals resulted in the following categories (every goal was allocated to one category only): 31% of resident physicians had formulated goals related to equanimity, calmness, and relaxation; 23% wanted to be better able to cope with stress or difficult situations; 15% set goals on the subjects of work-life balance or clearer work-life separation; 11% had formulated goals relating to the self, including self-awareness, self-care, self-confidence, or self-compassion; and 11% set goals about awareness, mindfulness, and being present in the moment. Participants in the intervention group were significantly better at achieving their self-set goals (M = 55.68, SD = 7.94) than participants in the control group (M = 43.68, SD = 8.10), t(129) =  − 8.56, p < 0.0001, d = 1.50 (95% CI = 1.11–1.89).
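As a plausibility check, the reported effect size can be approximately recovered from the group means and standard deviations. The sketch below assumes equally sized groups, because the group sizes for this comparison are not given in this section; the function name `cohens_d_equal_n` is hypothetical.

```python
from math import sqrt


def cohens_d_equal_n(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d from summary statistics, pooling the SDs as if groups were equal in size."""
    pooled = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled


# GAS T-scores: intervention group versus control group.
d = cohens_d_equal_n(55.68, 7.94, 43.68, 8.10)
print(round(d, 2))  # 1.5
```

The result agrees with the reported d = 1.50 to two decimal places; small deviations would be expected if the groups differed in size.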


We investigated the effects of a tailored MBP on the positive mental health of resident physicians in a longitudinal randomized controlled trial. The core findings include the following: (1) There were small and medium improvements in the intervention group compared to the control group across various time points in terms of self-reported flourishing, self-compassion, self-attributed mindfulness, thriving at work, and Muße with effect sizes ranging between d = 0.25 and 0.88. (2) Goal Attainment Scaling revealed a greater achievement of participants’ self-set goals in the intervention group compared to the control group. This effect was large (d = 1.50). (3) There were no effects on any of the indirect measures with the exception of negative affect, where we found a larger decrease in negative affect in the intervention group compared to the control group.

Our results are by and large in accordance with previous studies on resident physicians. This includes the finding of a small effect on self-compassion (Verweij et al., 2018), a medium effect on self-attributed mindfulness (whereas Verweij et al., 2018, reported a small effect), and no effect on job satisfaction (Lases et al., 2016). The other variables on positive mental health had not yet been investigated in controlled studies with resident physicians. Most of our effect sizes were small to medium, and therefore smaller than the effect sizes reported for positive mental health in a meta-analysis by Eberth and Sedlmeier (2012), which summarizes studies that compared participants of MBPs to an inactive control group (i.e., r = 0.37 for well-being), and the effect size reported in a meta-analysis on mindfulness interventions at the workplace by Bartlett et al. (2019), which summarized RCTs that compared participants of MBPs to an active or inactive control group (Hedges’ g = 0.46 for well-being). Several explanations are possible for this discrepancy. First, we conducted an RCT with an active control group, while the meta-analyses also included non-randomized studies (Eberth & Sedlmeier, 2012) or inactive control groups (Bartlett et al., 2019; Eberth & Sedlmeier, 2012). Second, MBPs might be less effective with resident physicians than in the general population. That is, participation in an MBP might not be enough to compensate for the effects of the high job demands experienced by resident physicians.

Overall, we found positive between-group effects on some variables of positive mental health and failed to find effects on others. Several interpretations of this overarching outcome are possible: On the one hand, one might argue that these effects are small given the high time expenditure of taking part in such an MBP. On the other hand, one might argue that these results demonstrate that this MBP is an effective approach to improve some aspects of resident physicians’ positive mental health. An increase in self-compassion indicates that resident physicians adopted a kinder and more understanding attitude towards themselves. This increase in self-compassion combined with the finding of increased mindfulness may have a positive effect on resident physicians’ awareness of their own needs and encourage them to engage in self-care more often. Such a relationship was also apparent in our accompanying qualitative study (Aeschbach et al., 2021). As such, a mindfulness intervention may help to counter the deleterious effects of the medical workplace culture, which often promotes self-neglecting attitudes (Wallace et al., 2009). Furthermore, resident physicians perceived more positive functioning in both their life in general (i.e., flourishing) and at work (i.e., thriving at work at T2). Additionally, the increase in perceived Muße indicates that participants in the intervention group felt at ease, fulfilled, and free from pressure more often. Moreover, the subjective value of the MBP seems to be high. First, our results suggest that participating in this tailored MBP enabled participants to engage in informal mindfulness practices at work. Second, we found positive effects of the MBP on the self-set goals that participants hoped to achieve by taking part in the MBP.

The trajectories across time points differed among variables. For instance, the between-group comparison showed improvements in self-compassion, self-attributed mindfulness, flourishing, and Muße, as well as a larger decrease in indirectly measured negative affect, right after the 8-week program. Conversely, for thriving at work, we found a between-group effect after 6 months but no effect immediately after the 8-week course. A possible explanation is that changes in inner states, including self-compassion, mindfulness, Muße, and affect, are achieved faster through an MBP, whereas changes in behavior such as thriving at work (i.e., positive functioning) might need more time. Most effects decreased over time. Therefore, future research could investigate longer MBPs and MBPs that include more maintenance sessions.

Moreover, we failed to find effects on several outcomes. A possible explanation is that the MBP focuses on individual factors. However, for improvements on certain positive mental health outcomes, organizational changes may be necessary. For instance, important antecedents to job satisfaction include job characteristics such as autonomy, climate, perceived support, and perceived fairness (Schleicher et al., 2011). These organizational factors are not improved through participation in an MBP. Indeed, it has been suggested that mindfulness may increase awareness of one’s circumstances at work, and if these circumstances are adverse, an MBP may not improve job satisfaction (Brooker et al., 2013).

In terms of indirect measures, we found no effects of the MBP compared to the control condition, except for negative affect, which decreased more in the intervention group than in the control group. These findings are in accordance with our findings from the corresponding explicit measures. However, the results of corresponding indirect and explicit measures were uncorrelated. In general, evidence for indirect tests as a measure of mental constructs is weak (Van Dessel et al., in press). Accordingly, indirect measures are increasingly being criticized for their poor reliability and validity, and are considered to tap into different processes than explicit measures (Van Dessel et al., in press). This may explain the lack of correlation between the indirect and explicit measures in this study.

Limitations and Future Research

The present study has various limitations. First, we were unable to determine how this tailored MBP compares to standard MBSR programs since we did not include a third group that received a standard MBSR course. Second, we did not control for whether the control group read and engaged with the reading material that was sent to them by e-mail. Third, participants were a self-selected sample. Yet, this contributed to the ecological validity of this study by assessing the effectiveness of the MBP in those resident physicians who seek to attend such a program. Fourth, there is a risk of method bias due to assessing several constructs using self-report measures within the same study. Specifically, covariance between constructs may have resulted from using the same method of measurement rather than from the constructs themselves. These spurious, method-based covariances could be due to response tendencies that are applied across several measures, similarities between items that entice participants to give similar responses, order effects, or social desirability (Podsakoff et al., 2012). Accordingly, the results of these measures should be interpreted with caution. Fifth, we used two questionnaires and one indirect measure that have not previously been validated in a German population (i.e., the SISE, the feeling loved questionnaire, the WFT, and the abridged version of the IPANAT). Additionally, the IPANAT showed low internal consistency. Therefore, the results of these instruments should be interpreted with caution. Sixth, we did not control for multiple testing. Therefore, there is an inflated risk of false positives. Accordingly, our results should be regarded as exploratory and do not allow for confirmatory interpretation. Future research (e.g., studies that may be informed by our findings) is needed to confirm our results.
Seventh, for some variables, we found significant within-group improvements in both groups, indicating an improvement over time independent of group allocation. We were unable to determine whether this effect was due to the control intervention being effective or alternative explanations such as resident physicians’ increased professional experience, maturation, parallel external events, or demand effects. Future studies should therefore compare this tailored MBP to a waitlist control condition. Eighth, engagement in formal mindfulness practice was fairly low (56% did not practice at all, 36% once a week, 5% multiple times a week, and only 3% daily). In contrast, engagement in informal mindfulness practice was higher (30% did not practice any informal exercises, 30% once a week, 31% several times a week, and 9% daily). Ninth, researcher allegiance towards MBPs is associated with larger effects (Goldberg & Tucker, 2020). Allegiance among researchers in the study at hand may have led to an overestimation of the positive health effects of the MBP. Last, we did not conduct any formal assessment of treatment fidelity. However, quality of the MBP was ensured through certified MBSR and MBCT teachers with extensive teaching experience as well as a detailed curriculum guide for teaching the MBP.

In sum, the results suggest that this tailored MBP can promote some aspects of resident physicians’ mental health and may provide an important resource during medical residency. However, due to the exploratory nature of this study, these results need to be confirmed in future research. Future studies should also assess the hypothesis that such a tailored program is more effective than generic MBPs and should also determine which effects are unique to the tailored program and which are not. Furthermore, our results suggest that some effects were maintained 1 year after the beginning of the intervention. Future studies should investigate to what extent booster sessions can contribute to maintaining the effects of MBPs. In addition, we investigated a broad range of positive mental health variables and treated them as largely independent of each other, ignoring that the effects of the MBP on these variables might be related. For instance, we found an increase in self-compassion and mindfulness right after the 2-month intervention, whereas we found an effect on thriving at work only after 6 months. A hypothesis to be tested in future research is that the effects of the MBP on thriving at work were mediated by an increase in self-compassion and mindfulness. Such relationships among outcome variables should be investigated.