INTRODUCTION

Professional burnout is characterized by high levels of emotional exhaustion, cynical attitudes, and a diminished sense of personal accomplishment at work.1 A number of reviews have highlighted the problematic nature of burnout for a variety of healthcare professions, including medical residents,2 nurses,3 and mental health workers.4,5 Recent changes in healthcare delivery have also raised concerns that provider burnout may worsen if increased patient care and administrative demands outpace resources.6 Indeed, a recent national study found significant increases in burnout among physicians compared to the general population.7

Burnout is associated with a number of problems, not only for individual providers, but also for their employer organizations, patients, and the healthcare system as a whole. Workers with burnout often experience physical health problems (e.g., insomnia, headaches, poor overall health), relationship problems, reduced job satisfaction, and increased mental health problems (e.g., depression, anxiety, substance abuse).8–15 Burnout has also been linked to poorer organizational functioning, including excessive employee absences, tardiness, frequent breaks, reduced job commitment, and in some studies, poor job performance and increased turnover.11,13,16,17

Burnout can impact healthcare quality and safety in a number of ways. According to the job demands-resources model of burnout,18–21 job demands (e.g., interacting with patients with intensive service needs, balancing competing priorities) require effort over time and can result in costs to the healthcare provider (e.g., emotional exhaustion), particularly when resources are low. As providers become exhausted, they withdraw emotional energy from work, leading to depersonalization. This conservation of resources20 can also lead to providers spending less time with patients, and potentially becoming more directive than collaborative and patient-centered. Furthermore, burnout has been associated with cognitive impairments, including attention deficits,22 which can lead to errors. Williams and colleagues23 describe a cyclical model whereby burnout negatively affects the quality of the patient encounter, leading to dissatisfied patients, poor adherence, and worse health outcomes, which can cause additional provider burnout.

While links between burnout and quality of care have been theorized since the term burnout was introduced,24 research empirically linking burnout to quality of care has varied widely across healthcare specialties and types of quality domains (e.g., patient satisfaction, self-reported quality, errors). Some studies have reviewed aspects of these relationships, but none has attempted a comprehensive, quantitative review across disciplines and domains. For example, the relationship between burnout and job performance was summarized,25 but this meta-analysis included only 16 studies, was not restricted to healthcare settings, and included limited measures of performance. Within healthcare, Lee and colleagues26 studied correlates of burnout, but the sample was restricted to physicians, and few studies assessed quality. Another review27 took a broader approach, including a variety of healthcare professionals, but restricted studies to hospital settings, and because it was a narrative review, the authors were unable to quantify relationships or assess other contributing variables.

The objective of the current study was to systematically review and quantify empirical studies linking healthcare provider burnout to quality and safety in order to better understand the magnitude and consistency of these relationships. We explored potential moderators to examine whether the relationships would vary as a function of the aspect of burnout or quality being studied. Given the importance of multidisciplinary teams in patient-centered care,28,29 we included a variety of healthcare providers, and explored possible differences among provider types (e.g., nurses vs. physicians) and settings (inpatient vs. outpatient). We hypothesized that there would be negative relationships between each aspect of burnout (emotional exhaustion, depersonalization, and reduced personal accomplishment) and quality and safety. We further hypothesized that the effects would be largest for provider perceptions, compared to more objective indicators of quality (e.g., observation or medical records). Other relationships were considered exploratory.

METHODS

Data Sources and Searches

We followed guidelines provided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA).30 A systematic literature search of Ovid MEDLINE, PsycINFO, Web of Science, CINAHL, and ProQuest Dissertations & Theses was conducted to identify all studies involving health professionals, burnout, and quality of care through March 2015. The full electronic search string used for PsycINFO included the following: ((DE “occupational stress”) AND (SU “burnout”)) AND ((DE “quality of care”) OR (DE “quality of services”) OR (DE “satisfaction”) OR (DE “client satisfaction”) OR (DE “safety”) OR (DE “perceived quality”)). Similar search strategies were used for the other databases. All search strategies and the coding protocol are available from the authors.

Study Selection

We included published and unpublished studies of any design (e.g., cross-sectional surveys, intervention studies), as long as empirical data were used to assess the relationship between burnout and quality (including patient satisfaction) and/or safety. If these variables were assessed but the bivariate association was not reported, the authors were contacted to gather additional data for analyses. Attempts were made to contact 63 authors, of whom 21 provided usable data, 6 responded that data did not meet inclusion criteria or could not be obtained, 3 could not be located, and 33 did not respond. Although review articles were not included in our analyses, we examined reference sections of review articles to identify primary studies for inclusion.

We retained articles that specifically examined burnout. The Maslach31 three-dimensional scale of burnout (emotional exhaustion, depersonalization or cynicism, and reduced personal accomplishment) was used most often, although any study measuring at least one dimension of burnout or a global burnout score was included. We focused on healthcare providers and excluded studies of burnout in other occupations (e.g., education, probation officers, vocational rehabilitation). We categorized quality of care along two dimensions: perceived quality (rating scales or items reflecting provider’s perception, patient satisfaction) and safety (perceived safety, adverse events, “near misses,” medical errors). Included studies are briefly summarized in Table 1 of the supplemental online material.

Data Extraction and Quality Assessment

Articles were coded independently by a pair of coders (from a group of six coders comprising a clinical psychologist and five doctoral students). To maintain consistency and ensure reliability, coders met to review and come to consensus for each independent sample. We extracted information on burnout type and measure(s) used, quality and safety indicators, provider type (nurses, physicians, interdisciplinary), setting (outpatient, inpatient, or mixed inpatient/outpatient), and country (coded by region: North America, South America, Europe, Asia, Australia, Africa). Where available, we coded provider characteristics (age, gender, experience/length of time in the field) and patient characteristics (age, gender). We extracted information on potential methods-related moderators including study year, unit of analysis (individual, dyad, service unit, hospital/organization), and quality or safety data source (provider, patient, observer, medical records).

We rated the quality of each study to account for bias in individual studies (see Table 2 of the supplemental online material). Because quality rubrics commonly recommended for meta-analyses32,33 include items not relevant for correlational designs (e.g., blinding, allocation of intervention), we created items based on common sources of bias in observational studies.34,35 We tested and refined the initial rating system on several studies before rating the full sample. Two raters independently coded each study; disagreements were resolved through discussion. Measures of central tendency highlighted the presence of eight items as a potentially valuable cutpoint (mean = 8.12, median and mode = 8). Following other meta-analyses that examined subgroups based on quality ratings,36–38 we used quality rating as a moderator, examining the effect sizes for those with high quality (8 or above) compared to effect sizes of studies scoring below 8.

Data Synthesis and Analysis

We extracted effect size information at the level of burnout and quality (or safety) relationship. All associations were first converted into Pearson’s correlations; Fisher’s Z-transformation was conducted to adjust for the non-normal distribution of Pearson’s r. When a study reported multiple measures of the same construct, we averaged the effect sizes and weighted them by sample size in order to maintain statistical independence.39 We calculated an overall relationship, with one effect size per independent sample, to describe the relationships between burnout and perceived quality and safety. We conducted separate meta-analyses to examine the relationships aggregated at the level of predictor (burnout type) and aggregated at the level of the quality indicator (perceived quality and safety). We conducted moderator analyses for perceived quality and safety.
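As an illustration of the transformation step described above, the following is a minimal Python sketch, not the authors' actual procedure. The function names are hypothetical, and it assumes that within-sample averaging is done on the Fisher z scale, which the text does not specify.

```python
import math


def fisher_z(r):
    """Fisher's z-transformation of a Pearson correlation r (|r| < 1)."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Back-transform a Fisher z value to a Pearson correlation."""
    return math.tanh(z)


def fisher_z_variance(n):
    """Approximate sampling variance of Fisher's z for a sample of size n (n > 3)."""
    return 1.0 / (n - 3)


def combine_within_sample(correlations, sample_sizes):
    """Sample-size-weighted mean of several correlations from one sample.

    Assumption: the average is taken on the z scale and then back-transformed,
    so that each independent sample contributes a single effect size.
    """
    total_n = sum(sample_sizes)
    mean_z = sum(n * fisher_z(r) for r, n in zip(correlations, sample_sizes)) / total_n
    return inverse_fisher_z(mean_z)
```

For example, `combine_within_sample([-0.20, -0.30], [120, 80])` would return one correlation for a hypothetical study that reported two measures of the same construct.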

We used a random effects model to calculate the mean effect sizes using version 2 of the Comprehensive Meta-Analysis (CMA) software.40 At the aggregate level, Z-scores and confidence intervals were examined to determine the statistical significance of each association. The strength of the mean effect sizes was interpreted in light of Cohen’s41 recommendation for correlations, where 0.10 is small, 0.30 is medium, and 0.50 is large. We conducted one-study-removed sensitivity analyses to determine whether any single sample unduly influenced the results (indicating a potential outlier); because the point estimate of the mean effect size did not change substantially upon removal of any study, we performed the remainder of the analyses with the full sample.
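The pooling and sensitivity steps can be sketched in Python as follows. This is an illustrative sketch only: it assumes the DerSimonian-Laird method-of-moments estimator for the between-study variance, a common choice that the text does not confirm was used by the software, and the function names are hypothetical.

```python
import math


def random_effects_pool(correlations, sample_sizes):
    """Pool correlations under a random effects model on the Fisher z scale.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method of moments.
    """
    z = [math.atanh(r) for r in correlations]
    v = [1.0 / (n - 3) for n in sample_sizes]  # within-study variances
    w = [1.0 / vi for vi in v]                 # fixed-effect weights
    sum_w = sum(w)
    fixed_mean = sum(wi * zi for wi, zi in zip(w, z)) / sum_w
    q = sum(wi * (zi - fixed_mean) ** 2 for wi, zi in zip(w, z))
    df = len(z) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (vi + tau2) for vi in v]   # random effects weights
    mean_z = sum(wi * zi for wi, zi in zip(w_star, z)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    lower, upper = mean_z - 1.96 * se, mean_z + 1.96 * se
    return {
        "r": math.tanh(mean_z),
        "ci": (math.tanh(lower), math.tanh(upper)),
        "z_test": mean_z / se,
        "tau2": tau2,
    }


def leave_one_out(correlations, sample_sizes):
    """One-study-removed sensitivity analysis: pooled r with each study omitted in turn."""
    results = []
    for i in range(len(correlations)):
        rs = correlations[:i] + correlations[i + 1:]
        ns = sample_sizes[:i] + sample_sizes[i + 1:]
        results.append(random_effects_pool(rs, ns)["r"])
    return results
```

If the values returned by `leave_one_out` stay close to the full-sample estimate, no single sample dominates the pooled result.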

We examined heterogeneity with the Q-statistic and the I² index; a significant Q-statistic informs whether moderation may be present, and the I² index informs the extent of the heterogeneity, ranging from 0 to 100 %, with higher values indicating greater heterogeneity.42–44 Although I² is of value in determining the need for moderation analyses, it does not speak to the source of heterogeneity or dispersion of effects.45 We used I² values of 25 % or more as a cutoff to examine the presence of moderators, as this suggests that between-study variability in effect size is greater than expected by chance.43 To document dispersion of effects, we report 95 % confidence intervals for each effect size.
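A minimal Python sketch of the two heterogeneity indices follows, assuming effects are analyzed on the Fisher z scale with inverse-variance weights of n − 3. The function name and the example values are hypothetical.

```python
import math


def heterogeneity(correlations, sample_sizes):
    """Return Cochran's Q, its degrees of freedom, and I^2 (in percent)."""
    z = [math.atanh(r) for r in correlations]
    w = [n - 3 for n in sample_sizes]  # inverse of the Fisher z variance 1/(n - 3)
    mean_z = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - mean_z) ** 2 for wi, zi in zip(w, z))
    df = len(z) - 1
    # I^2 is the share of total variability beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical example: two samples with clearly different correlations.
q, df, i2 = heterogeneity([-0.10, -0.50], [103, 103])
needs_moderator_analysis = i2 >= 25.0  # the 25 % cutoff described above
```

Q is compared against a chi-square distribution with k − 1 degrees of freedom, where k is the number of samples.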

We tested study-level moderators for both quality and safety, including year, type of report, provider type and setting, region, and quality of study. Additional moderators for burnout and perceived quality included burnout type, quality source, and unit of analysis. Additional moderator analysis for safety compared perceived safety (e.g., questionnaires) versus events (e.g., reported adverse events, near misses). For categorical moderators, we used an analysis of variance (ANOVA) analog. To test continuous moderator variables, we conducted random effects meta-regressions using unrestricted maximum likelihood estimation. Because meta-regressions use list-wise deletion, each moderator was examined independently to maximize the number of studies included in the analysis. Continuous moderators were considered significant if beta weights were significant and I² decreased. We interpreted statistical tests at p < 0.05. All moderator analyses were conducted in version 2 of CMA.40
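The ANOVA analog for a categorical moderator can be sketched as a between-groups Q statistic. The Python below is a simplified illustration with hypothetical function names; it assumes a between-study variance `tau2` supplied by the caller (0 gives a fixed-effect version) and does not reproduce the exact mixed-effects routine of any particular software.

```python
import math


def subgroup_q_between(groups, tau2=0.0):
    """ANOVA-analog test for a categorical moderator.

    groups maps a subgroup label to a list of (r, n) pairs.
    Returns Q-between, its degrees of freedom, and the pooled r per subgroup.
    """
    summaries = {}
    for label, studies in groups.items():
        weights = [1.0 / (1.0 / (n - 3) + tau2) for _, n in studies]
        zs = [math.atanh(r) for r, _ in studies]
        w_sum = sum(weights)
        mean_z = sum(w * z for w, z in zip(weights, zs)) / w_sum
        summaries[label] = (mean_z, w_sum)
    total_w = sum(w for _, w in summaries.values())
    grand_mean = sum(m * w for m, w in summaries.values()) / total_w
    q_between = sum(w * (m - grand_mean) ** 2 for m, w in summaries.values())
    df = len(summaries) - 1
    pooled_r = {label: math.tanh(m) for label, (m, _) in summaries.items()}
    return q_between, df, pooled_r


def chi2_p_value_df1(q):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))
```

With two subgroups the test has one degree of freedom, so `chi2_p_value_df1` applies; more subgroups require a general chi-square survival function.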

Finally, we assessed the potential influence of publication bias by examining funnel plots and testing for asymmetry using Egger’s46 regression approach. Although Egger’s test may be prone to bias in low-power situations, our sample size was well beyond the recommended minimum of ten samples.47 In addition, Failsafe N was not appropriate because of the high level of study heterogeneity and the random effects model used.39
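Egger's regression approach can be sketched in Python as an ordinary least squares fit of each standardized effect (effect divided by its standard error) on precision (1/SE), with the intercept serving as the asymmetry estimate. The function name is hypothetical, and the sketch returns only the t statistic; a p-value would require a t distribution with k − 2 degrees of freedom.

```python
import math


def egger_test(effects, standard_errors):
    """Egger's regression test for funnel plot asymmetry.

    Returns the intercept, its standard error, and the t statistic (df = k - 2).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("Egger's regression needs at least three studies")
    y = [e / s for e, s in zip(effects, standard_errors)]  # standardized effects
    x = [1.0 / s for s in standard_errors]                 # precision
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = residual_ss / (k - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / k + mean_x ** 2 / sxx))
    t_statistic = intercept / se_intercept
    return intercept, se_intercept, t_statistic
```

An intercept close to zero relative to its standard error is consistent with a symmetric funnel plot.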

RESULTS

The search yielded 1674 citations, resulting in 102 studies with 82 unique samples of healthcare providers (see Fig. 1 for PRISMA flow diagram30). A summary description of the included studies appears in Table 1, and a detailed table showing each study including individual effect sizes is presented in Table 1 of the supplemental online material. A total of 210,669 healthcare providers were included, from 32 countries on 6 continents. The majority of studies measured at least the emotional exhaustion component of burnout; about half studied depersonalization and reduced personal accomplishment. Some (19.5 %) included a total or global measure of burnout. Most assessed perceived quality, and about half assessed safety. Most studies took place in a setting that included both inpatient and outpatient care. Nurses were the most frequent target population.

Fig. 1 PRISMA flow chart of article identification and exclusion.

Table 1 Summary of Study Characteristics Across Independent Samples (k = 82)

The meta-analysis of the relationship between burnout and perceived quality including 63 independent samples resulted in a significant negative relationship (r = −0.26), with 95 % CI ranging from −0.29 to −0.23. The Q-statistic of the overall effect was significant, with a large amount of heterogeneity (I² = 93 %). A second meta-analysis was conducted between burnout and safety, including 40 independent samples, which also yielded a significant negative relationship (r = −0.23) and 95 % CI ranging from −0.28 to −0.17. The burnout–safety relationship also demonstrated high levels of heterogeneity (I² = 97 %). Forest plots for the meta-analyses of perceived quality and safety are shown in Tables 3 and 4 of the supplemental online material.

Moderator analyses for the relationship between burnout and perceived quality are shown in Table 2. Three categorical moderators were significant: type of burnout, unit of analysis, and source of quality rating. The relationship between burnout and perceived quality was strongest for emotional exhaustion (r = −0.27) or overall burnout (r = −0.25), whereas effects for depersonalization (r = −0.21) and reduced personal accomplishment (r = −0.20) were weaker but still significant. Effect sizes were significantly stronger when examining individuals (r = −0.27) compared to service units (r = −0.12). Effect sizes were significantly stronger for provider report (r = −0.28) compared with patient reports of quality (i.e., patient satisfaction, r = −0.17). No continuous moderators were significant predictors.

Table 2 Moderator Analyses of the Relationship Between Professional Burnout and Quality

Moderators for the safety meta-analyses are shown in Table 3. None of the continuous moderators were significant, but three categorical moderators were identified: safety indicator, study population, and the country in which the study was conducted. Though both types of safety indicators were statistically significant, burnout had a stronger relationship with perceptions of safety (r = −0.28) than events (r = −0.16). In terms of discipline, the strongest relationship was for nurses (r = −0.27), followed by interdisciplinary samples (r = −0.24) and physicians (r = −0.15). Regarding country, effect sizes were stronger in studies from Europe (r = −0.36) than in those from North America (r = −0.18). Discipline, however, was confounded with location. Most European studies with safety outcomes consisted of nursing samples (78 %), whereas only 26 % of North American studies focused on nursing samples. Follow-up analyses to parse the overlap in these results indicated that in North American studies, nurses (r = −0.22) and physicians (r = −0.15) did not differ in relationships between burnout and safety (Q = 0.59, df = 1, p = 0.44). There were not enough studies of physicians within the European sample to conduct a parallel analysis.

Table 3 Moderator Analyses of the Relationship Between Professional Burnout and Safety

We created funnel plots displaying overall study effect as a function of precision (calculated as 1/SE; see Figures 1 and 2 in supplemental online material). Egger’s test of the intercept was not significant for either perceived quality (t = 1.13, p = 0.26) or safety (t = 0.72, p = 0.48). As both meta-analyses had substantially more than the recommended minimum number of ten samples,47,48 the lack of significance suggests that publication bias did not influence these findings.

DISCUSSION

Summarizing all available empirical literature on healthcare provider burnout, we found small to medium-sized relationships between burnout and both decreased quality of care and decreased safety. For perceived quality, the effect size of r = −0.26 translates into approximately 7 % of variance accounted for by provider burnout. For safety, the effect size of r = −0.23 translates into approximately 5 % of the variance in safety being attributable to provider burnout. These relationships were robust to potential publication bias and ratings of study rigor, which increases confidence in the findings. Given the increasing rates of burnout, particularly among physicians,7 these findings could have important ramifications.
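The variance figures above follow from squaring the correlation coefficient. A short Python check, with a hypothetical function name, makes the arithmetic explicit:

```python
def variance_explained_percent(r):
    """Share of variance accounted for by a correlation r, in percent (100 * r^2)."""
    return 100.0 * r ** 2


quality_share = variance_explained_percent(-0.26)  # about 6.8, reported as roughly 7 %
safety_share = variance_explained_percent(-0.23)   # about 5.3, reported as roughly 5 %
```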

Of the burnout components, emotional exhaustion had the strongest relationship with quality, followed by depersonalization and reduced personal accomplishment. These findings parallel a meta-analysis of burnout and objective job performance in other service industries.25 Together, these findings suggest that emotional exhaustion may be the most critical element of burnout to address. Earlier conceptualizations of burnout also posit a primary role for emotional exhaustion—that it may be the driving element that leads to other aspects of burnout.49

In terms of quality, burnout had a medium-sized relationship with lower perceived (provider-reported) quality and a weaker, but still significant, relationship with reduced patient satisfaction. This is important, given the increasing role of satisfaction as a benchmark for performance evaluations. Medicare payments are now tied to performance outcomes, which began with, and still include, patient satisfaction.50 In addition, satisfaction has important implications for retaining patients in competitive markets and for engaging patients in good self-care strategies, both of which could impact patient outcomes over the long term. For example, relationship variables such as working alliance51 and perceptions of patient-centered care52,53 have important associations with improved patient outcomes. In an era of increasing attention to collaborative care, burnout may be interfering with the provision of optimal care.

The relationship between burnout and safety risk is of particular concern. Similar to quality, the relationships were stronger for self-reported perceptions of safety (all provider-reported), yet a significant negative correlation (−0.16) was still found for safety events (i.e., errors and adverse events), some of which included external sources of data such as medical records or observer ratings. Although the effect size translates into about 3 % of the variance, these findings represent an instance where small statistical effects are still noteworthy;54 provider burnout may contribute in part to real-world outcomes for patients, putting them at higher risk of an error or adverse event. In addition to the obvious implications for patient health and well-being, a greater number of errors also leads to greater liability for healthcare organizations.

Moderator analyses for safety suggest that the impact of burnout might be greater for nurses than for physicians or mixed provider samples. This finding is consistent with research showing that nursing care is more predictive of patient ratings of quality than physician care.55 As noted by Leiter and colleagues56 in one of the first studies linking nurse burnout with patient satisfaction, nurses have more direct patient contact and perform more of the daily care activities than other healthcare professionals such as physicians, which may have a more central impact on patients. Alternatively, there may be a reporting bias, as nurses have been shown to report more minor errors and safety events than physicians.57 However, subsequent analyses suggest that geographic location might be the driving factor. European samples, which were predominantly based on studies of nurses, showed stronger relationships between burnout and safety than those conducted in North America. It is possible that systemic differences across countries in how healthcare services are provided or experienced could affect the relationship between provider burnout and safety. For example, Aiken and colleagues58 found wide variability in nurse- and patient-reported safety and quality across 12 countries, and those samples were included in our study. More research is needed to clarify which explanation is most accurate.

As a meta-analysis, this study is limited by some of the same constraints as the primary studies examined. Studies were predominantly cross-sectional. Thus, while the consistent negative relationships between burnout and both quality and safety indicators suggest that provider burnout may be negatively impacting healthcare, we cannot rule out the possibility that poor quality itself could contribute to increased burnout, or that other factors (e.g., negative organizational culture) may be causing both burnout and poor quality. Cross-sectional studies may also limit our ability to detect delayed consequences of burnout, such as a gradual erosion of quality or safety over time, or the impact of burnout on staff turnover that subsequently impacts quality and safety in clinical settings. Studies were predominantly based on self-reports, and few used objective indicators of quality. At the same time, the findings regarding patient satisfaction and some studies of safety events did include external sources of data. The fact that these were still significantly related to burnout suggests that the relationships go beyond common method variance. Furthermore, the majority of studies were at the level of the individual provider, despite the fact that team-based care models are the norm. More research is needed on the impact of burnout within a team. For example, it is possible that the impact of one team member with burnout on the total delivery of care would not be enough to affect outcomes; alternatively, burnout may be subject to a “contagion effect,”59 in which one or a few burned-out team members could negatively impact the whole team. Finally, meta-analyses were characterized by high levels of heterogeneity; in most moderator analyses, I² values remained above 25 %, indicating likely additional moderators. Future studies should continue to investigate relationships between burnout and healthcare quality and safety, including potential moderators not examined here. Similarly, the overall effects accounted for less than 10 % of the variance in quality and safety. Clearly, provider burnout is not the primary factor in these outcomes. However, our findings do suggest a consistent role, perhaps as an important contextual variable, in addition to other predictors of quality and safety (e.g., organizational policies, staffing ratios, communication).

Despite limitations, this study has several notable strengths. It is the first to systematically, quantitatively analyze the links between healthcare provider burnout and healthcare quality and safety across disciplines. We aggregated findings from a large number of independent samples; analyses suggested that publication bias and study rigor were not important factors in our findings. In addition, the studies spanned 6 continents and 32 countries, enhancing the generalizability of the results. Taken together, our findings suggest that healthcare provider burnout is an important area for future research and a target for intervention that could have benefits for providers, patients, and the healthcare system at large.