Introduction

Clinical social workers who work with traumatized populations often must share the emotional burden of their clients in order to facilitate the healing process (Herman, 1992). In so doing they bear witness to damaging and cruel past events, coming face to face with the reality of terrible and traumatic events in the world (Kassam-Adams, 1999; Pearlman & Saakvitne, 1995). Confronting such realities may shatter clinicians’ assumptions of invulnerability, the world as meaningful, and positive self-perceptions (Janoff-Bulman, 1989). Effective trauma treatment often involves assisting the individual to work through the traumatic experience, a process in which the client repeatedly recalls memories of the event in order to bring closure to the experience. Through this process, the clinician is often repeatedly exposed to traumatic events through vivid imagery. It is now widely recognized that indirect exposure to trauma involves an inherent risk of significant emotional, cognitive, and behavioral changes in the clinician. This phenomenon, variously referred to as vicarious traumatization (VT), secondary traumatic stress (STS), and compassion fatigue (CF), is now viewed as an occupational hazard of clinical work that addresses psychological trauma, a view supported by a growing body of empirical research (e.g., Adams, Boscarino, & Figley, 2006; Bride, 2004, 2007).

First explicated by McCann and Pearlman (1990), vicarious traumatization refers to a transformation in cognitive schemas and belief systems resulting from empathic engagement with clients’ traumatic experiences that may result in “significant disruptions in one’s sense of meaning, connection, identity, and world view, as well as in one’s affect tolerance, psychological needs, beliefs about self and other, interpersonal relationships, and sensory memory” (Pearlman & Saakvitne, 1995, p. 151). Figley (1995) defines secondary traumatic stress as “the natural and consequent behaviors and emotions resulting from knowing about a traumatizing event experienced by a significant other—the stress resulting from helping or wanting to help a traumatized or suffering person” (p. 7). With the exception that the traumatic exposure is indirect, secondary traumatic stress is nearly identical to posttraumatic stress, including symptoms associated with posttraumatic stress disorder (PTSD) such as intrusive imagery, avoidance, hyperarousal, distressing emotions, cognitive changes, and functional impairment (Figley, 1995, 2002; Figley & Roop, 2006). Figley (1995, 1996, 2002) has also introduced compassion fatigue as a more “user-friendly” term to describe the phenomenon of secondary traumatic stress. Though there are some distinctions between vicarious traumatization and secondary traumatic stress/compassion fatigue in terms of theoretical origin and symptom foci, all three terms refer to the negative impact of clinical work with traumatized clients. As such, henceforth we will use the term compassion fatigue to refer to the negative effects on clinicians due to work with traumatized clients, except where a cited author has a clear preference in terminology.

Although there is evidence that some clinicians experience compassion fatigue, many do not, and many of those who do remain committed to the work. It follows that there is some positive aspect of trauma work that sustains and nourishes clinicians. Many clinicians are motivated by a sense of satisfaction derived from helping others, an experience labeled compassion satisfaction (Stamm, 2002). The relationship between compassion fatigue and compassion satisfaction is not yet clear, although Stamm (2002) has suggested that there is a balance between the two experiences. That is, a clinician may experience both compassion fatigue and compassion satisfaction simultaneously, though as compassion fatigue increases it may overwhelm the clinician’s ability to experience compassion satisfaction.

In addition to reducing the satisfaction of clinical work, the effects of compassion fatigue are believed to impair the ability of clinicians to effectively help those seeking their services (Figley, 1996, 1999). Clinical social workers experiencing compassion fatigue are believed to be at higher risk of making poor professional judgments, such as misdiagnosis, poor treatment planning, or abuse of clients, than those not experiencing compassion fatigue (Rudolph, Stamm, & Stamm, 1997). The first step in preventing or ameliorating compassion fatigue is to recognize the signs and symptoms of its emergence. By continually monitoring themselves for the presence of symptoms, clinical social workers may be able to prevent the more negative aspects of compassion fatigue.

Several standardized measurement instruments have been developed specifically to assess different aspects of compassion fatigue, and several other standardized instruments that were developed to measure direct trauma reactions have been used in the study of compassion fatigue. The purpose of this article is to provide an overview of these instruments so that clinical social workers may make informed decisions regarding how to monitor their own experiences of compassion fatigue.

Compassion Fatigue Instruments

The following section reviews information on the various measurement instruments that have been utilized to assess compassion fatigue. Each instrument that has been included in this review would be appropriate for use by clinicians who provide services to a wide variety of traumatized clients, regardless of the trauma experienced (e.g., physical or sexual victimization, violent crime, community violence, disaster, combat, or terrorism).

Compassion Fatigue Self Test (CFST), Compassion Satisfaction and Fatigue Test (CSFT), and Compassion Fatigue Scale (CFS)

The CFST (Figley, 1995) with its different versions is perhaps the most commonly used instrument to measure compassion fatigue, in part because it was one of the first measures developed specifically for this purpose. The CFST was originally developed based on clinical experience and designed to assess both compassion fatigue and job burnout. The original CFST has 40 items divided between two subscales: compassion fatigue (23 items) and burnout (17 items). The instructions ask respondents to indicate how frequently (1 = rarely/never, 2 = at times, 3 = not sure, 4 = often, 5 = very often) a particular characteristic is true about themselves or their situation. On the compassion fatigue subscale, scores of 26 or below indicate extremely low risk, scores between 27 and 30 indicate low risk, scores between 31 and 35 indicate moderate risk, scores between 36 and 40 indicate high risk, and scores of 41 or more indicate extremely high risk of compassion fatigue (Figley, 1995). On the burnout subscale, scores of 36 or below indicate extremely low risk, scores between 37 and 50 indicate moderate risk, scores between 51 and 75 indicate high risk, and scores between 76 and 85 indicate extremely high risk of burnout. The process by which score ranges were derived is not found in the published literature. Reported internal consistency alphas range from .86 to .94 and factor analysis suggests one stable factor reflecting depressed mood in relationship to work accompanied by feelings of fatigue, disillusionment, and worthlessness (Figley, 1995; Figley & Stamm, 1996).
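The published score ranges for the original CFST amount to a simple lookup. The Python sketch below is illustrative only: the function names are hypothetical, the inputs are assumed to be subscale totals that have already been summed, and the valid ranges are inferred from the item counts and the 1 to 5 response format described above. As reported, the burnout ranges contain no "low risk" band, so the sketch does not invent one.

```python
def cfst_compassion_fatigue_risk(score: int) -> str:
    """Risk band for the 23-item compassion fatigue subscale (Figley, 1995)."""
    # Assumption: 23 items scored 1-5 give a possible range of 23-115.
    if not 23 <= score <= 115:
        raise ValueError("compassion fatigue total must be between 23 and 115")
    if score <= 26:
        return "extremely low"
    if score <= 30:
        return "low"
    if score <= 35:
        return "moderate"
    if score <= 40:
        return "high"
    return "extremely high"


def cfst_burnout_risk(score: int) -> str:
    """Risk band for the 17-item burnout subscale, as reported in the text."""
    # Assumption: 17 items scored 1-5 give a possible range of 17-85.
    if not 17 <= score <= 85:
        raise ValueError("burnout total must be between 17 and 85")
    if score <= 36:
        return "extremely low"
    if score <= 50:
        return "moderate"
    if score <= 75:
        return "high"
    return "extremely high"
```

Because the derivation of these ranges is not documented, any such lookup should be treated as a screening aid rather than a validated classification.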

Stamm and Figley (1996) more fully developed the CFST by adding a series of positively oriented questions paralleling the negative orientation of the compassion fatigue items, resulting in a 66-item instrument. The positively oriented items were intended to measure compassion satisfaction. Pilot work on this revised version of the CFST provided good evidence of reliability, with internal consistency alphas for the three subscales as follows (Table 1): compassion satisfaction (.87), burnout (.90), and compassion fatigue (.87) (Stamm, 2002). Continued development of this version of the CFST has resulted in a renamed instrument, the Professional Quality of Life Scale (ProQOL), which is more fully discussed below.

Table 1 Characteristics of compassion fatigue assessment instruments

Gentry, Baranowsky, and Dunning (2002) report using a different version of the CFST, which they call the Compassion Fatigue Scale – Revised (CFS-R). This version is composed of 30 items, 22 of which measure compassion fatigue and 8 of which measure burnout. Respondents use a 10-point scale to indicate how frequently each item is true for them, and a revised scoring scheme is applied. Gentry et al. (2002) did not report reliability or validity information. More recently, however, Adams et al. (2006) conducted a psychometric study of the CFS-R, which identified multiple underlying factors, calling into question the factor validity of the CFS-R. As such, they made data-driven refinements to the instrument, resulting in a revised instrument, which they refer to as the Compassion Fatigue-Short Scale (CF-Short Scale; Adams et al., 2006). The CF-Short Scale is a 13-item measure that is composed of an 8-item burnout subscale and a 5-item secondary trauma subscale. Internal consistency estimates were as follows: .90 for the Burnout subscale, .80 for the Secondary Trauma subscale, and .90 for the combined scale. In addition, Adams et al. (2006) present convincing evidence for factor, concurrent, and predictive validity of the CF-Short Scale.

Professional Quality of Life Scale (ProQOL)

As noted above, the ProQOL (Stamm, 2005) is a revision of Figley’s (1995) Compassion Fatigue Self Test and is composed of three discrete subscales. The first subscale measures compassion satisfaction, defined as the pleasure derived from being able to do one’s work (helping others) well. Higher scores on this subscale represent greater satisfaction related to one’s ability to be an effective caregiver. The second subscale measures burnout, or feelings of hopelessness and difficulties in dealing with work or in doing one’s job effectively. Higher scores on this subscale represent a greater risk for burnout. The third subscale measures compassion fatigue/secondary traumatic stress, with higher scores representing greater levels of compassion fatigue/secondary traumatic stress.

The ProQOL is structured as a 30-item self-report measure in which respondents are instructed to indicate how frequently each item was experienced in the previous 30 days. Each item is rated on a 6-point Likert scale (0 = never, 1 = rarely, 2 = a few times, 3 = somewhat often, 4 = often, and 5 = very often). Scoring requires summing the item responses for each 10-item subscale. A total of 5 items (1, 4, 15, 17, and 29) must be reverse scored prior to computing scores. The subscale scores cannot be combined to compute a total score. The most current scoring guidelines (Stamm, 2005) are based on a conservative quartile method whereby cut scores are based on the 75th percentile. As such, the guidelines suggest that a score of 33 or below on the compassion satisfaction scale may suggest job dissatisfaction. Guidelines for the burnout scale suggest that a score below 18 reflects positive feelings about one’s ability to be effective in one’s work, and scores above 27 may be cause for concern in that one may not feel effective. Regarding the compassion fatigue/secondary trauma scale, scores above 17 should be considered to reflect a potential problem in this domain.
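The scoring procedure just described (reverse score five items, then sum each 10-item subscale) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the official scoring routine: the function names and dictionary keys are hypothetical, and the assignment of items to subscales is passed in by the caller because it is not listed in this article and must be taken from the ProQOL manual.

```python
from __future__ import annotations

# Items that must be reverse scored, as listed in the scoring description.
REVERSE_SCORED_ITEMS = {1, 4, 15, 17, 29}
MAX_RESPONSE = 5  # responses run from 0 (never) to 5 (very often)


def score_proqol(responses: dict[int, int],
                 subscale_items: dict[str, list[int]]) -> dict[str, int]:
    """Sum each 10-item subscale after reverse scoring.

    responses: item number (1-30) -> response (0-5).
    subscale_items: subscale name -> its 10 item numbers (caller-supplied;
        the mapping is an assumption and must come from the manual).
    """
    if set(responses) != set(range(1, 31)):
        raise ValueError("expected responses for items 1 through 30")
    if any(not 0 <= value <= MAX_RESPONSE for value in responses.values()):
        raise ValueError("each response must be between 0 and 5")

    scores = {}
    for name, items in subscale_items.items():
        if len(items) != 10:
            raise ValueError(f"subscale {name!r} must have 10 items")
        # On a 0-5 scale, reverse scoring maps x to 5 - x.
        scores[name] = sum(
            MAX_RESPONSE - responses[i] if i in REVERSE_SCORED_ITEMS else responses[i]
            for i in items
        )
    return scores


def flag_proqol(scores: dict[str, int]) -> dict[str, bool]:
    """Apply the cut scores reported in the text (Stamm, 2005).

    Assumes the hypothetical keys 'compassion_satisfaction', 'burnout',
    and 'compassion_fatigue'.
    """
    return {
        "possible_job_dissatisfaction": scores["compassion_satisfaction"] <= 33,
        "burnout_concern": scores["burnout"] > 27,
        "compassion_fatigue_concern": scores["compassion_fatigue"] > 17,
    }
```

The sketch deliberately returns three separate subscale scores and no total, consistent with the guidance that the subscales cannot be combined.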

Internal consistency reliability estimates for the subscales are reported as .87 for the compassion satisfaction scale, .72 for the burnout scale, and .80 for the compassion fatigue/secondary trauma scale. Stamm (2005) reports that a multi-trait, multi-method approach to convergent and discriminant validity supports the discriminant validity of the ProQOL suggesting that the subscales measure different constructs. Stamm (2005) does not note whether convergent validity was supported. The data supporting the validity of the ProQOL have not as yet been published or made publicly available and therefore cannot be assessed. Factor validity studies have not been published.

Secondary Traumatic Stress Scale (STSS)

The STSS (Bride, Robinson, Yegidis, & Figley, 2004) was designed to assess the frequency of intrusion, avoidance, and arousal symptoms associated with indirect exposure to traumatic events through clinical work with traumatized populations. The STSS was developed consistent with Figley’s (1995, 1999) definition of secondary traumatic stress as a syndrome of symptoms nearly identical to those of posttraumatic stress disorder (PTSD). Each of the 17 items was designed to tap one of the DSM-IV-TR (APA, 2000) criteria for PTSD. Respondents are instructed to indicate how frequently each item was true for them in the past seven days using a five-point, Likert-type response format (1 = never, 2 = rarely, 3 = occasionally, 4 = often, and 5 = very often). The wording of instructions and the stems of stressor-specific items are designed such that the traumatic stressor is identified as clinical work with traumatized clients in order to minimize the possibility that respondents will endorse items based on an experience of direct traumatization. The STSS is composed of three subscales, referred to as Intrusion, Avoidance, and Arousal, that respectively correspond to the B, C, and D criteria for PTSD (APA, 2000).

Scoring of the STSS requires summing the scores on each item to obtain a total score. Scores for each subscale can also be obtained by summing only the items assigned to the respective subscale. No reverse scoring of items is required. Bride (2007) provides guidelines for interpreting STSS responses based on percentiles, whereby a total score at or below the 50th percentile (less than 28) is interpreted as little or no secondary traumatic stress, scores at the 51st to the 75th percentile (28–37) as mild, scores at the 76th to the 90th percentile (38–43) as moderate, scores at the 91st to the 95th percentile (44–48) as high, and scores above the 95th percentile (49 and above) as severe secondary traumatic stress. A second approach to interpreting STSS scores is to use 38 as a cutoff score, such that a score of 38 or above indicates that steps need to be taken to address secondary traumatic stress (Bride, 2007). Lastly, an algorithm approach to interpreting the STSS can be used to screen for the presence of PTSD due to secondary exposure (Bride, 2007). Using the algorithm approach, if an individual endorses at least one item (at 3 or above) on the Intrusion subscale, at least three items on the Avoidance subscale, and at least two items on the Arousal subscale, then that individual may be experiencing PTSD at a diagnostic level due to secondary traumatic stress. However, it is important to underline that the STSS is a screening measure and does not take the place of a thorough clinical interview. Internal consistency estimates for the STSS and its subscales are as follows: Total score = .93, Intrusion subscale = .80, Avoidance subscale = .87, and Arousal subscale = .83. The STSS has demonstrated construct validity through convergent, discriminant, and factorial analyses (Bride et al., 2004; Ting, Jacobson, Sanders, Bride, & Harrington, 2005).
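The three interpretive approaches described for the STSS (percentile bands, the cutoff of 38, and the symptom algorithm) can be illustrated with a short Python sketch. It is offered as an aid to understanding rather than as an authoritative implementation. The function names are hypothetical, the item-to-subscale assignment is supplied by the caller because it is not listed in this article, and the sketch assumes that the "3 or above" endorsement threshold applies to all three subscales.

```python
from __future__ import annotations


def stss_total(responses: dict[int, int]) -> int:
    """Sum the 17 item responses (each 1-5); no reverse scoring is needed."""
    if set(responses) != set(range(1, 18)):
        raise ValueError("expected responses for items 1 through 17")
    if any(not 1 <= value <= 5 for value in responses.values()):
        raise ValueError("each response must be between 1 and 5")
    return sum(responses.values())


def stss_severity(total: int) -> str:
    """Percentile-based interpretation reported by Bride (2007)."""
    if total < 28:
        return "little or no"
    if total <= 37:
        return "mild"
    if total <= 43:
        return "moderate"
    if total <= 48:
        return "high"
    return "severe"


def stss_needs_attention(total: int) -> bool:
    """Cutoff approach: a total of 38 or above suggests action is needed."""
    return total >= 38


def stss_algorithm_positive(responses: dict[int, int],
                            subscale_items: dict[str, list[int]],
                            endorsed_at: int = 3) -> bool:
    """Algorithm approach: at least 1 Intrusion, 3 Avoidance, and 2 Arousal
    items endorsed. Assumes one threshold (3 or above) for every subscale."""
    required = {"intrusion": 1, "avoidance": 3, "arousal": 2}
    for name, minimum in required.items():
        endorsed = sum(1 for item in subscale_items[name]
                       if responses[item] >= endorsed_at)
        if endorsed < minimum:
            return False
    return True
```

A positive result from any of these functions indicates only that follow-up is warranted; as noted above, the STSS does not replace a clinical interview.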

Impact of Event Scale (IES) and Impact of Event Scale-Revised (IES-R)

Although designed to measure directly, rather than secondarily, experienced trauma, both the IES and the IES-R have been used in studies of compassion fatigue in service providers. The IES (Horowitz, Wilner, & Alvarez, 1979) is the most widely used measure of traumatic stress symptomology (Weiss, 2004), having been designed to measure the experience of subjective distress related to a singular traumatic experience. The measure is composed of two scales: Intrusion and Avoidance. The Intrusion Scale is composed of seven items that assess unwanted thoughts and images, dreams, waves of feelings, and repetitive behavior that are related to the stressor. The Avoidance Scale is composed of eight items that assess blunted sensation, behavioral inhibition, and awareness of emotional numbness. Respondents indicate how often they experienced each symptom in the past week using a somewhat unusual 4-point scale on which 0 = not at all, 1 = rarely, 3 = sometimes, and 5 = often. A score of 26 on the combined Intrusion and Avoidance Scales has been suggested as a cut-off for clinically significant reactions (Horowitz et al., 1979). Across 18 published studies using the IES, unweighted averages for coefficient alpha were reported as .86 for intrusion and .82 for avoidance (Sundin & Horowitz, 2002). In addition, sufficient evidence of the construct, convergent, and clinical validity of the IES has been reported (Sundin & Horowitz, 2002). However, a common criticism of the IES is that it fails to measure a third aspect of traumatic stress symptomology, that of hyperarousal experiences. For this reason the IES-R may be preferable for assessing trauma symptoms.
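Because the IES response values are unevenly spaced (0, 1, 3, 5), a scoring routine should reject values such as 2 or 4 rather than silently accept them. The following Python sketch shows one way this might be done. The function name and return structure are hypothetical, the responses are assumed to be grouped by scale already, and treating totals at or above 26 as exceeding the cut-off is an assumption about how the suggested score of 26 is applied.

```python
from __future__ import annotations

VALID_RESPONSES = {0, 1, 3, 5}  # not at all, rarely, sometimes, often
CUTOFF = 26  # suggested cut-off on the combined scales (Horowitz et al., 1979)


def score_ies(intrusion: list[int], avoidance: list[int]) -> dict[str, int | bool]:
    """Score the original 15-item IES from responses grouped by scale.

    intrusion: the 7 Intrusion Scale responses.
    avoidance: the 8 Avoidance Scale responses.
    """
    if len(intrusion) != 7 or len(avoidance) != 8:
        raise ValueError("expected 7 intrusion and 8 avoidance responses")
    if any(value not in VALID_RESPONSES for value in intrusion + avoidance):
        raise ValueError("responses must be 0, 1, 3, or 5")

    intrusion_score = sum(intrusion)
    avoidance_score = sum(avoidance)
    total = intrusion_score + avoidance_score
    return {
        "intrusion": intrusion_score,
        "avoidance": avoidance_score,
        "total": total,
        # Assumption: a total at or above the cut-off is flagged.
        "at_or_above_cutoff": total >= CUTOFF,
    }
```

When the IES is used to screen for compassion fatigue, the respondent should anchor every response to clinical work with traumatized clients before the items are scored.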

The IES-R (Weiss, 2004; Weiss & Marmar, 1997) was developed to build upon the usefulness of the original IES by adding items that could track responses in the domain of hyperarousal. Seven items were added to the 15 items of the original IES: 6 items to tap the domain of hyperarousal and 1 to parallel the DSM-III-R diagnostic criteria for PTSD. Estimates of internal consistency are reported as: Intrusion = .89, Avoidance = .84, and Hyperarousal = .82. In addition, the IES-R has demonstrated good evidence of convergent and discriminant validity. It should be noted, however, that the IES and IES-R are designed to assess symptoms related to direct experiences of trauma. The instructions ask respondents to anchor their responses to a particular traumatic event. In utilizing these measures to assess compassion fatigue, it is important that respondents are clear that they are referring to their clinical work with traumatized clients as the trauma. Without this specification one may inadvertently be measuring other traumas that were experienced directly. Because the IES and IES-R were designed to measure the impact of traumatic events that were directly experienced, estimates of their reliability and validity are primarily from studies of directly traumatized individuals. As such, their reliability and validity in the measurement of compassion fatigue have not been fully established.

Trauma and Attachment Belief Scale (TABS)

The TABS (Pearlman, 2003), formerly known as the TSI Belief Scale (TSI-BS; Pearlman, 1996), is a measure based in constructivist self-development theory. The current 84-item TABS assesses disruptions in cognitive schemas reflecting the following five areas of psychological need: Control, Esteem, Intimacy, Safety, and Trust. Using Likert-scale scoring (1 = disagree strongly to 6 = agree strongly), the TABS yields a total score as well as ten subscales which measure each of the psychological need areas in relation to self and other: (1) Self-Safety, (2) Other-Safety, (3) Self-Trust, (4) Other-Trust, (5) Self-Esteem, (6) Other-Esteem, (7) Self-Intimacy, (8) Other-Intimacy, (9) Self-Control, and (10) Other-Control. The scale is designed to identify psychological themes in trauma material, as well as interpersonal and intrapersonal themes that are likely to emerge within the therapeutic process. The TABS was designed for use with individuals who have directly experienced traumatic events; however, it has also been used by researchers to assess the effects of vicarious traumatization. Higher scores indicate more disturbances of beliefs; however, guidelines for interpreting scores with a cut-point or score ranges have not been published.

Pearlman’s (1996) review of unpublished studies of the TSI-BS reported overall internal consistency reliability (Cronbach’s alpha) of .98, with subscale reliabilities ranging from .77 (other control) to .91 (self-esteem). However, other studies have produced lower subscale reliabilities, ranging in one study from .68 to .84 (Schauben & Frazier, 1995) and ranging from .62 to .83 in another study (Jenkins & Baird, 2002). The construct validity of the TABS was supported through tests of convergent and discriminant validity (Jenkins & Baird, 2002), although other investigators have found less support for convergent, discriminant, and factor validity of the TABS (Adams, Matto, & Harrington, 2001; Matto, Adams, & Harrington, 2000).

World Assumptions Scale (WAS)

The WAS (Janoff-Bulman, 1989) is a 32-item self-report scale designed to measure changes in cognitive schemas associated with trauma. This instrument was originally intended to assess changes in the worldview of individuals who had been directly traumatized; however, given that the concept of vicarious traumatization is at least partly rooted in Janoff-Bulman’s (1989) assumptive world theory, it follows that it is an appropriate measure for monitoring the cognitive distortions that may occur. The WAS contains three subscales that correspond to distinct worldview domains. Benevolence of the World consists of beliefs about the balance of good and misfortune in the world, as well as beliefs about benevolence among people. Meaningfulness of the World consists of beliefs about justice, controllability of outcomes, and the role of chance. Self as Worthy consists of beliefs about self-worth, the role of personal behavior in outcomes, and sense of personal luck. Respondents are asked to indicate their level of agreement with each item, from strongly disagree to strongly agree, using a 6-point Likert response scale. Subscale scores are computed by summing the items corresponding to each scale and a total score can be derived by summing the subscale scores. No reverse scoring is required. Guidelines for interpreting scores with a cut-point or score ranges have not been published. The WAS has demonstrated evidence of adequate internal consistency with alphas of .82 for benevolence of the world, .74 for meaningfulness of the world, and .77 for self as worthy (Janoff-Bulman, 1989). Janoff-Bulman (1989) reports good factorial validity and construct validity for the WAS.

Discussion

Clinicians and clinical supervisors should consider a number of factors prior to selecting a particular instrument to measure compassion fatigue. One consideration is the domain that one intends to assess. Each instrument reviewed above measures specific aspects of compassion fatigue and serves as an important screening tool. For example, the STSS specifically measures PTSD symptomology associated with clinical work with traumatized populations (Bride, 2007; Bride et al., 2004), while the TABS specifically measures disruptions in cognitive schemas in five areas of psychological need (Pearlman, 2003). Clinicians and supervisors should be clear about which aspects of compassion fatigue are most important to monitor in their case and use the instrument most likely to uncover potential compassion fatigue. Supervisors whose clinicians work with traumatized populations, for example, may find particular utility in the STSS. However, no single compassion fatigue measure assesses all aspects of the concept of compassion fatigue (e.g., trauma symptoms, cognitive distortions, general psychological distress, and burnout). As such, it is recommended that more than one measure be utilized by clinicians as well as within organizations in order to provide a fuller picture of an individual’s experience of compassion fatigue.

In addition, there are structural differences among the instruments that have importance for interpreting scores. One difference has to do with timeframe. Some measures ask respondents to report their symptoms/experiences in the past week (e.g., STSS, IES/IES-R), while others use a timeframe of 30 days (e.g., ProQOL), and still others do not specify a timeframe (e.g., CSFT, TABS, WAS). From a psychometric viewpoint, a shorter timeframe, such as one week, is more likely to assess current levels of compassion fatigue, whereas longer timeframes may reflect compassion fatigue experiences that are recent but not necessarily current. Again, clinicians should select an instrument with an appropriate timeframe to illuminate relevant aspects of compassion fatigue.

Care should be used in interpreting scores on any of the above measures. All of the instruments reviewed are most appropriate for the screening of compassion fatigue. They do not take the place of an extensive clinical assessment by a professional knowledgeable and experienced in the recognition of compassion fatigue. Further, because these measures are intended as screening instruments, many of the scoring guidelines are conservative. That is, they were developed to ensure that compassion fatigue is identified where appropriate and to minimize the possibility that compassion fatigue would not be identified in someone who is experiencing it (false negative). The trade-off with this approach is that it may inflate the probability of false positives, that is, scores that suggest high levels of compassion fatigue in individuals who are not in fact experiencing it.

In conclusion, this manuscript has provided a summary and review of the most commonly utilized instruments for measuring different aspects of compassion fatigue. The goal was to provide a resource for clinicians to assist in choosing an instrument to assess and monitor their own levels of compassion fatigue. Each instrument reviewed has varying levels of evidence regarding its psychometric properties and each is useful for specific purposes. As noted earlier, compassion fatigue is viewed as an occupational hazard of clinical work with traumatized clients. It is expected that most clinicians will at times experience symptoms of compassion fatigue, as these are normal reactions to trauma work. However, for some clinicians the experience of compassion fatigue may become so severe as to interfere with their clinical effectiveness and their personal mental health. It is for this reason that ongoing monitoring is necessary.