Introduction

This paper focuses on the process by which persons recover those parts of the self and life that have been compromised by participation in mental health treatment systems and the stigmatizing culture in which treatment is embedded.1 In the development of a psychiatric disability, individuals experience a series of losses, including loss of hope, personal power, uniqueness, motivation, identity, self-respect, self-acceptance, relationships, and the opportunity to learn skills for making effective choices.2 While some losses are best addressed by individuals or with individual interventions—such as awakening hope, exploring identity, enhancing connections with others, reducing harm, and learning the skills of making effective choices—others are better addressed by seeking to increase the recovery-orientation of treatment program culture. The culture, and the interpersonal interactions that comprise it, become a primary intervention facilitating individuals’ recovery from serious mental illnesses.

During the era when state mental hospitals and therapeutic communities predominated, research often focused on ward culture,3 and this anthropological approach was extended to the early PACT program in Madison.4 In a National Association of State Mental Health Program Directors study investigating what helps and what hinders recovery, Onkin and colleagues extend this cultural approach by focusing on the role of the “organizational culture” of formal mental health services.1

This article presents and tests the Recovery Centered Measures (RCM scales), which have been developed to measure the recovery-orientation of cultural beliefs, values, norms, and practices in treatment programs as they are expressed in interactions between staff and consumers and among staff themselves.

The scales measure how much programs vary in the extent to which staff manifest respect for persons-served, recognize their individual uniqueness, increase their motivation, accept them non-judgmentally, and share power with them. These five dimensions come directly from Onkin and colleagues, but are echoed in many places in the literature.1, 2, 5–7

The RCM scales are designed to be used to measure change in program culture and to compare the recovery-orientation of different programs. They are also used to focus culture improvement initiatives. The three scales have different viewpoints: staff views of their interactions with persons-served (RCM-Staff View), consumer views of their interactions with staff (RCM-Consumer View), and staff views of their interactions with other staff (RCM-Staff View of Staff). The two studies presented here describe the psychometric properties of all three RCM scales.

To our knowledge, no other instruments specifically focus on staff views of their own culture, although there is a large literature on treatment milieus from which one can infer the importance of staff interactions with each other. For example, the Ward Atmosphere Scale8 includes items, responded to by both staff and clients, like this: “Staff sometimes argue openly with each other.” Our attention to the importance of the intra-staff dimension was triggered by receiving a letter from a patient in 2002 that included this statement: “In 35 years in the Mental Health desert, probably the greatest desert of all Mankind, this is the 1st place I have been where the workers care about the Happiness of their fellow workers and the Happiness of the Patients.” Some empirical confirmation is found in a study of inpatient units that found a strong correlation between patient satisfaction on one hand and staff satisfaction and staff working conditions on the other.9 Likewise, psychiatric inpatient seclusion rates have been found to depend partially on such factors as trust between staff.10

Methods

Scale construction

Based on literature analysis by the RCM scale developers, a pool of 235 items was generated for the three versions of the scale. The items were organized by the five components of recovery described above, with an average of 11.5 items for each concept in each scale. To the extent possible, the three scales were designed to be parallel. The scale in which staff assess culture among themselves is necessarily somewhat different in concept and wording.

A review of the initial items by two separate groups of persons-served and by two groups of line staff led to rewording as well as elimination of some questions. The number of items in each subscale was further reduced in the course of two pretests (N = 61 and N = 452) by dropping items with lower correlation to other items and items that, in exploratory factor analysis of subscales, did not load on the primary factor. The number of response categories was also tested and modified in these early trials. The final response categories are as follows: Strongly Agree, Moderately Agree, Agree, Disagree, Moderately Disagree, and Strongly Disagree. Some items perform better in one rather than another RCM scale, so the final item selection reflects a balancing approach. A third pretest of the scales with no items reversed (N = 258) showed unacceptably high levels of positive response set, so the current version has eight reversed items out of 25. See Table 1 for the actual items and subscales of the RCM-Consumer View scale and for item properties on all three scales.

Table 1 Recovery Centered Measure item means and item correlations with the test (from study 2)

The first study reported here included one large facility-based and five ACT programs; in study 2 there were two facility-based and two ACT programs. Study 2 differed from study 1 in two ways: (a) one item that had a low correlation to other items in study 1 was replaced and a number of small wording changes were introduced, and (b) one of the instruments used for convergent validity in facility-based programs in study 1 was replaced.

During two pretests, developers had to resolve a number of questions about scale administration for persons-served including how to distribute and get back the test forms (more of an issue in ACT programs) and how to deal with low literacy, vision problems, or cognitive limitation. During the pretests, it was also decided to include indirect staff and administrators as well as treatment staff in the RCM term “staff.” An administration manual is available from the authors.

Instruments for convergent validity

Two recent reviews11, 12 identify six instruments designed to capture the perspectives of persons-served regarding the recovery-orientation of the care being provided. For determining convergent validity, the two scales among the six that are most focused on program culture and have some tested psychometric properties were chosen. The Recovery Enhancing Environment (REE) instrument was developed by Priscilla Ridgway with much consumer input. It has been field-tested in two studies, but only Cronbach’s alpha has been reported.13, 14 There are several scales included in the instrument; the 14-item Organizational Climate scale is used here. Although originally intended to be answered by consumers, the items can be answered by staff with no translation or modification.

The Recovery Self-Assessment scale is intended to measure the extent to which programs implement recovery-oriented practices. The Recovery Self-Assessment scale includes versions for staff, families, administrators, and consumers. Although a field test of the instrument was published by the authors, the only psychometric data from the study is an exploratory factor analysis.15 A replication study of only the staff version showed high internal consistency and, in a sample of 34 staff members, a test–retest intraclass correlation of 0.83.16 The Recovery Self-Assessment has been used to test staff change after a year-long training effort on recovery.17 In a study of 67 Canadian ACT programs, the Recovery Self-Assessment was used to assess the relationship between recovery-orientation and outcomes and recovery-orientation and ACT fidelity.18, 19

A third instrument is the Ward Atmosphere Scale, a widely used scale developed by Rudolf Moos in the 1970s to measure organizational and cultural features of therapeutic communities. It is supported by extensive psychometric data.8

Psychometric data on all three of the instruments used to measure convergent validity was also collected in our own studies. Results are shown in Table 2. Internal reliability was adequate for all three instruments. Test–retest intraclass correlation was adequate for the REE; it was quite low for the Recovery Self-Assessment scale and was not tested for the Ward Atmosphere Scale.

Table 2 Psychometrics of instruments used to test convergent validity (as determined in this study)

Study participants

The primary goals in selecting programs to participate were to include both facility-based and ACT programs and to balance numbers of staff with numbers of persons-served. All the programs in both studies are operated by a large California provider. The first study included 201 staff and 201 persons-served; the second study included 220 staff and 227 persons-served. In large programs, persons-served were sampled randomly to obtain a study group roughly equal in size to the number of staff. The study refusal rate for persons-served was 9.1 % in study 2 (but not recorded consistently for study 1). Persons-served who were judged by staff to be too impaired by symptoms to complete the study forms were excluded as were persons with limited English language capability. (At this time the RCM forms are available only in English.) A method was used to permit linking of test and retest forms to the same persons while still preserving anonymity.

The test–retest reliability sample was a subsample of the participants in the second study. Of the 176 consumers who were requested to take part in the retest, 59 (34 %) chose not to participate. Table 3 and Table 4 summarize the characteristics of staff and consumer study participants.

Table 3 Characteristics of staff
Table 4 Characteristics of persons-served

Results

Scale administration

In final form, the consumer RCM scale Flesch–Kincaid reading level is 5.4. The average time required to complete the 25 items of the consumer RCM and the 50 items of the staff RCM scales in the two studies ranged from 6 to 13 min. Times were greater in the second study, due in part to an older and more impaired population in the study 2 residential facilities. Table 5 presents detailed data on time of administration and on the scale psychometrics discussed later.

Table 5 Performance of the RCM in study 1 and study 2

Correlational and subscale structure

Each RCM scale contains 25 items which are hypothesized to represent an underlying latent dimension of recovery-oriented culture. Factor analysis on the items in each scale was conducted using the principal factors method. The number of retained unrotated factors was determined using a scree plot. For the two staff-completed scales, in both study 1 and study 2, there is one predominant factor and one subsidiary factor (eigenvalue 2.0 or less) that loads predominantly or entirely on the reversed items. On the persons-served scale, in study 1 there is one predominant factor and a second factor (eigenvalue 3.23) which loads entirely on the reversed items; in study 2 there is one predominant factor and one secondary factor (eigenvalue 3.99) which again loads only on the reversed items.

By design, the RCM contains five subscales of five items each: individual uniqueness, non-judgmentalism, shared power, motivation, and respect. The overall scale was developed to measure these five conceptual aspects of recovery culture. An exploratory factor analysis of each subscale was conducted using the principal factors method and found no more than one factor (eigenvalue of 1.0 or more) in each subscale. Items also correlated higher with their own subscale than with any other subscale (across the 75 items in the three scales, each correlated with five subscales, there were five exceptions to this finding in study 1 and three exceptions in study 2). However, the correlations between subscales of each RCM scale are relatively high: RCM-Consumer View correlations range from 0.69 to 0.79; for the RCM-Staff View, the range is 0.63 to 0.83; and for the RCM-Staff View of Staff, the range is 0.54 to 0.73. This amount of correlation indicates that the single underlying latent variable is a strong component of each subscale.

Reliability

Internal consistency is measured with Cronbach’s alpha. The staff view of staff RCM scale has an alpha of 0.93, with subscales ranging from 0.71 to 0.84. The staff view of interactions with consumers RCM scale has an alpha of 0.95, with subscales ranging from 0.74 to 0.88. The consumer RCM rating scale has an alpha of 0.93, with subscales ranging from 0.55 to 0.81 (see Table 5). Thus, internal consistency is high for the scales and acceptable to good for the subscales.
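
For readers less familiar with the statistic, the standard formula for Cronbach’s alpha can be sketched as follows. The function name and the small data set are hypothetical and are not RCM data; the sketch only shows how alpha is derived from item variances and the variance of the total score.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent, all of equal length.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up responses from five respondents to a three-item subscale.
data = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 6, 6],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(data)
```

Alpha rises both with the average inter-item correlation and with the number of items, which is one reason a 25-item scale can show a higher alpha than its five-item subscales.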

In study 2, the three RCM scales were administered twice. For the consumer RCM scale there were 96 participants with time between test and retest averaging 13.4 days (range 7–21). The test–retest intraclass correlation was 0.67. For the two staff RCM scales, 128 participants averaged 14.1 days between administrations (range 7–20). The intraclass correlation of test with retest for the staff view of consumer scale was 0.72; for the staff-view-of-staff scale, it was 0.81 (see Table 5). These test–retest correlations are also acceptable to good.
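
The form of intraclass correlation used is not specified above. As an illustration only, the sketch below assumes the one-way random-effects, single-measure form, often written ICC(1,1), applied to paired test and retest scores; the function name and the scores are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for (test, retest) score pairs.

    This model choice is an assumption; other ICC forms give different values.
    """
    n = len(pairs)
    k = 2  # two administrations per participant
    if n < 2:
        raise ValueError("need at least two participants")
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means = [(a + b) / k for a, b in pairs]
    # Between-participant mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-participant mean square.
    msw = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Made-up mean scale scores for four participants at test and retest.
scores = [(4.0, 4.2), (5.1, 4.9), (3.0, 3.4), (5.5, 5.6)]
icc = icc_oneway(scores)
```

Unlike a Pearson correlation, this coefficient is lowered by a systematic shift between administrations as well as by a change in rank order.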

Validity

Several approaches suggest the RCM scales are valid measures of program culture in relationship to furthering or hindering recovery of persons-served. The RCM items and subscales have face, content, and construct validity due to their genesis in consumer writings and research about recovery and to the use of consumers and line staff in refining and selecting items. RCM convergent validity was judged by comparing RCM scores to those of the three related instruments described above. Correlations between the Ward Atmosphere Scale and the RCM scales range between 0.30 and 0.57. The REE Organizational Climate scale correlates with the staff-view and consumer-view RCM scales between 0.56 and 0.71. The third convergent validity instrument is the broader 32-item Recovery Self-Assessment scale, which has a consumer and a staff version. Correlations between the RCM scales and the Recovery Self-Assessment scale range between 0.39 and 0.72 (see Table 5). These correlations indicate substantial sharing of content but not redundancy.

Another measure of validity is the concordance between ratings made by staff and by persons-served. Since the presumption of the study is that there is an underlying latent variable of “program culture which does or does not support recovery”, it is hypothesized that staff and persons-served will rate the same items high or low and to the same degree. Correlation between staff and consumer means when rating the same items is high (Pearson’s r = 0.78). However, agreement is reduced by the fact that staff consistently rate items somewhat higher than persons-served (bias). The product of the correlation coefficient and a measure of the bias is summarized in the concordance correlation coefficient, which for the RCM is 0.65.20
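
The way a consistent rating bias lowers concordance even when correlation is high can be shown with a short sketch. It assumes the commonly used concordance correlation coefficient attributed to Lin; the function names and the item means below are hypothetical and are not RCM results.

```python
from math import sqrt


def _moments(x, y):
    """Means, population variances, and population covariance of two lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation: sensitive to pattern, blind to a constant offset."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / sqrt(sxx * syy)


def concordance_correlation(x, y):
    """Concordance correlation: Pearson's r discounted for location and scale bias."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Made-up item means: staff rate every item exactly one point higher.
consumer_means = [1.0, 2.0, 3.0]
staff_means = [2.0, 3.0, 4.0]
r = pearson_r(staff_means, consumer_means)
ccc = concordance_correlation(staff_means, consumer_means)
```

In this constructed case the two groups order the items identically, so Pearson’s r is 1, while the constant one-point offset pulls the concordance coefficient down to 4/7.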

Finally, validity is supported by the finding that the RCM scales discriminate between program types. Scores for both consumer and staff RCM scales were expected to be lower for facility-based programs than for ACT programs, and, as shown in Table 5, scores are lower in facilities in both studies.

Discussion

Measuring recovery-orientation

There is considerable interest in the behavioral health community in measuring the recovery orientation of programs.11, 12 It is useful to compare the RCM scales with the two instruments these authors reviewed which most focus on program culture. These are the same instruments used for testing convergent validity: the REE Organizational Climate scale and the Recovery Self-Assessment scale (RSA).

Correlations of the RCM scales with these related scales are substantial, 0.30 to 0.72, but not high enough to suggest the RCM scales measure the same latent dimension. By way of comparison, Andresen and colleagues recently looked at four individual recovery self-rating scales and found correlations ranging between 0.40 and 0.89.21

The REE Organizational Climate scale and the RCM scales are relatively quick to administer; the RSA is somewhat longer. (A very short consumer version has recently been developed, but psychometrics have not been published.) All three were developed with considerable consumer input. The RCM has a reading level of Flesch–Kincaid 5.4; the REE reading age is 12–13 and that of the RSA 15–16.12

The Recovery Self-Assessment and the RCM scales have both consumer and staff versions. The RSA additionally has an administrator and family version while the RCM has a version measuring staff views of their own culture. The REE Organizational Climate scale was developed for consumer rating, but data presented here shows that it has good psychometric properties when rated by staff as well.

The content of the Recovery Self-Assessment scale is considerably broader than program culture. For example, items include “The physical space of this program (e.g., the lobby, waiting rooms, etc.) feels inviting and dignified” and “I am encouraged to attend agency advisory boards and/or management meetings if I want.” The RCM focuses specifically on concepts important to recovery as they manifest themselves in staff–consumer and staff–staff interactions. Most REE Organizational Climate items also focus on interactional culture, but there are items on physical appearance and program resources and consumer satisfaction as well.

In summary, if program or organizational culture is the interest, the RCM scales are the most focused, with the REE Organizational Climate and the Recovery Self-Assessment scales following in that order. The RCM scales and subscales have more conceptual clarity than the other two scales; the RCM subscales also have good validity and internal reliability but cannot be interpreted as underlying causal dimensions as would be the case if they were defined by confirmatory factor analysis (the RSA factors have also not been confirmed). Psychometrically, all the scales are acceptable but psychometric testing of the RCM scales is most complete to date. The only instrument with an administrator and family version is the Recovery Self-Assessment; the only instrument that measures staff interactions with other staff is the RCM.

Limitations

The two RCM studies are limited in several ways. First, instruments used for convergent validity in the studies are not entirely apposite; it is particularly difficult to find appropriate comparisons for the instrument measuring staff view of interactions with other staff. Second, all the programs included in the studies are operated by one agency, so it will be important to test the RCM scales in other contexts. Third, the item order on the RCM scales was assigned randomly, but this may have increased the reversed-item bias as several reversed items were clustered together. Fourth, all the programs in these studies serve persons with very serious functional difficulties (ACT or residential rehabilitation facilities), so it is not yet known if the RCM scales will prove useful in peer-support programs oriented to wellness and recovery. Fifth, the RCM scales have not been tested in other languages or cross-culturally. And finally, it remains to be proven that higher RCM scale scores for programs are associated with better individual recovery outcomes.

Implications for Behavioral Health

Program culture has a profound and pervasive effect on the recovery of individuals served by a program. Changing the culture of a program to better support recovery can be a principal intervention practitioners use to assist the persons they serve. The RCM measures can be employed in several ways to further this effort.

An initial step is to administer all three scales to a program and provide feedback of scores to staff using graphs. The feedback is designed to help motivate improvement by identifying domains, defined by subscale scores, where the program is doing less well and by pointing out specific discrepancies between staff and consumer ratings. For example, in the studies reported here staff learned that they rate their respect for the spiritual beliefs and practices of persons they serve far higher than do those persons themselves.

In addition to their use in individual program cultural transformation efforts, the RCM scales can, like the more broadly focused RSA, be used to profile multiple programs15 or to serve as an intervening or mediator variable in research on attainment of individual recovery goals.22

The imperative for programs to assist persons-served in their journeys to recovery is no longer a matter of debate. The conversation has shifted to how programs can best accomplish this goal. The RCM scales show strong promise as tools for helping programs identify what aspects of organizational culture need to change in order to promote recovery. Uniquely among related instruments they have strong psychometrics, focus solely on interactional culture, and include a scale for measuring the recovery orientation of staff-with-staff interactions.