Introduction

Self-management interventions aim to provide patients with the knowledge, skills, and confidence needed to manage their condition effectively themselves. The effectiveness of such interventions is typically evaluated using a wide range of clinical severity measures, self-reported symptoms, and presumed psychological mediators such as self-efficacy [1–3]. To date, however, there is no agreement on the set of attributes that are important for managing and participating in healthcare, or on how to measure them [4]. This makes it difficult to evaluate direct effects on patient skills and to compare the results of different interventions.

To address this issue, the 17-item Effective Consumer Scale (EC-17) was recently developed based on extensive literature reviews, expert and patient interviews and pilot testing [5]. A follow-up study explored its construct validity and responsiveness in participants in the arthritis self-management program (ASMP) [6]. Results showed that the EC-17 addressed skills and behaviours not covered by other relevant scales such as the Health Education Impact Questionnaire [7], Patient Activation Measure [8], and Arthritis Self-Efficacy Scale [9], including identifying quality information and negotiation with health professionals [6]. Moreover, although the ASMP was not tailored to all behaviours measured by the EC-17, the scale was modestly sensitive to change [6]. A similar study examining the Norwegian EC-17 also showed that the scale was easy to complete, internally consistent, reproducible, valid, and moderately responsive to change [10]. The aim of this study was to cross-culturally adapt the EC-17 for use in Dutch patients with musculoskeletal conditions and to evaluate its psychometric properties.

Materials and methods

Cross-cultural translation

Cross-cultural adaptation followed established forward–backward translation procedures [11]. The prefinal EC-17 was cognitively pretested in five patient research partners (four female, age range 29–74 years) with different rheumatic conditions. Pretests were carried out using the three-step test interview method [12]. Based on the results, small wording changes were made in six items (e.g., ‘arrange’ instead of ‘organise’), one response option (‘generally’ instead of ‘usually’), and the instructions (expanded with an explanation of the term ‘management’).

Psychometric evaluation

Participants

A survey was sent in October 2010 to a random sample of 404 patients with osteoarthritis (OA) and 58 patients with fibromyalgia (FM) who had visited the outpatient rheumatology clinic in the preceding year. Two hundred and fifty-three (54.8 %) patients returned a completed survey. The first 120 patients willing to complete the scale a second time were sent a follow-up questionnaire, which was completed by 101 (84.2 %) patients after a median (IQR) time of 20 (18–24) days.

Measures

The EC-17 measures knowledge, attitudes, and behaviours about self-management skills using 17 items with 5-point Likert-type scales (“never” to “always”) [5]. Item scores are summed when ≥14 items are completed and converted to range from 0 to 100, where 100 is the best possible score.
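The scoring rule above can be sketched as follows. This is a minimal illustration rather than the published scoring algorithm: it assumes that items are coded 0 (“never”) to 4 (“always”) and that missing items are handled by rescaling the mean of the completed items, neither of which is stated here. The function name `score_ec17` is hypothetical.

```python
from typing import Optional, Sequence

N_ITEMS = 17
MIN_COMPLETED = 14   # from the text: a score is computed when >= 14 items are completed
MAX_ITEM_SCORE = 4   # assumption: items coded 0 ("never") to 4 ("always")


def score_ec17(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return a 0-100 score, or None if fewer than 14 items were completed."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    completed = [r for r in responses if r is not None]
    if len(completed) < MIN_COMPLETED:
        return None
    if any(r < 0 or r > MAX_ITEM_SCORE for r in completed):
        raise ValueError("item scores must lie between 0 and 4")
    # Assumption: prorate by rescaling the mean of the completed items to 0-100.
    return 100.0 * sum(completed) / (MAX_ITEM_SCORE * len(completed))
```

Under these assumptions, a respondent who answers 14 items with the middle category and skips three receives a score of 50, whereas a respondent with only 13 completed items receives no score.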

Additionally, patients completed the 5-item Perceived Efficacy in Patient–Physician Interactions scale (PEPPI-5; α = 0.90) [13, 14], the 12-item Dutch General Self-Efficacy Scale (GSES; α = 0.80) [15, 16], the 4-item support from family and friends subscale from the Arthritis Impact Measurement Scales 2 (AIMS2; α = 0.91) [17, 18], and the 36-Item Short Form Health Survey (SF-36v2) [19, 20], from which the physical and mental component summary (PCS and MCS) scores were calculated [21]. Pain in the last week was measured on an 11-point numerical rating scale (NRS) from 0 (‘no pain’) to 10 (‘unbearable pain’).

Data analysis

Fifteen patients had >3 missing values on the EC-17 and were excluded from further analyses (final response rate 51.5 %). The number of remaining missing values was low, with a maximum of five (2.1 %) for items 10 and 16, and these were imputed with the respective item medians.

Unidimensionality of the EC-17 was tested using robust maximum likelihood confirmatory factor analysis (CFA) [22]. Non-normed (NNFI) and comparative fit (CFI) indexes ≥0.95 and standardized root mean square residual (SRMR) and root mean square error of approximation (RMSEA) ≤0.08 and 0.06, respectively, were considered indicative of good fit [23, 24].
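The fit criteria above amount to a simple set of threshold checks, sketched below. The cut-offs are those stated in the text; the function name `check_fit` and the example values in the usage note are hypothetical and serve only as illustration.

```python
from typing import Dict

# Cut-offs as stated in the text: NNFI and CFI >= 0.95, SRMR <= 0.08, RMSEA <= 0.06.
HIGHER_IS_BETTER = {"NNFI": 0.95, "CFI": 0.95}
LOWER_IS_BETTER = {"SRMR": 0.08, "RMSEA": 0.06}


def check_fit(indices: Dict[str, float]) -> Dict[str, bool]:
    """Map each supplied fit index to True if it meets its cut-off."""
    result = {}
    for name, value in indices.items():
        if name in HIGHER_IS_BETTER:
            result[name] = value >= HIGHER_IS_BETTER[name]
        elif name in LOWER_IS_BETTER:
            result[name] = value <= LOWER_IS_BETTER[name]
        else:
            raise KeyError(f"no cut-off defined for {name}")
    return result
```

For example, `check_fit({"CFI": 0.97, "RMSEA": 0.09})` flags the CFI as acceptable and the RMSEA as not meeting its cut-off.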

Additionally, Rasch partial credit model analyses were performed [25]. Conservative infit values between 0.87 and 1.13 and outfit values between 0.61 and 1.39 were considered to indicate acceptable item fit [26]. Items with residual correlations >0.30 were considered locally dependent [27, 28]. Differential item functioning (DIF) was evaluated across sex, age, and disease duration and considered present when the difference between the item calibrations was statistically significant and >0.5 logits [25, 29]. Person reliability ≥0.70 and ≥0.85 was considered adequate for group-level and individual comparisons, respectively [28]. The person-item map and test information function were examined for mistargeting and local measurement precision [30].
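For readers less familiar with the partial credit model, the sketch below computes the category probabilities for a single item given a person location and the item's step difficulties. It illustrates the standard model formulation, not the software used in this study; the function name and the threshold values in the usage note are hypothetical.

```python
import math
from typing import List, Sequence


def pcm_category_probs(theta: float, thresholds: Sequence[float]) -> List[float]:
    """Category probabilities under the partial credit model.

    theta: person location in logits.
    thresholds: step difficulties delta_1..delta_m for one item (m + 1 categories).
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0.
    exponents = [0.0]
    for delta in thresholds:
        exponents.append(exponents[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]
```

With four illustrative thresholds such as `[-1.5, -0.5, 0.5, 1.5]` (a 5-point item), a person located at 0 logits is most likely to choose the middle category, and a person located well above the highest threshold is most likely to choose the top category.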

Reproducibility was assessed by intraclass correlation (ICC, type A,1) [31] and considered adequate for group-level and individual measurements over time when ≥0.70 and ≥0.90, respectively [32].
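The ICC(A,1) coefficient (two-way model, absolute agreement, single measurement) can be computed from the mean squares of a patients-by-occasions table. The sketch below is a plain-Python illustration of the standard formula, not the statistical software used in this study; the function name `icc_a1` is hypothetical.

```python
from typing import Sequence


def icc_a1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(A,1): two-way model, absolute agreement, single measurement.

    scores: one row per patient, one column per occasion (e.g. test, retest).
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete n x k table with n, k >= 2")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares for patients (rows), occasions (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # Absolute agreement: systematic differences between occasions lower the ICC.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because the absolute-agreement form penalizes systematic shifts between occasions, a retest that is uniformly five points higher than the first test yields an ICC below 1 even though patients keep exactly the same rank order.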

For convergent and discriminant validity, it was hypothesized that an adequate measure of perceived health-management skills should be strongly correlated with perceived effectiveness in patient–physician interaction, which is an important aspect of general health-management skills, and moderately correlated with the conceptually related constructs of general self-efficacy and social support [33–35]. Finally, a moderate correlation with psychosocial health (SF-36 MCS) and weak correlations with physical health (SF-36 PCS) and pain were expected [36].

Results

Patient characteristics

Patient characteristics are reported in Table 1. There were no significant differences with respect to age or sex between the respondents and non-respondents. FM patients differed significantly from OA patients on several socio-demographic variables and scored worse on all scales, except the GSES and SF-36 PCS.

Table 1 Patient characteristics

Distributional properties

Total scores on the EC-17 showed a near-normal distribution (Kolmogorov–Smirnov, P = 0.058) with skewness and kurtosis values of −0.72 and 0.74, respectively (Fig. 1). Floor and ceiling effects were absent, with no patients scoring zero and only three patients (1.3 %) scoring 100.

Fig. 1 Distribution of EC-17 total scores

Unidimensionality

With the exception of RMSEA, the one-factor model showed a good fit (SB χ²(119) = 488.70, NNFI = 0.96, CFI = 0.96, SRMR = 0.08, RMSEA = 0.11 [90 % CI 0.10–0.12]). Standardized factor loadings were high for all items (Table 2).
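As a plausibility check, the reported RMSEA can be approximately reproduced from the chi-square statistic and its degrees of freedom using the common point-estimate formula. The sketch below makes two assumptions: the analysis sample size of 238 is derived from the 253 returned surveys minus the 15 exclusions rather than stated explicitly, and the software used for the robust analysis may apply a slightly different formula (for example, N instead of N − 1, or a robust correction).

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square, its df, and sample size n."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Chi-square and df from the text; n = 253 returned - 15 excluded = 238 (derived).
estimate = rmsea(488.70, 119, 238)
print(round(estimate, 2))  # 0.11
```

The result agrees with the reported value of 0.11 to two decimals.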

Table 2 Standardized factor loadings and Rasch item parameters and fit statistics of the EC-17 ordered by difficulty level

Rasch measurement properties

The EC-17 adequately fit the Rasch model. Five items showed infit values slightly outside the range of 0.87–1.13, and no items showed poor outfit (Table 2). Residual correlations revealed some redundancy or multidimensionality, with 4 pairs of items showing positive local dependence (r’s between 0.33 and 0.40) and 4 pairs showing negative local dependence (r’s between −0.31 and −0.32). No items showed DIF across sex or disease duration, and only one item showed DIF across age.

Internal consistency was sufficiently high (person reliability = 0.92) for individual-level comparisons. Item difficulty estimates ranged from −0.68 to 0.67 logits and tended to cluster around the middle of the scale, so that a large proportion of patients with relatively high skills were not covered by any individual item (Fig. 2). The mean logit score for patients was 1.25, indicating that the sample as a whole was located at a higher ability level than the mean item difficulty.

Fig. 2 Distribution of person abilities and item difficulties across the scale. Higher positive logit scores indicate better self-management skills and more difficult items. Mean person ability = 1.25 (SD = 1.74); mean item difficulty = 0.00 (SD = 0.39)

The information curve (Fig. 3) was peaked at lower levels of the underlying trait, indicating that patients with skills below the mean are measured with more precision than individuals with better skills. Measurement precision was sufficient for group-level analyses across a wide range of the underlying trait, but adequate for individual-level comparisons in persons with moderate and lower levels of self-management skills only.

Fig. 3 Test information curve of the EC-17 in relation to the Rasch score. Higher positive logit scores indicate better self-management skills and attributes. Test information values of 3.33 and 6.67 (dotted lines) correspond to a reliability of 0.70 and 0.85, respectively. Logit values of −6, 0, and 6 correspond to approximate total scores on the EC-17 of 1, 59, and 98, respectively
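The information thresholds in the caption of Fig. 3 are consistent with the relation reliability = 1 − 1/information. This relation is inferred from the caption's numbers rather than stated in the text, and it presumes that the trait variance is scaled to 1. A minimal sketch with hypothetical function names:

```python
def information_to_reliability(information: float) -> float:
    """Local reliability implied by a test information value (trait variance assumed 1)."""
    if information <= 0:
        raise ValueError("information must be positive")
    return 1.0 - 1.0 / information


def reliability_to_information(reliability: float) -> float:
    """Test information needed to reach a given reliability (trait variance assumed 1)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    return 1.0 / (1.0 - reliability)
```

Under this relation, reliabilities of 0.70 and 0.85 require information values of about 3.33 and 6.67, matching the dotted lines in the figure.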

Test–retest reliability

With an ICC of 0.71 (95 % CI: 0.60–0.80), test–retest reliability of the scale was adequate for group-level comparisons.

External construct validity

As expected, the EC-17 correlated strongly with perceived efficacy in patient–physician interactions, moderately with social support and psychosocial aspects of health, and weakly with physical aspects of health and pain (Table 3). The association with general self-efficacy was just below the cut-off value for moderate correlation.

Table 3 Pearson correlations between the EC-17 and other measures in total sample

Discussion

EC-17 scores were approximately normally distributed, and the results of both CFA and Rasch analysis supported the unidimensionality of the scale, indicating that item scores can be summed to create a single total score. The latter is in accordance with previous studies that used principal component analyses [6, 10].

The high internal consistency of the EC-17 corresponds to the ability to discriminate between 3 and 4 distinct levels of skills [25]. However, measurement precision was not equally high across the underlying trait. On a group level, the EC-17 had sufficient precision across a wide range of scores. However, it was adequate for individual-level comparisons only in persons with moderate and lower levels of skills. Although it may be desirable to have a measure that specifically targets patients with lower skills, this also suggests that the EC-17 may lack discriminatory power in patients with relatively high levels of skills. Since a sample size of approximately 240 persons has been shown to provide accurate estimates of item and person locations in Rasch analyses, even for measures with poor targeting, these results are likely to be quite robust [37]. However, no other studies have used Rasch analyses for the current 17 items, and this finding should be further investigated in other populations.
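One common way to translate a person reliability into a number of distinguishable levels is the separation index, G = sqrt(R / (1 − R)). Whether this is the exact conversion behind the statement above about 3 to 4 levels is an assumption; the sketch only shows that the reported reliability of 0.92 yields a value in that range.

```python
import math


def separation_index(reliability: float) -> float:
    """Separation index G = sqrt(R / (1 - R)) for a reliability R in [0, 1)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    return math.sqrt(reliability / (1.0 - reliability))


# Person reliability as reported in the Results section.
print(round(separation_index(0.92), 2))  # 3.39
```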

Test–retest reliability was adequate, but lower than previously found [10]. It is possible that we used a stricter ICC model [38] or that the time interval was too long to ensure that no inter-individual variation occurred.

Finally, with the exception of general self-efficacy, all hypothesized correlations were confirmed, supporting the convergent and discriminant validity of the scale.

Given the relatively low response rate and the differences in both demographic and clinical characteristics between the OA and FM patients, the current findings should be interpreted with some caution and be cross-validated in other samples of musculoskeletal patients.

In conclusion, this study suggests that the Dutch EC-17 is a valid and reliable measure of effective health consumer skills in patients with musculoskeletal conditions. Future studies should further examine its sensitivity to change in a clinical trial specifically aimed at improving the skills and behaviours deemed necessary for effective consumers, before the scale can be fully endorsed as an outcome measure for evaluating self-management interventions.