Introduction

Idiopathic ulnar impaction syndrome is a degenerative condition characterized by ulnar-sided wrist pain and swelling related to load-bearing across the ulnar aspect of the wrist without a history of fracture or premature physeal arrest of the distal part of the radius [2]. Several studies have shown that ulnar shortening osteotomy generally has succeeded at decreasing pain and increasing functional ability in idiopathic ulnar impaction syndrome [2, 5, 6, 19, 22, 25, 29], but clinician-rated outcome measurements (instead of patient-reported questionnaires) have been used in the majority of these studies.

Patient-reported questionnaires currently are favored more than clinician-rated outcome measurements in clinical studies, because they better predict symptom severity and functional disability and better encapsulate patients’ views. The Royal College of Surgeons has recommended that validated patient-reported questionnaires should be used in preference to clinical assessments as primary outcome measures in clinical trials [9].

The DASH questionnaire is one of the most commonly used self-reported instruments for upper extremity disease and injury [15], and ulnar shortening osteotomy for ulnar impaction syndrome has been reported to improve DASH scores [19, 25]. The Patient-Rated Wrist Evaluation (PRWE) questionnaire [23] was developed to specifically reflect wrist pain and function, but to date, it has not been used as an outcome measure of ulnar shortening osteotomy for ulnar impaction syndrome.

The responsiveness of an outcome measure is defined as the ability of the measure to quantify change accurately [11]. Statistical methods that include measures such as effect size and standardized response mean provide approaches to assessing responsiveness. However, statistically significant findings are not always clinically important [3, 13]. For example, one can imagine a large study of a new procedure for carpal tunnel syndrome in which the DASH score decreased by two points. Because the study included 1000 patients, the small observed improvement might be detected as a statistically significant difference, but it is unlikely that patients would perceive such a small change in their symptoms. To help ascertain whether a statistically significant response is also clinically meaningful, the concept of the minimal clinically important difference (MCID) has been developed. The MCID is the smallest difference in score of an outcome instrument that patients perceive as important [8]. In general, the MCID may be established in two ways. One is an anchor-based method that compares changes in scores on the instrument with an anchor, in which the patient indicates whether they believe they are better than at baseline. The other is a distribution-based method that evaluates the minimal difference in excess of that expected by random sample variation or by measurement errors in the instrument. In this study, we measured anchor-based and distribution-based MCIDs. To avoid confusion, we use the term “MCID” for the anchor-based method and “minimal detectable change” for the distribution-based method.

The purposes of this study were (1) to compare the responsiveness of the PRWE, the DASH, and physical measures (grip strength and wrist ROM), and (2) to determine the MCID for the PRWE and DASH after ulnar shortening osteotomy for idiopathic ulnar impaction syndrome.

Materials and Methods

We defined idiopathic ulnar impaction syndrome as (1) persistent ulnar-sided wrist pain of more than 3 months’ duration, (2) no history of wrist or forearm fracture, (3) a positive ulnocarpal stress test [10], and (4) a positive MRI sign [4], including chondromalacia or marrow edema of the lunate bone, triquetral bone, or distal ulnar head, or thinning or a degenerative tear of the triangular fibrocartilage complex. Using those inclusion criteria, we identified 39 wrists in 36 patients with idiopathic ulnar impaction syndrome who were treated by ulnar shortening osteotomy between March 2008 and February 2011.

We excluded patients who (1) were younger than 18 years, (2) had radiographic evidence of an old fracture of the forearm or wrist, or (3) had a Madelung deformity. We further excluded three patients who had bilateral surgery and five patients who lacked complete preoperative and 12-month postoperative outcome measurement data, resulting in a study group of 28 patients (18 men and 10 women) with an average age of 41 years (range, 19–59 years).

The surgical procedures were performed as previously described [2, 19] with a six-hole low-contact dynamic compression plate (Synthes, Paoli, PA, USA) or seven-hole recon plate (Synthes) using transverse osteotomy. Postoperatively, patients were treated with a short arm splint for the first 4 weeks. Subsequently, a short arm brace was applied allowing active wrist motion but restriction of exertional activities for another 4 weeks.

Patients completed self-reported questionnaires within 1 month before surgery (baseline assessment) and at 12 months after surgery (follow-up assessment). We selected 12 months for the follow-up assessment because more than 3 months usually is required for bone consolidation after an ulnar shortening osteotomy. Self-reported questionnaires included the DASH [15] and PRWE [23].

DASH scores are rated on a 0 to 100 scale, in which higher scores indicate greater disability. The questionnaire consists of 30 items, each with five possible responses. Twenty-one items address degree of difficulty performing different physical activities, six address symptoms, and the rest address psychosocial effects [17].

The PRWE score, likewise, ranges from 0 (a perfectly well-functioning wrist free of pain) to 100 (a completely disabled, painful wrist) [24]. The questionnaire consists of 15 questions divided into two subscales that assess pain (PRWE-P; five items) and function (PRWE-F; 10 items). The function subscale is further subdivided into specific activities (PRWE-SF; six items) and usual activities (PRWE-UF; four items). Questions are scored on an 11-point categorical scale ranging from no pain or difficulty (0 points) to worst pain imaginable or unable to do (10 points). The PRWE pain score is the sum of the five pain items (maximum score of 50), and the PRWE function score is the sum of the 10 function items divided by two (maximum score of 50). The pain and function scores then are summed to give the total score.
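The PRWE scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only; the function name `prwe_total` and the input validation are assumptions made for the example and are not part of the published instrument or of the study's analysis.

```python
def prwe_total(pain_items, function_items):
    """Return (pain, function, total) PRWE scores from raw 0-10 item responses.

    pain_items: the 5 pain item responses.
    function_items: the 10 function item responses.
    (Hypothetical helper written for illustration.)
    """
    if len(pain_items) != 5 or len(function_items) != 10:
        raise ValueError("expected 5 pain items and 10 function items")
    for value in list(pain_items) + list(function_items):
        if not 0 <= value <= 10:
            raise ValueError("each item must be scored from 0 to 10")
    pain = sum(pain_items)              # pain subscale, maximum 50
    function = sum(function_items) / 2  # raw maximum 100, halved to a maximum of 50
    return pain, function, pain + function
```

Halving the function sum gives pain and function equal weight (50 points each) in the 100-point total.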

Physical measurements were performed by a trained physiotherapist (JMN) within 1 month before surgery and at 12 months after surgery. Physical measurements included grip strength and wrist ROMs. Grip strengths were measured using a Jamar dynamometer (Sammons Preston, Bolingbrook, IL, USA) with the elbow flexed at 90° and the forearm in neutral rotation. Values are expressed as percentages of the corresponding values of the contralateral (uninjured) wrists. Regarding grip strength calculations, we allowed for 10% greater strength of the dominant hand when the right hand was dominant, but we did not compensate when the left hand was dominant [7, 26]. Wrist ROMs (extension, flexion, supination, and pronation) were measured using a handheld goniometer and are expressed as percentages of contralateral wrists.

To determine the MCID, we used a transition item on patient-perceived improvement after surgery as the anchor against which we compared the other outcome instruments 12 months after surgery: “Do you feel that your wrist was improved by surgery?” Patients chose from the following response options: much improved, slightly improved, no different, and worsened.

Data are presented as the mean and SD at each assessment and as the mean change, with its SD, from preoperative to postoperative scores. We used paired t-tests to assess the statistical significance of any change. Statistical significance was accepted at the 5% level throughout.

The responsiveness of each outcome measure was evaluated by calculating the effect size and the standardized response mean. Effect size represents the extent of change identified by an instrument in a unitless way to facilitate direct comparisons between instruments [18] and is calculated by dividing the mean change in a score during a specified interval by the SD of the baseline score [9]. Values of 0.2, 0.5, and 0.8 are regarded as indicating small, medium, and large degrees of change, respectively [9]. We calculated standardized response mean (SRM) values by dividing the mean interval change by the SD of the interval change. As reported by Kotsis et al. [20], Cohen considered an SRM of 0.2 small, 0.5 medium, and 0.8 large.
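The two responsiveness indices defined above can be written out as a short Python sketch. The function names are illustrative, and two choices are assumptions not stated in the text: change is taken as baseline minus follow-up (so that a fall in DASH or PRWE score counts as positive improvement), and the sample SD (n − 1 denominator) is used.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    # Assumption: improvement is a decrease in score, so change = baseline - follow-up.
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the individual changes."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical scores for three patients, for illustration only.
baseline_scores = [40, 50, 60]
follow_up_scores = [20, 20, 40]
es = effect_size(baseline_scores, follow_up_scores)
srm = standardized_response_mean(baseline_scores, follow_up_scores)
```

The two indices share a numerator and differ only in the denominator: the effect size scales change by between-patient variability at baseline, whereas the SRM scales it by the variability of the change itself.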

MCIDs for the DASH and PRWE scores were determined by an anchor-based method using receiver operating characteristic (ROC) curves. For the purposes of defining the MCID using our anchor question, we considered a patient improved if he or she responded either “much improved” or “slightly improved,” and we considered a patient unimproved if he or she responded “no different” or “worsened.” ROC curves plot sensitivity (y-axis) against 1-specificity (x-axis) for all possible cutoff points of the instrument. Sensitivity is the proportion of improved patients whose score change is at or above the cutoff point. Specificity is the proportion of unimproved patients whose score change is below the cutoff point. The most efficient cutoff value, with regard to specificity and sensitivity, indicates the MCID. The greater the area under the ROC curve, the greater the ability of the scale to differentiate between those with and without a clinically important change. If the area under the curve is 0.5, the test is not predictive, whereas an area close to 1.0 indicates better differentiation [28].
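The ROC-based procedure can be illustrated with the following Python sketch. The text does not state how the “most efficient” cutoff was chosen; maximizing sensitivity + specificity − 1 (the Youden index) is used here as an assumption, and the function names and sample data are hypothetical.

```python
def roc_points(changes, improved):
    """Return (cutoff, sensitivity, specificity) for each observed score change."""
    n_improved = sum(1 for flag in improved if flag)
    n_unimproved = len(improved) - n_improved
    if n_improved == 0 or n_unimproved == 0:
        raise ValueError("need at least one improved and one unimproved patient")
    points = []
    for cutoff in sorted(set(changes)):
        # Improved patients whose change reaches the cutoff (true positives).
        true_pos = sum(1 for c, flag in zip(changes, improved) if flag and c >= cutoff)
        # Unimproved patients whose change falls below the cutoff (true negatives).
        true_neg = sum(1 for c, flag in zip(changes, improved) if not flag and c < cutoff)
        points.append((cutoff, true_pos / n_improved, true_neg / n_unimproved))
    return points


def best_cutoff(changes, improved):
    """Cutoff maximizing the Youden index (an assumed optimality criterion)."""
    return max(roc_points(changes, improved), key=lambda p: p[1] + p[2] - 1)


def area_under_curve(changes, improved):
    """AUC as the share of improved/unimproved pairs ranked correctly (ties count 0.5)."""
    pos = [c for c, flag in zip(changes, improved) if flag]
    neg = [c for c, flag in zip(changes, improved) if not flag]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical score changes and anchor responses, for illustration only.
score_changes = [25, 20, 18, 5, 12, 3]
is_improved = [True, True, True, False, True, False]
cutoff, sensitivity, specificity = best_cutoff(score_changes, is_improved)
```

Other criteria, such as the point closest to the upper-left corner of the ROC plot, can select a different cutoff from the same data, so the criterion should be reported alongside the MCID.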

Minimal detectable change (distribution-based method) is defined as the smallest amount of change between two times that indicates a real change in health status [28], that is, a change above that expected by measurement error. For a conventional confidence level of 90%, the minimal detectable change is calculated as 1.65 × √2 × SEM. The SEM is the error estimate for a single use of the questionnaire and is directly related to the reliability of the scale. It is calculated using the formula SEM = SD × √(1 − α), where SD is the SD of the pretreatment score and α is the reliability coefficient of the questionnaire. The reliability coefficient can be taken as either the intraclass correlation coefficient from test-retest studies or Cronbach’s alpha. In this study, we used Cronbach’s alpha as the reliability coefficient of the questionnaire.
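The distribution-based calculation above translates directly into code. The sketch below follows the two formulas in the text; the function names and the example inputs (a baseline SD of 20 points and a reliability coefficient of 0.96) are hypothetical values chosen for illustration and are not study data.

```python
from math import sqrt


def standard_error_of_measurement(sd_baseline, reliability):
    """SEM = SD x sqrt(1 - alpha), where alpha is the reliability coefficient."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability coefficient must lie between 0 and 1")
    return sd_baseline * sqrt(1 - reliability)


def minimal_detectable_change(sd_baseline, reliability, z=1.65):
    """MDC = z x sqrt(2) x SEM; z = 1.65 corresponds to the 90% confidence level."""
    return z * sqrt(2) * standard_error_of_measurement(sd_baseline, reliability)


# Hypothetical inputs, for illustration only.
sem = standard_error_of_measurement(20.0, 0.96)
mdc90 = minimal_detectable_change(20.0, 0.96)
```

The √2 factor reflects that a change score combines the measurement error of two administrations of the questionnaire (baseline and follow-up).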

We used ANOVA and linear regression to examine the associations between changes in the DASH and PRWE scores and patient-perceived overall improvement.

Results

Patients’ mean outcome scores and all physical measurements except for flexion and supination improved after surgery (Table 1). The PRWE registered a larger effect size 1 year after surgery than did the DASH (p < 0.05). The effect sizes and standardized response means of the outcome measures were as follows (Table 2): PRWE (1.51, 1.64), DASH (1.12, 1.24), grip strength (0.59, 0.68), wrist pronation (0.33, 0.41), and wrist extension (0.28, 0.36).

Table 1 Preoperative and postoperative 12-month outcomes of each instrument*
Table 2 Effect sizes and standardized response means of each outcome measure

The minimal detectable changes for the PRWE and DASH were 7.7 and 9.3 points, respectively. The MCID was found to be 17 points for the PRWE (area under the curve [AUC] = 0.86; 95% CI, 0.71–0.99) and 13.5 points for the DASH (AUC = 0.81; 95% CI, 0.61–0.98). The PRWE (p < 0.01) and DASH scores (p = 0.01) were found to be capable of differentiating improved and unimproved patients (Fig. 1).

Fig. 1 The ROC curves for the PRWE and DASH scores are shown. The area under the curve was 0.86 for the PRWE and 0.81 for the DASH.

In terms of the transition item used to anchor our MCID analysis, 19 of 31 patients reported a much improved wrist condition, five a slight improvement, five no difference, and two a poorer wrist condition. The mean changes in the PRWE and DASH scores differed significantly according to the patient’s response to the anchor question, decreasing in the order much improved, slightly improved, no different, and worsened (p < 0.01). Linear regression showed a significant association between patient-perceived improvement and changes in the PRWE (r² = 0.46, p < 0.01) and DASH (r² = 0.36, p < 0.01) scores (Table 3).

Table 3 Change in DASH and PRWE scores stratified by response to transition item*

Discussion

Statistically significant improvements in outcome scores are not necessarily clinically important, or even meaningful [14]. For example, given a sufficiently large sample size, even small differences can achieve statistical significance. As surgeons, we need to determine whether a treatment effect is clinically important when making treatment decisions [14]; from this premise, the concept of the MCID developed [21]. Two methods (anchor-based and distribution-based) are used to estimate MCIDs. The advantage of the anchor-based method is that it takes into account the concept of clinical change as informed by patients, and the main disadvantage of the distribution-based method is that it is based purely on statistical calculation [3]. Currently, there is no consensus regarding which is the better method [3]; therefore, we reported both the anchor-based method (MCID) and the distribution-based method (minimal detectable change) in this study. To our knowledge, MCIDs have not been reported for the PRWE or DASH instruments in the setting of ulnar impaction syndrome. We found that the MCID was 17 points for the PRWE and 13.5 points for the DASH, that these values differentiated patients who were improved from those who were unimproved, and that the PRWE made this distinction more sensitively. The minimal detectable changes for the PRWE and DASH were 7.7 and 9.3 points, respectively. To ensure that MCIDs are free of measurement errors, the MCIDs should be greater than the minimal detectable changes. In the current study, the MCIDs of the PRWE and the DASH were greater than their minimal detectable changes, suggesting that the observed MCIDs represented true changes in clinical status.

Our study has several limitations. First, we used an anchor-based method to determine the MCID, which is open to criticism [14]. The rating scale based on four possible answers for our anchor question is arbitrary. The anchor question may be affected by recall bias [30]. The method relies on the validity and reliability of the chosen anchor question [3, 27]. It does not take into account the variation within the sample or the measurement precision of the instrument [16]. Nevertheless, strong correlation and association were found between patient-perceived improvements and PRWE and DASH score changes, which supports the validity of the chosen anchor question in our study. In addition, we measured minimal detectable changes (distribution-based MCID) to overcome the limitations of anchor-based MCIDs. Second, we used only region-specific outcome measures and did not use a generic health-related quality-of-life instrument to assess mental or general health. Further studies that include a generic health-related quality-of-life instrument would provide more information for comparison of outcome measures after treatment for ulnar impaction syndrome. Finally, although idiopathic ulnar impaction syndrome is often bilateral, we expressed grip strength and wrist ROM as percentages of the contralateral wrist despite having no detailed information about the contralateral wrist.

This study showed that the PRWE and DASH have a large effect size (more than 0.8) and that the PRWE is more sensitive than the DASH for detecting clinical change following ulnar shortening osteotomy for idiopathic ulnar impaction syndrome. Although grip strength and wrist extension and pronation improved significantly after surgery, the extent of these changes was only small to moderate, and wrist flexion and supination did not improve significantly. These findings are consistent with those of previous studies [19, 25]. Kitzinger et al. [19] reported that the DASH and grip strength improved but that wrist ROM was unchanged after ulnar shortening osteotomy for ulnar impaction syndrome, whereas Moermans et al. [25] found that the DASH and wrist ROM were improved but grip strength was unchanged. These inconsistencies between reports are presumably the result of the minor improvements observed in terms of grip strength and wrist ROMs after surgery. Our study also showed that mean preoperative wrist ROM was approximately 90% of the contralateral side and mean grip strength was approximately 81%, which left little scope for improvement.

The DASH is a patient-reported questionnaire that was designed to quantify the outcomes of upper extremity-specific injury or disease. In two studies, DASH scores were used as outcome measures after ulnar shortening osteotomy for ulnar impaction syndrome [19, 25]. The PRWE was designed to quantify the outcomes of wrist-specific injuries or disease and has been widely used as an outcome measure for distal radius fractures [1, 12]. The majority of items in the DASH (21 of 30 items, 70%) concern functional disabilities, and only six of the 30 (20%) address symptoms of the hand, arm, or shoulder. In contrast, the PRWE scoring system allocates 50 of 100 points (50%) to wrist pain and the other 50 points to functional disabilities. Thus, our finding that the PRWE is a somewhat more sensitive tool than the DASH in detecting clinical changes probably reflects that the main problem of ulnar impaction syndrome is wrist pain rather than functional disability. However, to the best of our knowledge, the PRWE has rarely been used to assess the outcome of diseases related to the distal radioulnar joint such as ulnar impaction syndrome. Therefore, we recommend using the PRWE questionnaire in addition to the DASH questionnaire when assessing the clinical status of distal radioulnar joint problems, including ulnar impaction syndrome.