Introduction

Substantial efforts have been directed at developing outcome instruments that adequately characterize the outcomes of orthopaedic interventions, including physician-directed and patient self-evaluation [13, 16]. The importance of simple, reliable, and validated self-assessed outcome measures to evaluate patient function and to show satisfactory results after surgery has continued to increase as greater emphasis is placed on quality, outcomes, and patient satisfaction [1, 15]. One challenge of using such outcome instruments is determining the clinical relevance of a change in score during the course of treatment. Statistical significance means that the differences found are unlikely to have been the result of chance, but depending on sample size and other parameters, differences that are trivial in size (clinically meaningless or even imperceptible to patients) might be statistically significant [17, 18]. A clinically relevant change can be expressed as the minimal clinically important difference (MCID), which is defined as the “smallest difference in score in the domain of interest which patients perceive as beneficial” [4]. The MCID can be calculated several ways, the most common of which are an anchor-based approach and a receiver operating characteristic (ROC) curve analysis. The anchor-based method compares changes in scores with a patient’s retrospective rating of change in status from baseline (“anchor question”); the ROC method chooses the threshold that offers the best compromise between sensitivity and specificity for the outcome measure [17]. The anchor-based approach is the easiest and most widely used technique: it specifies a range of anchor-instrument results and calculates the change in outcome score that corresponds to those anchors [17].
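As an illustration of the ROC approach described above (which was not the method used in this study), the following minimal sketch picks the change-score threshold that maximizes Youden's J (sensitivity + specificity − 1). Using Youden's J as the definition of the "best compromise" is an assumption made for this example, and the function name `youden_threshold` and the data are hypothetical.

```python
from typing import List, Tuple


def youden_threshold(changes: List[float], improved: List[bool]) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    A patient is classified as improved when change >= threshold.
    Ties keep the lowest threshold, because candidates are scanned in
    ascending order and replaced only on a strictly larger J.
    """
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both improved and not-improved patients")
    best = None
    for t in sorted(set(changes)):
        true_pos = sum(1 for c, y in zip(changes, improved) if y and c >= t)
        true_neg = sum(1 for c, y in zip(changes, improved) if not y and c < t)
        sens = true_pos / n_pos
        spec = true_neg / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical change scores and anchor responses (not study data).
changes = [2, 5, 8, 10, 12, 15, 20, 25]
improved = [False, False, False, True, False, True, True, True]
threshold, sens, spec = youden_threshold(changes, improved)
print(threshold, sens, spec)  # 10 1.0 0.75
```

With these made-up values, a 10-point change separates the groups best: every improved patient is captured, and three of the four patients who did not improve fall below the threshold.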

The American Shoulder and Elbow Surgeons (ASES) questionnaire was developed in 1994 to provide a standardized method for evaluating shoulder function [12]. It has been found to be reliable and valid for various populations, including patients with nonoperatively and operatively treated shoulder disease [6, 7]. The initial study, which showed an MCID for the ASES score of 6.4 points on a 100-point scale (higher scores represent better results), derived its results from an extremely heterogeneous population of 63 patients, including patients who were treated nonoperatively and operatively with at least eight different shoulder-related diagnoses [7]. A more recent study established an MCID of 12–17 points for the ASES in a more homogeneous population of patients who underwent nonoperative treatment of rotator cuff disease [17].

The ASES score, however, is used frequently with surgical patients, including those undergoing total shoulder arthroplasty (TSA). This population is distinct from those in which an MCID has been established [11]. The clinical diagnosis and treatment rendered have a substantial effect on the responsiveness of an outcome measure, and thus it is critical that measures of responsiveness such as the minimal clinically important change be determined in a population of patients undergoing similar treatment [2, 8, 9, 14, 17]. To our knowledge, no studies have established the clinically relevant change in the ASES after shoulder arthroplasty, including anatomic TSA or reverse TSA (RSA).

Therefore, we asked: (1) What are the MCID and substantial clinical benefit (SCB) for the ASES score after primary TSA and RSA? (2) Are the MCID and SCB for the ASES score different between TSA and RSA? (3) What patient-related factors are associated with achieving the MCID and SCB after TSA and RSA?

Methods

Study Design and Setting

Institutional review board approval was obtained before proceeding with the study. A longitudinally maintained institutional shoulder arthroplasty registry was retrospectively queried for patients who underwent primary shoulder arthroplasty (conventional TSA or RSA) from 2007 to 2013.

Participants/Study Subjects

Patients undergoing TSA or RSA for a diagnosis of glenohumeral arthritis or rotator cuff tear arthropathy, respectively, were included (n = 827). All patients had baseline ASES scores obtained at the time of entry in the registry. Patients undergoing revision procedures (n = 33), those without 2-year satisfaction scores (n = 13), those without 2-year SF-12 scores (n = 3), and those with missing 2-year followup ASES scores (n = 288) were excluded from analysis. This resulted in a final cohort of 490 patients of 794 eligible patients (62%). A comparison of baseline variables between the included patients (n = 490) and patients excluded for lack of 2-year followup data (n = 304) was done (Table 1).
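The cohort arithmetic above can be reproduced with a short sketch. It assumes that the exclusion categories do not overlap (each excluded patient is counted once), which the reported totals imply but the text does not state explicitly; the variable names are illustrative.

```python
# Counts taken from the text; variable names are illustrative.
registry_patients = 827
revisions = 33
missing_satisfaction = 13
missing_sf12 = 3
missing_ases = 288

# Eligible patients are primary arthroplasties only.
eligible = registry_patients - revisions

# Assumes the three missing-data categories are mutually exclusive.
excluded_followup = missing_satisfaction + missing_sf12 + missing_ases

final_cohort = eligible - excluded_followup
retention_pct = round(100 * final_cohort / eligible)

print(eligible, excluded_followup, final_cohort, retention_pct)  # 794 304 490 62
```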

Table 1 Comparison of study cohort and excluded patients

Study Population Demographics

The mean age of the overall study population, including patients who had TSA and RSA, was 68 ± 9 years. Two hundred thirty-three (48%) patients were female. The mean BMI was 28 ± 6 kg/m². The majority of patients were classified as being in American Society of Anesthesiologists (ASA) Class II (363 patients, 74%). Sixteen patients (3%) were classified as being in Class I, and the remaining patients (111, 23%) were in Class III or higher. Four hundred twenty-three patients (86%) had a diagnosis of osteoarthritis, and the remaining 67 (14%) had a diagnosis of rotator cuff tear arthropathy (Table 1). The mean age of the 304 patients excluded because of incomplete records was 67 ± 10 years. One hundred twenty-five (41%) of these patients were female. Their mean BMI was 29 ± 6 kg/m². The majority were classified as being in ASA Class II (208 patients, 68%). Seven patients (2%) were classified as being in Class I, 73 (24%) were in Class III, and the remaining 16 (6%) did not have an ASA class recorded (Table 1).

Outcomes

The outcomes of interest were the MCID and SCB of the ASES score for patients undergoing primary TSA and RSA. In addition to an ASES score, patients enrolled in the arthroplasty registry complete an eight-question satisfaction survey and the SF-12 health survey preoperatively and at the 2-year followup. These patient-reported outcome measures (PROMs) were used as anchors to determine the amount of relevant clinical change. The satisfaction survey includes the question: “How satisfied are you with the results of your shoulder surgery in the following areas?” This overarching question targets satisfaction “for improving your ability to do housework or yard work” (Work), “for improving your ability to do recreational activities” (Activities), and “overall, how satisfied are you with the results of your shoulder surgery?” (Overall). There were five answer options: “very satisfied”, “somewhat satisfied”, “no change”, “somewhat dissatisfied”, and “very dissatisfied”. From the SF-12 health survey, only the moderate activities question was used: “Does your health now limit you in moving a table, pushing a vacuum cleaner, bowling, or playing golf?” There were three answer options: “yes, limited a lot”, “yes, limited a little”, and “no, not limited at all”.

The MCID was calculated using an anchor-based method that anchors the change in ASES score from baseline to 2-year followup to the satisfaction scores and the SF-12 survey score [5, 10]. The anchor-based MCID is obtained by subtracting the mean change in ASES score of those reporting “no change” or “somewhat dissatisfied” from the mean change in ASES score of those reporting “somewhat satisfied”. Patients who reported they were “very dissatisfied” were not included in the analysis, as they represented neither minimal improvement nor no change/slight worsening. The anchor-based substantial clinical benefit (SCB) was calculated similarly by subtracting the mean change in ASES score of those reporting “no change” or “somewhat dissatisfied” from the mean change in ASES score of those reporting “very satisfied”. In other words, patients who reported being “somewhat satisfied” were deemed to have minimal improvement, whereas patients who reported being “very satisfied” were deemed to have substantial improvement. Those reporting “no change” or “somewhat dissatisfied” were considered to have experienced no improvement [17]. The MCID and SCB were calculated for patients having TSAs and RSAs in isolation and as a combined cohort.
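The anchor-based calculation for a satisfaction anchor can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name `anchor_based_thresholds`, the record layout, and all data values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Anchor responses that form the reference (no improvement) group.
REFERENCE = {"no change", "somewhat dissatisfied"}


def anchor_based_thresholds(records: List[Tuple[str, float]]) -> Dict[str, float]:
    """Compute anchor-based MCID and SCB from (anchor response, ASES change) pairs.

    MCID = mean change of "somewhat satisfied" minus mean change of the reference group.
    SCB  = mean change of "very satisfied" minus mean change of the reference group.
    """
    groups: Dict[str, List[float]] = {"reference": [], "minimal": [], "substantial": []}
    for response, change in records:
        if response in REFERENCE:
            groups["reference"].append(change)
        elif response == "somewhat satisfied":
            groups["minimal"].append(change)
        elif response == "very satisfied":
            groups["substantial"].append(change)
        # "very dissatisfied" matches no branch and is therefore excluded.
    reference_mean = mean(groups["reference"])
    return {
        "MCID": mean(groups["minimal"]) - reference_mean,
        "SCB": mean(groups["substantial"]) - reference_mean,
    }


# Hypothetical change scores (2-year ASES minus baseline ASES), not study data.
records = [
    ("no change", 10.0), ("somewhat dissatisfied", 14.0),
    ("somewhat satisfied", 20.0), ("somewhat satisfied", 24.0),
    ("very satisfied", 40.0), ("very satisfied", 50.0),
    ("very dissatisfied", -5.0),
]
result = anchor_based_thresholds(records)
print(result)  # {'MCID': 10.0, 'SCB': 33.0}
```

In this made-up example the reference group improves by a mean of 12 points, so a 22-point mean change among "somewhat satisfied" patients yields an MCID of 10, and a 45-point mean change among "very satisfied" patients yields an SCB of 33.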

Statistical Analysis

Univariate analyses of patient-related factors, including demographic and clinical characteristics, were performed using logistic regression with the single risk factor of interest. Multivariate logistic regression of patient-related factors that influence MCID and SCB achievement was completed using forward stepwise selection of the risk factors with a p value less than 0.10 in the univariate analyses. For the regression analysis, a p value less than 0.05 was considered statistically significant. Only MCID and SCB achievement anchored to overall satisfaction was used as the outcome in the multivariate analysis, as this was the most shoulder-specific anchor question.

Results

The MCID of the ASES for the combined cohort ranged from 6.3 (95% CI, −2.3 to 15.0) to 13.5 (95% CI, 4.8–22.3) across all four anchors; the SCB ranged from 12.0 (95% CI, 6.0–18.0) to 36.6 (95% CI, 29.1–44.1) (Table 2). The MCID of the ASES for the TSA cohort ranged from 3.1 (95% CI, −4.9 to 11.1) to 16.1 (95% CI, 5.4–26.7) across all four anchors; the SCB ranged from 7.4 (95% CI, −0.2 to 15.1) to 37.4 (95% CI, 28.6–46.3) (Table 3). The MCID of the ASES for the RSA cohort ranged from 6.2 (95% CI, −6.3 to 18.7) to 13.9 (95% CI, 3.5–24.2) across all four anchors; the SCB ranged from 14.3 (95% CI, 3.6–25.1) to 32.1 (95% CI, 26.9–37.2) (Table 4).

Table 2 MCID and SCB calculations for TSA and RSA combined
Table 3 MCID and SCB calculations for TSA
Table 4 MCID and SCB calculations for RSA

There were no statistically significant differences in the MCID of the ASES score between conventional TSA and RSA for the “work satisfaction” anchor (6.3 ± 5.7 versus 6.2 ± 6.4; mean difference, 0.1; 95% CI, −19.1 to 19.3; p = 0.992), “activity satisfaction” anchor (9.0 ± 4.9 versus 8.9 ± 6.3; mean difference, 0.1; 95% CI, −17.3 to 17.5; p = 0.991), “overall satisfaction” anchor (16.1 ± 5.4 versus 8.4 ± 2.9; mean difference, 7.7; 95% CI, −11.1 to 26.6; p = 0.418), or “SF-12 moderate activities” anchor (3.1 ± 4.1 versus 13.9 ± 5.3; mean difference, −10.8; 95% CI, −25.8 to 4.2; p = 0.159). There also were no statistically significant differences in the SCB of the ASES score between conventional TSA and RSA for the “work satisfaction” anchor (21.6 ± 4.8 versus 19.6 ± 5.1; mean difference, 2.0; 95% CI, −20.7 to 24.7; p = 0.862), “activity satisfaction” anchor (19.2 ± 4.1 versus 18.9 ± 5.8; mean difference, 0.3; 95% CI, −19.8 to 20.4; p = 0.977), “overall satisfaction” anchor (37.4 ± 4.5 versus 32.1 ± 2.6; mean difference, 5.3; 95% CI, −15.4 to 26.0; p = 0.614), or “SF-12 moderate activities” anchor (7.4 ± 3.9 versus 14.3 ± 5.5; mean difference, −6.9; 95% CI, −25.5 to 11.7; p = 0.467).

For the overall satisfaction anchor in the combined TSA and RSA cohort, higher preoperative ASES score (odds ratio [OR], 0.96; 95% CI, 0.94–0.98; p < 0.001) and undergoing RSA compared with TSA (OR, 0.36; 95% CI, 0.16–0.85; p = 0.016) remained independent predictors of not achieving an MCID for the ASES after shoulder arthroplasty in multivariate analysis (Table 5). Higher preoperative ASES score (OR, 0.91; 95% CI, 0.94–0.98; p < 0.001), a diagnosis of rotator cuff tear arthropathy (OR, 0.14; 95% CI, 0.07–0.30; p < 0.001), living alone (OR, 0.36; 95% CI, 0.19–0.69; p = 0.002), and a comorbidity of back pain (OR, 0.42; 95% CI, 0.24–0.71; p = 0.002) were independent predictors of not achieving SCB for the ASES after shoulder arthroplasty (Table 5). Preliminary univariate analysis for the overall satisfaction anchor also identified BMI, high school education only, a diagnosis of cuff tear arthropathy, and diabetes as potential risk factors for not achieving the MCID, but these factors dropped out in multivariate analysis (Table 6). Preliminary univariate analysis for the overall satisfaction anchor also identified a torn rotator cuff, lung disease, diabetes, and a general diagnosis of osteoarthritis as potential risk factors for not achieving SCB, but these factors dropped out in multivariate analysis (Table 7).
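To give a sense of scale for the preoperative ASES odds ratio for MCID achievement, the sketch below rescales an odds ratio to a larger score difference. It assumes that the reported OR of 0.96 applies per one-point increase in preoperative ASES score and that the effect is linear on the log-odds scale; the text does not state the unit explicitly, so both points are assumptions, and the helper `scale_odds_ratio` is hypothetical.

```python
import math


def scale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Rescale an odds ratio from a 1-unit to a `units`-unit difference.

    Under a logistic model the log-odds change linearly with the predictor,
    so the odds ratio for k units is exp(k * ln(OR)), that is, OR ** k.
    """
    return math.exp(units * math.log(or_per_unit))


# Assumption: OR 0.96 (95% CI, 0.94-0.98) is per 1-point higher preoperative ASES.
or_10 = scale_odds_ratio(0.96, 10)
ci_10 = (scale_odds_ratio(0.94, 10), scale_odds_ratio(0.98, 10))

print(round(or_10, 2), tuple(round(x, 2) for x in ci_10))  # 0.66 (0.54, 0.82)
```

Under these assumptions, a patient whose preoperative ASES score is 10 points higher would have roughly two-thirds the odds of achieving the MCID.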

Table 5 Multivariate analysis of patient-related factors that influence achievement of MCID and SCB
Table 6 Univariate analyses of patient-related factors that influence MCID
Table 7 Univariate analyses of patient-related factors that influence SCB

Discussion

The ASES questionnaire was developed to provide a standardized method for evaluating shoulder function. Previous studies have determined the clinical responsiveness of this outcome measure for heterogenous populations or nonoperatively treated rotator cuff disease [7, 17]. Currently, to our knowledge, no studies exist which establish the clinically relevant change in the ASES score after shoulder arthroplasty. Therefore, we aimed to find the MCID and SCB for the ASES score after shoulder arthroplasty and to find patient factors that are associated with not achieving measurable improvement.

This study had several limitations. A major limitation is the substantial loss to followup, as 38% of eligible patients were excluded owing to incomplete followup data. Patients who do not return for followup often have worse outcomes. We have shown that the included study patients and those lost to followup have similar baseline parameters, with the exception of a statistically, but not clinically, significant difference in baseline ASES score. Second, the study used several satisfaction-based anchors. In general, satisfaction measures are not validated. However, the MCID and SCB based on the satisfaction anchors generally were concordant with the other metrics used in this study, which improves their face validity.

Next, the study is a retrospective review of a longitudinally maintained database, and thus is subject to selection bias, transfer bias, and assessment bias. The patients included are from one institution, which introduces selection bias and reduces the generalizability of the findings to other settings; however, the surgeries were performed by a group of more than 20 different surgeons, which reduces the impact of this effect. As mentioned previously, we had a substantial loss to followup, which introduces transfer bias. The included and excluded patients had similar baseline demographics, which mitigates this effect somewhat, but the reported MCID and SCB could still underestimate the true values owing to this followup loss. Finally, inclusion of nonvalidated metrics such as patient satisfaction introduces assessment bias, although, as noted, the satisfaction anchors generally were concordant with the other metrics used in this study. The statistical associations described in the study should not be interpreted as causative, as confounding variables are possible in this type of analysis; we attempted to control for them using multivariate logistic regression. Although our study population was large, the absence of an association between some factors and achievement of an MCID or SCB could be attributable to a lack of statistical power. This was a convenience sample from a registry database, so we could not add more patients to improve our power. In particular, patients of nonwhite race, patients with lower educational status, and patients with certain medical comorbidities were not present in high numbers, highlighting the importance of future multicenter studies to explore the effect of these factors on clinical improvement after shoulder arthroplasty.

We determined the MCID of the ASES score for a population of patients who underwent shoulder arthroplasty and found a value between those reported in previous studies for other shoulder disorders [7, 17]. We also report the SCB which, to our knowledge, has not been reported previously for the ASES. Two previous studies reported the MCID of the ASES score for different populations. Michener et al. [7] calculated the MCID and the minimal detectable change of the ASES in a population of 63 patients with shoulder dysfunction using a similar anchor-based method, anchored to a single question evaluating global rating of change. They reported a minimal detectable change of 9.7 ASES points and an MCID of 6.4 ASES points, concluding that the ASES was a responsive outcome tool for varied shoulder disorders. However, the study population was heterogeneous, including patients treated nonoperatively and operatively with at least eight different shoulder-related diagnoses, including impingement, instability, rotator cuff disease, adhesive capsulitis, fracture, generalized weakness, and multiple types of previous surgery [7]. The study design used by Michener et al. [7] was improved in a more recent study by Tashjian et al. [17], in which the MCID of the ASES was determined for a homogeneous population of 81 patients who were treated with nonoperative modalities for rotator cuff disease. Tashjian et al. [17] concluded that a 12- to 17-point change in the ASES score represented a minimal clinically important change for this study population. Our study advances on these prior investigations by using a population of patients who underwent shoulder arthroplasty and by reporting the SCB in addition to the MCID.

We found no difference in the MCID or SCB between anatomic TSA and RSA across all four anchors for the indications in our cohorts, which is important when evaluating clinical results and critically evaluating outcomes in published series. No previous studies, to our knowledge, have compared the MCID of the ASES or other shoulder-specific scores between types of primary arthroplasty. In our cohort, patients who underwent RSA had a lower likelihood of achieving an MCID than those who underwent TSA for the indications studied. Although the majority of patients who underwent RSA improved postoperatively, the improvement was not as robust as that of patients who underwent TSA in multivariate analysis.

Higher baseline ASES scores were associated with not achieving an MCID and SCB after shoulder arthroplasty. Factors associated with achievement of clinically important change after shoulder arthroplasty have not been reported using the ASES as an outcome measure, but one prior study investigated this question using the Simple Shoulder Test for patients who underwent RSA [3]. For the 74 patients who underwent RSA for massive rotator cuff tears and were included in that study, Hartzler et al. [3] found that young age (< 60 years), preoperative upper extremity neurologic dysfunction, and high preoperative function, as evidenced by a high preoperative Simple Shoulder Test score, were associated with failure to achieve an MCID postoperatively. Using the ASES score as an endpoint, we found similarly that high preoperative scores portend a lower likelihood of achieving an MCID and SCB after shoulder arthroplasty. This finding is important for surgical decision-making and for counseling patients before shoulder arthroplasty, as patients with high levels of preoperative function may not experience as marked a benefit from the procedure as patients with lower levels of reported functioning. Our study was not designed to determine a threshold preoperative ASES score above which shoulder arthroplasty might not be of substantial clinical benefit to a patient, but this would be a valuable topic for further investigation.

Patients with glenohumeral arthritis or rotator cuff tear arthropathy who undergo primary conventional TSA or RSA and have at least a nine-point improvement in their ASES score experience a clinically important change, whereas patients who have at least a 23-point improvement in their ASES score experience a substantial clinical benefit. High preoperative function was associated with a decreased likelihood of achieving a clinically important change after TSA.