Introduction

The US healthcare system is beginning to shift from a fee-for-service to a value-based model, in which value is defined as an improvement in an outcome of interest per healthcare dollar spent [20]. Consequently, there is increasing attention on factors associated with adverse events or unplanned readmission. These represent an increased financial burden and avoidable use of limited resources [4, 13, 24]. Alternative payment models such as Medicare’s Bundled Payment for Care Improvement initiative and Comprehensive Care for Joint Replacement, which provide fixed reimbursement for an entire episode of care (such as THA and TKA and, more recently, hip fracture), are helping to align payer and provider financial incentives toward minimizing complications [9].

Surgeons are responsible for understanding the risk factors that lead to adverse events and unplanned readmissions, ensuring that modifiable factors are optimized before surgery and identifying factors that make discretionary surgery unwise. Studies have assessed the impact of risk factors on patient adverse events and unplanned readmission after hip and knee arthroplasties [7, 8, 19, 23], elbow arthroplasty [10, 17], and shoulder arthroplasty [3, 18, 28], but surgeons seem to use their judgment rather than screening tools based on evidence [6, 25]. A study comparing intuitive risk factors with those identified by an analysis of a large database might help surgeons understand the limits of expert impressions and daily judgment. An awareness of the limits of expert judgment could make a preoperative screening algorithm for appropriateness and risk modification an appealing part of perioperative care.

We therefore asked: (1) Does a statistically driven model better explain the variation in unplanned readmission within 30 days of discharge when compared with an a priori five-variable model selected based on expert orthopaedic surgeon opinion? (2) Does a statistically driven model better explain the variation in adverse events within 30 days of discharge when compared with an a priori five-variable model selected based on expert orthopaedic surgeon opinion?

Materials and Methods

Data from the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) from 2011 to 2014 were used in this study [1]. The annual data sets document all inpatient surgical procedures, of which orthopaedic surgery is a subset. Currently, 765 US hospitals participate in the ACS NSQIP [2]. Current Procedural Terminology (CPT) codes were used to identify adult (aged 18 years and older) individuals within the data set who underwent primary total shoulder arthroplasty (TSA) or reverse shoulder arthroplasty (CPT 23472), and International Classification of Diseases, Ninth Revision (ICD-9) codes were used to identify those with a diagnosis of osteoarthritis (ICD-9 715.XX) as the primary etiology. Patients with revision shoulder arthroplasty were excluded. The NSQIP was chosen because its data are based on expert review of the medical record rather than on administrative or billing records.

A total of 4030 patients met the initial inclusion criteria. There was a notable number of patients who were infirm or had a major metabolic deviation that would seem to make discretionary surgery unwise. Appropriateness in patient selection is a key aspect of safe and effective discretionary surgery. By limiting the analysis to patients meeting medical appropriateness criteria, we obtain data more applicable to a preoperative assessment protocol rather than just documenting the expected results of operating on patients who are infirm or unstable. Based on the following criteria that we felt would make a substantial, discretionary, quality-of-life surgery such as TSA unwise, 870 of 4030 patients (22%) were excluded from the data set: American Society of Anesthesiologists (ASA) physical status classification 4 (n = 91); inpatient transfer from another facility (n = 20); dyspnea with mild exertion (n = 245); dyspnea at rest (n = 14); open wound or infection (n = 22); need for transfusion preoperatively (n = 5); congestive heart failure (n = 13); disseminated cancer (n = 2); dialysis-dependent (n = 19); recent weight loss (> 10% in the last 6 months) (n = 6); renal failure (n = 4); functional status (fully dependent) (n = 3); high white blood cell count (> 10,000/μL) (n = 252); low hematocrit (< 30%) (n = 30); high bilirubin (> 1.9 mg/dL) (n = 6); low albumin (< 3.4 g/dL) (n = 60); systemic inflammatory response syndrome (n = 0); sepsis (n = 0); septic shock (n = 0); low sodium (< 135 mEq/L) (n = 219); high sodium (> 145 mEq/L) (n = 43); wound classification: contaminated or dirty/infected (n = 15); and ascites (n = 1). Patients may have had more than one risk factor that made discretionary surgery unwise.
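The appropriateness screen described above can be sketched as a set of rules applied to each patient record. The following is a minimal illustration, not the code used in the study: the field names (`asa_class`, `wbc_per_ul`, and so on) are hypothetical rather than NSQIP variable names, only a subset of the listed criteria is shown, and treating a missing laboratory value as not meeting a criterion is an assumption made for the example.

```python
# Illustrative sketch only: field names are hypothetical, not NSQIP variable names.
# Cutoffs are the ones listed in the exclusion criteria in the text.

def _above(field, cutoff):
    # A missing value is treated as not meeting the criterion (an assumption).
    return lambda p: p.get(field) is not None and p[field] > cutoff

def _below(field, cutoff):
    return lambda p: p.get(field) is not None and p[field] < cutoff

# Subset of the exclusion criteria; the remaining ones would follow the same pattern.
EXCLUSION_RULES = {
    "ASA class 4": lambda p: p.get("asa_class") == 4,
    "high white blood cell count": _above("wbc_per_ul", 10_000),
    "low hematocrit": _below("hematocrit_pct", 30),
    "high bilirubin": _above("bilirubin_mg_dl", 1.9),
    "low albumin": _below("albumin_g_dl", 3.4),
    "low sodium": _below("sodium_meq_l", 135),
    "high sodium": _above("sodium_meq_l", 145),
}

def exclusion_reasons(patient):
    """Return the names of every exclusion rule the patient meets."""
    return [name for name, rule in EXCLUSION_RULES.items() if rule(patient)]

def split_cohort(patients):
    """Split records into (appropriate, excluded) lists."""
    appropriate, excluded = [], []
    for patient in patients:
        (excluded if exclusion_reasons(patient) else appropriate).append(patient)
    return appropriate, excluded

# Made-up records for illustration.
example_patients = [
    {"id": "A", "asa_class": 2, "sodium_meq_l": 140, "wbc_per_ul": 7_000},
    {"id": "B", "asa_class": 3, "sodium_meq_l": 133, "wbc_per_ul": 11_500},
]
appropriate, excluded = split_cohort(example_patients)
```

Because a patient can meet more than one criterion, `exclusion_reasons` returns a list. This is why the per-criterion counts in the text need not sum to the number of excluded patients.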

We identified 3160 surgical procedures matching the full inclusion criteria. The average patient age was 69 years (SD, 9.5 years) with women accounting for just over half of the patients (51%) (Table 1). The average body mass index (BMI) was 31 kg/m2 (SD, 6.7 kg/m2). Nearly two-thirds of the patients (65%) had hypertension requiring medications.

Table 1 Patient characteristics, 2011–2014

Statistical Analysis

The two dependent variables of interest were (1) unplanned readmission and (2) adverse event within 30 days of discharge after a TSA. Because our dependent variables were dichotomous, bivariate logistic regression was conducted for all independent variables (the risk factors we considered) to determine odds ratios and significance.
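For a single dichotomous risk factor, the odds ratio from a bivariate logistic regression equals the cross-product ratio of the 2 × 2 table, and the Wald standard error of its logarithm is the square root of the summed reciprocal cell counts. The sketch below illustrates that relationship with made-up counts. The function name and the numbers are hypothetical and are not taken from the study data.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z_crit=1.96):
    """Odds ratio and Wald 95% CI from a 2 x 2 table.

    a: exposed, event        b: exposed, no event
    c: unexposed, event      d: unexposed, no event
    All four cells must be nonzero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z_crit * se_log_or)
    high = math.exp(math.log(odds_ratio) + z_crit * se_log_or)
    return odds_ratio, low, high

# Made-up counts: 20 of 100 exposed patients and 10 of 100 unexposed
# patients have the outcome.
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 80, 10, 90)
```

For continuous risk factors such as age or operating time there is no 2 × 2 table, and the odds ratio per unit increase has to be estimated by fitting the logistic model.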

After the bivariate analyses, four risk-adjustment models were developed for this study. Two models were created with the data containing all patients and two models were created with the patients unfit for elective surgery excluded. In each set of models, one was based on an a priori selection of five risk factors agreed on by consensus of a panel of orthopaedic surgeons. The five risk factors selected for the all patients clinically driven model were: age, ASA classification ≥ 3, BMI, present smoker, and diabetes mellitus. The five risk factors selected for the appropriate patient clinically driven model were: age, ASA classification 3, BMI, present smoker, and diabetes mellitus. Each full model used the results of bivariate logistic regression to determine which risk factors were to be included in the multivariable regression (Table 2). Variables with odds ratios that were significant at the p < 0.10 level were included. Pseudo R2 values were reported for all four models and for each risk factor. The area under the receiver operating characteristic curve (c-statistic) and Hosmer-Lemeshow statistic were calculated to ensure appropriate model performance. A c-statistic of 0.5 indicates the model is no better than chance, whereas a c-statistic of 1.0 indicates the model perfectly predicts the outcome.
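The two performance measures can be illustrated with a short sketch. The c-statistic is computed as the proportion of event/nonevent pairs in which the patient with the event received the higher predicted probability, with ties counted as one-half. The text does not state which pseudo R2 was used, so the McFadden definition below is an assumption. The function names and example values are hypothetical.

```python
import math

def c_statistic(outcomes, predicted):
    """Proportion of (event, nonevent) pairs ranked correctly; ties count 0.5."""
    events = [p for y, p in zip(outcomes, predicted) if y == 1]
    nonevents = [p for y, p in zip(outcomes, predicted) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_nonevent in nonevents:
            if p_event > p_nonevent:
                concordant += 1.0
            elif p_event == p_nonevent:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

def log_likelihood(outcomes, predicted):
    """Bernoulli log-likelihood; predicted values must lie strictly in (0, 1)."""
    return sum(
        math.log(p) if y == 1 else math.log(1 - p)
        for y, p in zip(outcomes, predicted)
    )

def mcfadden_pseudo_r2(outcomes, predicted):
    """1 - LL(model) / LL(intercept-only model). Assumed definition."""
    base_rate = sum(outcomes) / len(outcomes)
    ll_null = log_likelihood(outcomes, [base_rate] * len(outcomes))
    return 1 - log_likelihood(outcomes, predicted) / ll_null

# Made-up outcomes and predicted probabilities for four patients.
y = [0, 0, 1, 1]
p_hat = [0.10, 0.40, 0.35, 0.80]
c_value = c_statistic(y, p_hat)          # 3 of 4 pairs are concordant
r2_value = mcfadden_pseudo_r2(y, p_hat)
```

A model that assigns every patient the same probability yields a c-statistic of 0.5. If that probability is the overall event rate, the McFadden pseudo R2 is 0, which matches the "no better than chance" reference point described above.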

Table 2 Bivariate analysis of risk factors for adverse events or unplanned readmission

A post hoc power analysis based on a chi-square test determined that 3160 patients provided 99% power to detect a variable explaining 30% of the variability in adverse events or unplanned readmission with an α of 0.05.

Results

The statistically driven model (pseudo R2 = 0.046) better explained the variation in unplanned readmission within 30 days of discharge after a TSA compared with the clinically driven model (pseudo R2 = 0.014) (Table 3). The statistically driven model included eight factors: operating time (hours) (odds ratio [OR], 1.26; 95% confidence interval [CI], 1.04–1.53); hypertension requiring medications (OR, 1.95; 95% CI, 1.01–3.76); age (OR, 1.02; 95% CI, 0.99–1.05); men (OR, 1.60; 95% CI, 0.94–2.71); ASA classification 3 (OR, 1.21; 95% CI, 0.71–2.04); high blood urea nitrogen (> 30 mg/dL) (OR, 2.13; 95% CI, 0.78–5.77); high creatinine (1.3 mg/dL) (OR, 1.30; 95% CI, 0.54–3.16); and low platelets (< 150,000/μL) (OR, 2.14; 95% CI, 0.98–4.65). The statistically driven model performed better than the clinically driven model for unplanned readmission (c-statistic: 0.64 versus 0.61). The statistically driven model with all patients (pseudo R2 = 0.046; c-statistic, 0.68), including those deemed unfit for elective surgery, explained more of the variation and performed better than the clinically driven model (pseudo R2 = 0.018; c-statistic, 0.63) (Appendix 1 [Supplemental materials are available with the online version of CORR ®.]). In the clinically driven all patients model, age (OR, 1.03; 95% CI, 1.01–1.06) and ASA classification ≥ 3 (OR, 1.68; 95% CI, 1.07–2.63) were significantly associated with unplanned readmission. In the statistically driven all patients model, high sodium (> 145 mEq/L) (OR, 6.73; 95% CI, 2.53–17.93), age (OR, 1.03; 95% CI, 1.01–1.05), operating time (hours) (OR, 1.25; 95% CI, 1.03–1.51), and low sodium (< 135 mEq/L) (OR, 2.01; 95% CI, 1.02–3.98) were associated with unplanned readmission.
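To relate the reported odds ratios and confidence intervals to statistical significance, the log odds ratio and its standard error can be approximately recovered from the published values. The sketch below assumes the reported intervals are Wald intervals, symmetric on the log scale. The function name is hypothetical, and the results are approximate because the published values are rounded.

```python
import math

def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover (beta, standard error, z, two-sided p) from an OR and its 95% CI.

    Assumes a Wald interval: exp(beta - z_crit * se) to exp(beta + z_crit * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_value

# Values reported in the text for the unplanned readmission model.
operating_time = wald_from_or(1.26, 1.04, 1.53)  # CI excludes 1
age = wald_from_or(1.02, 0.99, 1.05)             # CI includes 1
```

An interval that excludes 1, as for operating time, corresponds to a two-sided p value below 0.05. An interval that includes 1, as for age in this model, corresponds to a p value above 0.05.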

Table 3 Factors independently associated with an unplanned readmission without inappropriate patients

The statistically driven model (pseudo R2 = 0.033) better explained the variation in adverse events within 30 days postdischarge than the clinically driven model (pseudo R2 = 0.0095). The statistically driven model included six factors: age (OR, 1.03; 95% CI, 1.01–1.06); men (OR, 1.64; 95% CI, 1.05–2.57); operating time (hours) (OR, 1.27; 95% CI, 1.07–1.52); high blood urea nitrogen (> 30 mg/dL) (OR, 3.12; 95% CI, 1.35–7.21); bleeding disorder (OR, 2.25; 95% CI, 0.87–5.82); and high creatinine (1.3 mg/dL) (OR, 0.98; 95% CI, 0.42–2.26). The statistically driven model with all patients (pseudo R2 = 0.061; c-statistic, 0.69), including those deemed unfit for elective surgery, explained more of the variation and performed better than the clinically driven model (pseudo R2 = 0.017; c-statistic, 0.62) (Appendix 2 [Supplemental materials are available with the online version of CORR ®.]). In the clinically driven all patients model, age (OR, 1.04; 95% CI, 1.01–1.06) was significantly associated with adverse events. In the statistically driven all patients model, high sodium (> 145 mEq/L) (OR, 8.02; 95% CI, 3.53–18.20), high blood urea nitrogen (> 30 mg/dL) (OR, 2.96; 95% CI, 1.50–5.82), age (OR, 1.03; 95% CI, 1.01–1.05), and operating time (hours) (OR, 1.25; 95% CI, 1.05–1.48) were associated with adverse events.

Discussion

There is a growing incidence of upper extremity procedures in the United States [14]. As shown by Virani et al. [27], adverse events after TSA can cost an average of USD 14,676. In an era focused on improving surgical quality and decreasing costs, adequate risk stratification in the selection and management of patients considering TSA is important. As noted by several authors, although many randomized controlled trials and retrospective case-control studies have identified a number of risk factors for complications after TSA, orthopaedic surgeons seem to utilize experience and clinical intuition rather than statistically driven models as a care-redesign strategy [21, 26]. To better understand the gap (if any) between risk stratification based on clinical experience versus data-driven models, we compared the predictive ability of an a priori determined risk model (total of five variables) based on expert orthopaedic surgeon opinion and a statistically driven model (based on a large, nationally representative data set) with respect to unplanned readmission and severe adverse events within 30 days postdischarge after TSA.

Our study results have some limitations. First, we utilized an a priori clinically driven model of only five variables. We believe these to be the most intuitive risk factors based on consensus of a panel of experienced orthopaedic surgeons; however, we acknowledge that not all orthopaedic surgeons would agree. Second, 22% of patients were deemed inappropriate for TSA based on comorbidities that we felt made it unwise for the patient to have a discretionary surgical procedure. We omitted these patients because we felt a two-step process would better inform daily practice: (1) assess patients for appropriateness; and (2) assess risk factors among appropriate patients. It is important to note that even when all patients, including those generally deemed unfit for elective surgery, were included in clinically and statistically driven unplanned readmission and adverse event models, the statistically driven model explained more of the variation and performed better (Appendices 1 and 2). Third, this analysis cannot distinguish between standard TSA and reverse TSA. Fourth, we are limited in our analysis by the use of a large database and its accuracy. However, studies have shown that the ACS NSQIP database, which has its data gathered from patient medical charts, provides better accuracy than claims databases [15, 16].

Our analysis of 30-day unplanned readmission found that the statistically driven model (pseudo R2 = 0.046) explained more of the variation in unplanned readmissions than the clinically driven model (pseudo R2 = 0.014), although neither model explained much of the variation. This suggests that additional factors are likely affecting unplanned readmission rates, which is important to keep in mind for surgeons who are looking to drastically decrease unplanned readmissions by analyzing the risk factors noted in our models. Although the clinically driven model found no risk factors for 30-day unplanned readmission, the statistical model identified operating time and hypertension requiring medications as independent predictors, which are potentially modifiable provider and patient risk factors, respectively. Several studies have reported on risk factors for unplanned readmission after TSA (Table 4) [5, 11, 12, 22]. However, the results are inconsistent (eg, both men and women have been shown to have increased risk of unplanned readmission), and although our finding for hypertension confirms a previously known risk factor, others have not noted operating time as a risk factor for unplanned readmission. It is important to note that although our study suggests that upper extremity orthopaedic surgeons should minimize operative time, a confounding scenario in our analysis is that patients who sustained intraoperative complications had longer procedures and a subsequently higher readmission risk. Furthermore, our analysis differs from those previously completed because patients unfit for discretionary TSA were removed from the analysis. Although the statistically driven model explained slightly more variation in unplanned readmission, neither model explained that variation very well. Thus, additional factors not captured by the ACS NSQIP database are likely playing a major role in patients who have unplanned readmission after TSA. However, the fact that the statistically driven model explained more variation and performed better suggests that surgeons should be aware that their clinical intuition may not be as accurate as they expect.

Table 4 Factors independently associated with an adverse event without inappropriate patients

The clinically driven adverse event model (pseudo R2 = 0.0095) explained less of the variation in adverse events than the statistically driven model (pseudo R2 = 0.033), although neither model explained a great deal of variation. Many studies have addressed the risk factors for adverse events and found that age, operating time, and male sex were correlated with increased risk of adverse events [3, 12, 28, 29]. Although we also found that men were more likely to have adverse events after a TSA, a previous study found that sex was not correlated with adverse events [10]. As in the unplanned readmission analysis, our study differs from others because of our exclusion of patients unfit for elective TSA. Our study further reinforces that statistically driven models explain greater variation in adverse events and perform better than clinically driven models. Because of this finding, surgeons should remain informed of adverse event risk factors determined by large database analyses and consider them in their practices. However, because neither model explained a great deal of the variation in adverse events, surgeons should take note that additional risk factors not available in the ACS NSQIP database may play a large role in explaining that variation.

Our work reinforces the value of using large databases to estimate risk in addition to clinical intuition. Although neither the clinically driven nor the statistically driven models explained a large amount of the variation in unplanned readmission or adverse events, the statistically driven models still explained more variation and performed better (ie, higher c-statistic). Surgeons could benefit from considering risk factors determined in our analyses and those conducted with other large data sets when planning TSAs. Perhaps most important is the fact that 22% of patients undergoing TSAs were excluded because of what we would consider contraindications to major discretionary surgery for osteoarthritis. Indeed, models with all patients, including those with risk factors making TSA unwise, explained greater variation and performed better than all other models. However, this is expected given that such risk factors are known to increase unplanned readmission and adverse event rates. Further exploration could seek to better understand when such surgical interventions are being performed. Checklists and screening procedures can emphasize that discretionary surgery is best for healthy, low-risk patients and provide all patients with opportunities for increased comfort and function including improved mood and resiliency. Patients with major medical risk might think surgery is their only hope and be willing to take substantial risks, but it is important that the care team ensure that such a determination is not based on common misconceptions and is not the expression of stress or distress that will not be well addressed by surgery. In any case, keeping high-risk, arguably inappropriate patients in a risk analysis might produce risk calculators that do not apply as well to the type of patient who is most appropriate for discretionary surgery.