INTRODUCTION

The Hospital Readmissions Reduction Program reduces Medicare payments to hospitals with excess readmissions, and thus there is great urgency to identify patients at high risk of being readmitted to the hospital.1 However, predicting which patients are likely to be readmitted remains a challenge, and most current readmission risk prediction models have room for improvement.2,3,4 Models based on patient-level factors such as demographics and medical comorbidities perform better at predicting mortality than readmission risk, while models incorporating other factors such as social support and functional status are better at predicting readmission risk.2 Unfortunately, because these latter measures are not available in administrative databases, their use in readmission models remains limited.

Prior studies using data from the linked Health and Retirement Study and Medicare data have shown that dependency in activities of daily living (ADLs) is associated with increased risk of hospital readmissions in older adults on Medicare.5, 6 Studies have also documented the association between re-hospitalization and social factors, including lack of social support and living arrangement,7,8,9 as well as psychiatric conditions7 and cognitive impairment.10

The association between instrumental activities of daily living (IADL) impairment and readmission is less established. IADLs represent abilities that make it possible for an individual to live independently in the community, as opposed to ADLs, which represent more basic functioning. IADL limitations include difficulties moving around the community, managing money, preparing meals, shopping for groceries and other necessities, and taking prescribed medications. Though the link between cognitive impairment, self-management limitations,11 and dependency in IADLs12 has been established in heart failure patients, hospital readmissions have rarely been examined in the context of dependency in IADLs, and the few previous studies have yielded mixed results. Arbaje et al.13 reported that the post-discharge environment, including having unmet functional needs with ADLs and IADLs, as well as lacking self-management skills, was associated with a greater likelihood of being readmitted to the hospital. On the other hand, Greysen et al.5 did not find an association between IADL limitations and hospital readmissions after adjusting for covariates.

Self-management is a core element in the chronic care model14 and is a key strategy for treating chronic illness and managing care transitions, such as the vulnerable period after hospital discharge.15 Self-management entails being able to perform different types of tasks, including but not limited to managing oral medications, recognizing symptoms, and communicating with the care team, and is critical for navigating the post-discharge period. Failure to perform these important tasks, which require IADL capacity, can be a source of dysfunction that leads to hospital readmission. If difficulty with IADLs were found to be associated with hospital readmission, screening for deficiencies and providing supportive or corrective services would make IADL deficits a modifiable and actionable risk factor.

The main objective of this study was to test the importance of IADL dependency as a predictor of hospital readmissions relative to other patient characteristics and health conditions using a machine learning approach.16,17,18

METHODS

This is a retrospective cross-sectional study using a nationally representative panel survey linked to administrative claims data. The study received IRB approval from the corresponding author’s home institution (IRB protocol # 2014-781).

Data Sources

The study used the 2002, 2004, 2006, 2008, and 2010 waves of the Health and Retirement Study (HRS) and the linked CMS-Medicare claims over the period from 2002 to 2011. The HRS is a nationally representative biennial longitudinal panel survey of individuals age 50 and older in the USA that collects information through self-report on topics including health status, chronic disease, cognitive ability, and physical functioning.19 African-Americans, Hispanics, and Floridians are oversampled. Proxy respondents (typically a spouse) provide information when a focal respondent is unable to be interviewed. Over 80% of Medicare-enrolled HRS respondents consented to record linkage.

Study Population

The study population included HRS respondents age 65 and older with linked fee-for-service Medicare claims. The unit of analysis for this study was each hospitalization (n = 20,007) that occurred during the study period, from 6617 unique people. Hospitalizations where the person died in-hospital or within 30 days of discharge, or left against medical advice, were excluded. For hospitalizations involving transfers, only the final discharge hospital was included in the analysis.

Dependent Outcome Variable

The main outcome was a 30-day readmission, which includes all-cause unplanned admissions within 30 days of a previous hospital discharge. ED visits and observation stays were not considered to be readmissions, consistent with the CMS definition of 30-day all-cause readmission.
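
The outcome definition above can be illustrated with a minimal sketch. It assumes the input already contains only inpatient stays for one person (ED visits, observation stays, and intermediate transfer stays removed) and does not attempt to separate planned from unplanned admissions; the function name and the tuple layout are hypothetical, not taken from the study's code.

```python
from datetime import date

def flag_30_day_readmissions(stays):
    """Flag index stays that are followed by a readmission within 30 days.

    stays: list of (admit_date, discharge_date) tuples for ONE person
           (hypothetical layout; inpatient stays only).
    Returns one boolean per stay, in chronological order.
    """
    ordered = sorted(stays)
    flags = []
    for i, (_, discharge) in enumerate(ordered):
        # The next admission, if any, is the candidate readmission.
        next_admit = ordered[i + 1][0] if i + 1 < len(ordered) else None
        flags.append(
            next_admit is not None and 0 <= (next_admit - discharge).days <= 30
        )
    return flags

example = [
    (date(2006, 1, 3), date(2006, 1, 8)),
    (date(2006, 1, 25), date(2006, 1, 29)),  # 17 days after the first discharge
    (date(2006, 6, 1), date(2006, 6, 4)),    # well outside the 30-day window
]
flags = flag_30_day_readmissions(example)
```

In this toy history only the first stay counts as an index hospitalization with a 30-day readmission.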

Independent Predictor Variables

The main predictor variable of interest was limitations in instrumental activities of daily living (IADLs), defined as having limitations in one or more of the following activities: managing money, shopping for groceries or other necessities, preparing meals, using the telephone or other communication, or managing medications.
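
As a small illustration of this composite definition, the sketch below derives the binary IADL indicator from five item-level responses. The item keys are hypothetical labels rather than HRS variable names, and treating a missing item as "no limitation" is an assumption made only for the example.

```python
# Hypothetical item labels; the HRS uses its own variable names.
IADL_ITEMS = (
    "managing_money",
    "shopping",
    "preparing_meals",
    "using_telephone",
    "managing_medications",
)

def has_iadl_limitation(response):
    """Return True if the respondent reports difficulty with one or more IADLs.

    response: dict mapping item label -> True when difficulty is reported.
    Missing items are treated as no reported difficulty (an assumption).
    """
    return any(response.get(item, False) for item in IADL_ITEMS)

limited = has_iadl_limitation({"managing_money": True, "shopping": False})
not_limited = has_iadl_limitation({"preparing_meals": False})
```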

Other covariates were those related to complex multimorbidity defined as the presence of chronic conditions, functional limitations, and/or geriatric syndromes.20 For chronic conditions, respondents were asked if they were ever told by a physician that they had hypertension, heart disease, lung disease (COPD), diabetes, stroke, arthritis, cancer, or psychiatric conditions. Additional questions were used to assess severity.16 Other functional limitations besides IADL included limitations in strength, upper-body mobility, lower-body mobility, and activities of daily living (ADL).21 Geriatric syndromes included hearing impairment (with use of hearing aid), vision impairment (with use of corrective lenses), cognitive impairment, urinary incontinence, moderate to severe depressive symptoms, and severe pain.22

Additional covariates included the following: 5-year age group, race/ethnicity, sex, household income as a ratio of the federal poverty level, years of education, marital status, BMI, smoking, alcohol use, and dual enrollment in both Medicare and Medicaid. Body mass index (BMI, measured as kg/m²) was characterized as underweight (BMI ≤ 18), normal/overweight (BMI of 18.5–30), and obese (BMI ≥ 30); in addition, 2% of respondents had missing values for BMI, which were treated as a separate category. All predictor variables were based on responses to the HRS survey wave immediately prior to the hospitalization.
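
Linking each hospitalization to the survey wave immediately prior to it can be sketched as a sorted lookup. The interview dates below are invented for illustration, and the helper name is hypothetical.

```python
from bisect import bisect_left
from datetime import date

def prior_wave(interview_dates, admit_date):
    """Return the latest interview date strictly before the admission date.

    interview_dates: chronologically sorted list of one respondent's
                     interview dates (hypothetical input).
    Returns None when no interview precedes the admission.
    """
    i = bisect_left(interview_dates, admit_date)
    return interview_dates[i - 1] if i > 0 else None

interviews = [date(2002, 5, 10), date(2004, 4, 22), date(2006, 6, 1)]
linked = prior_wave(interviews, date(2005, 11, 3))
```

Here the admission in November 2005 is matched to the 2004 interview, so the covariates are about 18 months old at the time of the index admission.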

Statistical Analysis

Several different machine learning methods were used to accomplish the objectives of this study, the details of which are described below. The outcome for all models was 30-day readmission, and all exposure/covariate variables listed above were included as candidate predictor variables in every model unless otherwise specified. We used a random sample of 80% of our data to train each model,23 while the remaining 20% was used to test predictive ability in terms of the area under the receiver operating characteristic curve (AUC). R version 3.6.1 and SAS v 9.4 were used for the analysis.
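
A minimal sketch of the 80/20 split and the AUC calculation follows. It splits at the row (hospitalization) level, which is an assumption since the text does not say whether people or hospitalizations were sampled, and it computes the AUC by direct pairwise comparison, which is simple but slow for large samples. The analysis itself was run in R and SAS; this is only an illustration.

```python
import random

def train_test_split(n, train_frac=0.8, seed=0):
    """Randomly split row indices 0..n-1 into training and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_frac * n))
    return idx[:cut], idx[cut:]

def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Pairwise O(n_pos * n_neg) version for clarity.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

train_idx, test_idx = train_test_split(20007)
toy_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of readmissions from non-readmissions.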

Random forest analysis was used to measure the importance of each variable in terms of the amount of information it provides in predicting the outcome of interest. The random forest algorithm creates and aggregates multiple classification trees using random variable selection and bootstrap sampling; a detailed description is provided by Breiman.24 Random forest takes a random sample with replacement for each tree it creates and uses the observations not selected to measure the prediction error. We created 20,000 trees and under-sampled the non-readmissions so that each bootstrap sample had a 50:50 balance between readmitted and non-readmitted hospitalizations, which improves the performance of random forest models on imbalanced data.25 The mean decrease in accuracy was used to rank each variable’s importance in predicting the readmission outcome correctly. Two versions of the random forest model were fitted: the main analysis, with broad categories of functional ability (IADL, ADL, strength, lower mobility, and upper mobility limitations) as binary predictors, and a sensitivity analysis, with each functional ability subcomponent (e.g., difficulty managing money) as an individual predictor in the model.
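
The two ingredients described above, a class-balanced bootstrap sample with its out-of-bag rows and a permutation-based decrease in accuracy, can be sketched as follows. This is not the study's implementation: the function names are hypothetical, the `predict` callable stands in for a fitted tree, and a full random forest would average the accuracy decrease over all trees.

```python
import random

def balanced_bootstrap(y, rng):
    """Draw one 50:50 bootstrap sample; return (in_bag, out_of_bag) indices."""
    pos = [i for i, v in enumerate(y) if v == 1]
    neg = [i for i, v in enumerate(y) if v == 0]
    k = min(len(pos), len(neg))  # under-sample the larger class
    in_bag = [rng.choice(pos) for _ in range(k)] + [rng.choice(neg) for _ in range(k)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(len(y)) if i not in chosen]
    return in_bag, out_of_bag

def accuracy(predict, X, y, rows):
    return sum(predict(X[i]) == y[i] for i in rows) / len(rows)

def decrease_in_accuracy(predict, X, y, rows, col, rng):
    """Accuracy lost on `rows` when column `col` is shuffled among them."""
    base = accuracy(predict, X, y, rows)
    shuffled = [X[i][col] for i in rows]
    rng.shuffle(shuffled)
    permuted_rows = {i: X[i][:col] + [v] + X[i][col + 1:] for i, v in zip(rows, shuffled)}
    permuted = sum(predict(permuted_rows[i]) == y[i] for i in rows) / len(rows)
    return base - permuted

# Toy data: column 0 determines the outcome, column 1 is noise.
rng = random.Random(1)
y = [1] * 20 + [0] * 80
X = [[v, i % 3] for i, v in enumerate(y)]
predict = lambda row: row[0]  # stand-in for a fitted tree

in_bag, oob = balanced_bootstrap(y, rng)
noise_drop = decrease_in_accuracy(predict, X, y, oob, 1, rng)
signal_drop = decrease_in_accuracy(predict, X, y, oob, 0, rng)
```

Shuffling the noise column leaves accuracy unchanged, whereas shuffling an informative column can only lower it; ranking variables by this drop is the idea behind the mean decrease in accuracy.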

Classification and regression tree (CART) analysis using conditional inference was used to visually identify specific combinations of conditions that were associated with high (and low) risk of readmission.26 CART is a non-parametric machine learning method that repeatedly splits the data into binary partitions based on the values of explanatory variables so that each partition is as homogeneous in the outcome as possible. A generalized linear model version of CART was used to account for correlated observations due to repeated measures.27 Branches of the tree that did not significantly improve the model as measured by the Akaike information criterion were removed or “pruned” to reduce model complexity.
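
To make the idea of a binary split concrete, the sketch below picks the single best split among binary predictors using Gini impurity. Note the substitution: Gini impurity is the criterion of classic CART, not the conditional inference test or the generalized linear model tree used in this study, and no pruning is shown. The variable names and toy data are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes (0 = perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_binary_split(X, y):
    """Return (variable, weighted_impurity) for the most homogeneous split.

    X: list of dicts mapping variable name -> 0/1; y: list of 0/1 outcomes.
    """
    best = None
    for name in X[0]:
        left = [yi for xi, yi in zip(X, y) if xi[name] == 0]
        right = [yi for xi, yi in zip(X, y) if xi[name] == 1]
        if not left or not right:
            continue  # variable does not partition the data
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[1]:
            best = (name, score)
    return best

# Invented toy data: readmission tracks "iadl" but not "smoker".
X = [
    {"iadl": 1, "smoker": 0}, {"iadl": 1, "smoker": 1},
    {"iadl": 1, "smoker": 0}, {"iadl": 1, "smoker": 1},
    {"iadl": 0, "smoker": 0}, {"iadl": 0, "smoker": 1},
    {"iadl": 0, "smoker": 0}, {"iadl": 0, "smoker": 1},
]
y = [1, 1, 1, 0, 0, 0, 0, 1]
best = best_binary_split(X, y)
```

A tree is grown by applying the same search recursively within each resulting partition, which is why the variable chosen at the root is the one that best separates the full sample.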

Modified Poisson regression analysis was used to estimate the risk ratio of readmission for those with IADL limitations after adjusting for other covariates.28 The generalized estimating equation (GEE) approach was used to account for a possible correlation due to multiple hospitalizations per person.
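
A minimal sketch of the modified Poisson idea, for a single binary exposure only: a Poisson model with a log link is fitted to the binary outcome, and the variance comes from a sandwich estimator that sums score contributions within each person. This corresponds to GEE with an independence working correlation, which is an assumption here because the working correlation is not stated in the text; the covariate adjustment of the actual model is omitted, and the data are invented.

```python
import math
from collections import defaultdict

def modified_poisson_rr(rows, iters=50):
    """Risk ratio and 95% CI for a binary exposure with person-level clustering.

    rows: list of (person_id, exposed 0/1, readmitted 0/1) tuples (hypothetical).
    """
    b0 = math.log(sum(y for _, _, y in rows) / len(rows))
    b1 = 0.0
    for _ in range(iters):  # Newton-Raphson on the Poisson log-likelihood
        g0 = g1 = a00 = a01 = a11 = 0.0
        for _pid, x, y in rows:
            mu = math.exp(b0 + b1 * x)
            g0 += y - mu
            g1 += (y - mu) * x
            a00 += mu
            a01 += mu * x
            a11 += mu * x * x
        det = a00 * a11 - a01 * a01
        b0 += (a11 * g0 - a01 * g1) / det
        b1 += (a00 * g1 - a01 * g0) / det

    # Sandwich variance: information matrix A and per-person score sums.
    a00 = a01 = a11 = 0.0
    score = defaultdict(lambda: [0.0, 0.0])
    for pid, x, y in rows:
        mu = math.exp(b0 + b1 * x)
        a00 += mu
        a01 += mu * x
        a11 += mu * x * x
        score[pid][0] += y - mu
        score[pid][1] += (y - mu) * x
    det = a00 * a11 - a01 * a01
    i01, i11 = -a01 / det, a00 / det  # second row of the inverse of A
    var_b1 = sum((i01 * s0 + i11 * s1) ** 2 for s0, s1 in score.values())
    se = math.sqrt(var_b1)
    return math.exp(b1), math.exp(b1 - 1.96 * se), math.exp(b1 + 1.96 * se)

# Invented data: 10% risk when unexposed, 20% when exposed, two stays per person.
rows = [(i // 2, 0, 1 if i < 10 else 0) for i in range(100)]
rows += [(50 + i // 2, 1, 1 if i < 20 else 0) for i in range(100)]
rr, lo, hi = modified_poisson_rr(rows)
```

With one binary exposure the fitted risk ratio equals the ratio of the two observed risks; the robust variance is what keeps the interval valid despite the binary outcome and the repeated hospitalizations per person.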

RESULTS

There were 20,007 hospitalizations, and 3281 (16.4%) of those resulted in a readmission to the hospital within 30 days (Table 1). There were 6617 unique subjects in the study period: 4310 had multiple hospitalizations, and 1871 had a readmission (Table 2). The patient had ADL limitations in 13.7% of hospitalizations and IADL limitations in 33.7%. Higher than average readmission rates were observed among discharges where the patients had IADL limitations (19.9%) and ADL limitations (21.2%). A full description of study variables is shown in Appendix Table 1.

Table 1 Patient Characteristics of Hospitalizations and 30-Day Readmission
Table 2 Number of Hospitalizations and Readmissions per Subject over Study Period

The ranking of variables by their importance for predicting 30-day readmission based on the random forest analysis is shown in Figure 1. The dot plots rank the variables in descending order relative to the most important predictor. Panel a shows the most important predictor was ADL limitations followed by IADL limitations. Panel b shows the same analysis but with the subcomponents of each functional limitation composite measure. Diabetes, age, and cognition were the most important predictors. Difficulty managing money (an IADL) was the fourth most important variable, followed by difficulty crossing the room (an ADL). Other important functional limitations in the top ten were difficulty shopping for basic needs (an IADL, ranked 7th), difficulty reaching overhead (an upper mobility limitation, ranked 9th), and difficulty using the phone (an IADL, ranked 10th). Appendix Figures 1-3 show the random forest plots from various sensitivity analyses.

Figure 1

Variable importance for predicting 30-day readmission from random forest analysis. Dot charts show the relative importance of each variable as a predictor of the outcome. The most important variable is the top one in each chart and is scaled to 100%. The importance of the rest of the variables is shown relative to this. Panel a is the main model with composite measures of ADL, IADL, strength, upper mobility, and lower mobility limitations. Panel b is the same model, but the functional limitation measures are “unpacked” so that each subcomponent limitation (e.g., Difficulty: Managing Money) is included as a predictor variable instead.

The classification tree for 30-day all-cause readmission is shown in Figure 2. Each “path” down the tree represents a subgroup of hospitalizations with those patient characteristics, and the bars in the terminal nodes represent the percentage of hospitalizations in that subgroup that resulted in a readmission. For instance, 786 hospitalizations involved people with both IADL limitations and severe diabetes (farthest right branch), and 26% of those resulted in a readmission. The highest count of readmissions occurred among hospitalizations of people with IADL limitations and mild or no diabetes, 858 (18.6%) of which resulted in a readmission. The fact that IADL limitations form the top tree node indicates that this is the most important predictor in this CART model (Figure 2).

Figure 2

Classification and regression tree analysis for 30-day readmission. Each “path” down the tree represents a subgroup of hospitalizations with those patient characteristics. Bars represent the percentage of hospitalizations in that subgroup that resulted in a readmission. For instance, 786 hospitalizations involved people with IADL limitations and severe diabetes (farthest right branch), and 26% of those resulted in a readmission.

Results from the multivariable analysis (Table 3) showed that hospitalizations of patients with IADL limitations were associated with a 1.17 (95% CI, 1.06–1.29, p = 0.002) times higher risk of readmission in the 30 days following hospital discharge, even after adjusting for other patient covariates. The presence of limitations in ADLs was associated with a 1.10 (95% CI, 0.99–1.23, p = 0.08) times higher adjusted risk of readmission, but this was not statistically significant. The full model results are shown in Table 2 in the Appendix.
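
As a reading aid, the relationship between the reported risk ratios, confidence intervals, and p values can be checked with a short calculation. The sketch assumes the intervals are symmetric on the log scale and based on a normal approximation; the helper name is hypothetical, and small discrepancies are expected because the published figures are rounded.

```python
import math

def p_from_ci(rr, lo, hi, z_crit=1.959964):
    """Approximate two-sided p value implied by a risk ratio and its 95% CI."""
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)  # SE of log(RR)
    z = math.log(rr) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area

p_iadl = p_from_ci(1.17, 1.06, 1.29)  # IADL limitations
p_adl = p_from_ci(1.10, 0.99, 1.23)   # ADL limitations
```

Both implied values should land close to the reported p = 0.002 and p = 0.08, which is consistent with the ADL interval including 1.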

Table 3 Adjusted Risk of 30-Day Readmission Estimated from Multivariable Modified Poisson Regression Analysis

All models had only modest ability to discriminate between readmissions and non-readmissions as measured by the AUC. Random forest (both versions) performed best with an AUC of 0.612, followed by CART (AUC = 0.580), and GEE (AUC = 0.572). Additional measures of model performance are shown in Appendix Table 3.

DISCUSSION

To our knowledge, this is the first study using a machine learning approach on data from a nationally representative sample of fee-for-service Medicare beneficiaries to demonstrate the importance of IADL capacity in predicting hospital readmission. The importance of IADL dysfunction is evidenced by the fact that it is the first splitting variable in the classification tree and the second-ranked variable in the main random forest analysis. Other important conditions that emerged from our machine learning approach were limitations in ADLs, diabetes, cognitive impairment, and older age. Hospitals could use the variables identified here to screen for 30-day re-hospitalization risk and then provide additional general and targeted supportive services. Future research could examine the effectiveness of such interventions targeted toward those at the highest relative or absolute risk.

Prior studies have focused more on difficulty with ADLs than on IADL limitations, in great part because ADL deficiencies are strongly associated with institutionalization.29 However, difficulty with IADLs is more common among community-dwelling midlife and older adults30 and may be more actionable since tasks involved with IADLs are associated with independent living, and compensating for deficiencies may require less intensive interventions than ADL deficits. Limitations in these functions may hinder a person’s ability to self-manage following discharge from the hospital leading to subsequent readmission. Occupational therapy can improve the ability to self-manage chronic disease, but its effect on readmissions is less clear. Rogers et al. found a correlation between spending on occupational therapy and reduced readmissions among patients with heart failure, pneumonia, and acute myocardial infarction.31 Kumar et al. found a correlation between physical therapy use and reduced readmission for stroke patients, but no association between occupational therapy and readmissions.32 To our knowledge, a controlled study on the effectiveness of OT or PT to reduce readmission has not been conducted. Nonetheless, IADL limitations are modifiable and actionable and a condition for which a patient can be screened prior to discharge for referral to care transition interventions.33, 34 Thus, the findings in this study present a compelling case that IADL limitations should be screened for.

The HRRP penalizes hospitals for unplanned 30-day readmissions, and it is face-valid that most patients would prefer not to return to the hospital. However, controversy exists over whether this policy is helpful or harmful to patients. Wadhera et al. report an association between the announcement and implementation of HRRP and increased 30-day post-discharge mortality.35 While our study was focused on identifying predictors of readmission rather than the effectiveness of policies designed to prevent them, it is important to know whether identifying persons at high risk may inadvertently lead to hospital policies (e.g., discouraging admission of patients who actually need it to avoid payment penalties) that end up doing more harm than good. It is also not established what a reasonable benchmark for the readmission rate is, as some all-cause unplanned readmissions are unavoidable. Graham et al. estimate that only 23% of general readmissions were preventable, while van Galen et al. estimate that only about 14.4% are preventable based on provider review.3, 4 For these reasons and others, some have called for making significant changes to readmissions as a quality metric for value-based care.36, 37

The evidence on the effectiveness of interventions at preventing readmissions is mixed. A meta-analysis of randomized trials found a modest effect, with trials conducted before 2002 showing greater effect than more recent ones.38 Interventions that supported a patient’s capacity for self-care were the most effective, a finding congruent with our results. Some have argued that predictive models would be more useful if they included risk factors that are modifiable and actionable by existing interventions.39 Our study establishes IADL limitations as an independent risk factor for readmissions and lays the groundwork for RCTs of strategies targeted at IADLs to reduce readmission.

A major strength of this study is that multiple machine learning methods using automatic variable selection all suggest IADL limitation is an important predictor of readmission, attesting to the robustness of our findings. Random forest in particular is a powerful technique to assess and rank the importance of many variables at once. Because the random forest algorithm uses bootstrap sampling and only considers a small random selection of variables at each split, the method is able to handle variables that are correlated. The result is a more robust measure of how important each variable is compared with a single tree or a single linear model. CART analysis uses a non-parametric approach that can capture complex non-linear relationships without specifying which ones to investigate a priori. The decision tree model produced by CART is easy to interpret relative to many other machine learning methods. CART analysis can help us identify subgroups at either the highest absolute risk or highest relative risk. Which one is more useful may depend on what actions are to be taken with the information and the availability of resources. A drawback of CART is that although it chooses the best predictor at each split, this may not result in an overall optimal model. Both random forest and CART can detect interactions and other non-linear relationships automatically, whereas linear models require the researcher to specify these a priori—an infeasible task when the number of predictor variables is large.

A limitation of our study is that even our best models were only fair to moderate in terms of predictive ability. However, these results are consistent with many other attempts to predict all-cause 30-day readmission in a general inpatient population, which found AUCs of 0.55,40 0.56,41 0.61,42 and 0.65.43 Another key limitation of our approach is that because most of our covariates were derived from the HRS survey, these measures were taken at a time point that could be up to 2 years prior to the actual index admission. Data measured at the time of the index hospital admission would be ideal for a predictive model. However, measures of ADL and IADL limitations are not typically included in models of readmission because they are rarely available as structured data in electronic health records and claims data. Our results suggest that efforts should be made to collect ADL/IADL capacity information routinely prior to discharge and to store it in structured data fields (as opposed to clinician notes, which are unstructured). This could potentially allow the construction of better performing predictive models that combine ADL/IADL measures with other clinical data. Finally, we could not correct for the design effects of the multistage sampling design of the HRS in our machine learning analysis, as methods to do so are not available.

CONCLUSION

This study demonstrates the importance of IADL limitations as a key predictor of 30-day hospital readmission through the use of multiple machine learning methods. Routine assessment of functional abilities in acute hospital care settings would help identify those most at risk. This could provide an opportunity to study the effectiveness of occupational and physical therapy–based interventions on reducing readmission.