The MCH Study was a prospective multi-center trial designed to assess the impact of hospitalist care on patients admitted to the general medicine services of six academic medical centers27–29. Patients were enrolled from July 1, 2001 through June 30, 2003 at the following six sites: University of Chicago, University of California San Francisco, University of Iowa, University of Wisconsin, University of New Mexico, and Brigham and Women’s Hospital in Boston. The study was approved by each site’s institutional review board.
Patients were eligible for inclusion if they were 18 years of age or older and were admitted by a hospitalist or other internist to a general medicine service. Patients admitted specifically under the care of their primary care physician were excluded.
Detailed sociodemographic and health information was collected during a 15–20 minute intake interview conducted by a research assistant, generally within 48 hours of admission. Additional data were obtained from each site’s administrative records and a telephone interview of patients or their proxies conducted 30 days after discharge. These data were matched with the National Death Index to ascertain 30-day mortality from the date of hospital discharge.
Administrative data were used to estimate length of stay and to ascertain age, sex, and insurance status. Intake interviews were used to administer the Adult Lifestyles and Function Interview Mini-Mental State Exam (ALFI-MMSE),30,31 and the Medical Outcomes Study Short Form 12 (SF12) questionnaire,32 and to gather data on social supports, prior healthcare utilization, and health condition, including comorbidities used to calculate a self-reported Charlson index33.
We retrospectively selected a subset of enrolled MCH Study patients for our analysis. First, we included only patients who, or whose proxies, could be interviewed in the hospital and therefore could provide timely data for our predictive models. We then excluded patients with a length of stay greater than 30 days to reduce bias from outlier effects. Next, we excluded patients not discharged to home, i.e., those who died during hospitalization, were transferred to another healthcare facility, or left against medical advice. Lastly, we excluded patients who died within 30 days of discharge.
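As an illustration only (the column names interviewed, los_days, discharge_disposition, and died_within_30d are hypothetical, not MCH Study variables), this exclusion sequence amounts to successive filters on a patient-level table:

```python
import pandas as pd

# patients: one row per enrolled MCH Study patient (hypothetical column names)
def select_analysis_cohort(patients: pd.DataFrame) -> pd.DataFrame:
    """Apply the retrospective inclusion/exclusion steps in order."""
    cohort = patients[patients["interviewed"]]                  # patient or proxy interviewed in hospital
    cohort = cohort[cohort["los_days"] <= 30]                   # exclude length of stay > 30 days
    cohort = cohort[cohort["discharge_disposition"] == "home"]  # exclude deaths, transfers, and AMA discharges
    cohort = cohort[~cohort["died_within_30d"]]                 # exclude death within 30 days of discharge
    return cohort
```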
We defined hospital readmission as all-cause admission to an acute care hospital within 30 days of discharge from the index hospitalization. We identified readmissions in two ways: from administrative data at the study sites and from patients' responses to a specific question about hospital readmission in the 30-day telephone follow-up. To minimize recall bias, administrative data were used to identify readmissions to each index hospital, while self-reported data were used only to identify readmissions to non-index hospitals.
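Under the same hypothetical naming, the composite outcome could be assembled from a boolean administrative indicator restricted to the index hospital and a boolean self-report indicator restricted to non-index hospitals (both names are assumptions, not study variables):

```python
# 30-day all-cause readmission: administrative data for the index hospital,
# self-report only for non-index hospitals (to limit recall bias)
cohort["readmit_30d"] = (
    cohort["admin_readmit_index"] | cohort["self_report_nonindex"]
).astype(int)
```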
We identified candidate patient factors likely to be associated with high readmission risk a priori from a survey of the relevant literature and grouped them into four natural categories as follows: (1) sociodemographic factors, including age, sex, self-reported race/ethnicity, self-reported total household income, education, and insurance status; (2) social support, including marital status, number of people living with the patient, having someone to help at home, and having a regular physician; (3) health condition, including self-reported 0–9 Charlson comorbidity index, self-reported 0–100 health rating, 0–100 SF12 physical and mental component scores, 0–22 ALFI-MMSE score, and limitations in activities of daily living (ADLs) and/or instrumental activities of daily living (IADLs); and (4) healthcare utilization, including number of admissions in the last year, length of stay of the current hospital admission, and whether, given the choice, the patient would stay an extra day in the hospital even if their doctor told them they were well enough to go home.
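For use in the illustrative sketches that follow, the four categories can be written as a simple mapping from category to hypothetical analysis variable names (these are not the actual MCH Study variable names):

```python
# Candidate predictor categories (illustrative column names)
CANDIDATE_FACTORS = {
    "sociodemographic": ["age", "sex", "race", "income", "education", "insurance"],
    "social_support": ["marital_status", "household_size", "help_at_home", "regular_physician"],
    "health_condition": ["charlson_index", "health_rating", "sf12_pcs", "sf12_mcs",
                         "alfi_mmse", "adl_limitation", "iadl_limitation"],
    "healthcare_utilization": ["admissions_last_year", "los_days", "would_stay_extra_day"],
}
```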
The patient was the unit of analysis. Because of our large sample size, we chose a split-sample design to derive and internally validate our prediction model. We randomly selected two thirds of patients from each site and combined them to create a derivation cohort and subsequently combined the remaining one third of patients from each site to create a validation cohort34.
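To illustrate the split (not the authors' SAS code), a site-stratified two-thirds/one-third random split could be produced as follows; the column name site and the random seed are assumptions:

```python
from sklearn.model_selection import train_test_split

# Split within each site so both cohorts preserve the site mix
derivation, validation = train_test_split(
    cohort,
    test_size=1/3,            # one third of patients from each site -> validation cohort
    stratify=cohort["site"],  # stratify the random split by hospital site
    random_state=0,           # arbitrary seed for reproducibility
)
```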
To assess whether the candidate patient factors were significantly associated with hospital readmission, we fitted separate multivariable logistic regression models for each of the four categories of patient factors using data from the derivation cohort. We used P < 0.10 as the cutoff for assessing significance. Only factors noted to be significantly associated with readmission within their respective categories were included in the final regression model. Generalized estimating equations (GEE) were used to account for clustering by discharging physician, and hospital site was entered as a fixed effect in each of the models to minimize confounding35.
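Although the study analyses were run in SAS, the screening step for one category (here, the sociodemographic factors) can be sketched with statsmodels in Python, using the hypothetical column names above; the exchangeable working correlation is an assumption, as the text does not specify one:

```python
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Category-level screening model (illustrative): 30-day readmission regressed on
# the sociodemographic factors, with hospital site as a fixed effect and GEE
# clustering by discharging physician.
screen = smf.gee(
    "readmit_30d ~ age + C(sex) + C(race) + C(income) + C(education)"
    " + C(insurance) + C(site)",
    groups="physician_id",                    # cluster on discharging physician
    data=derivation,
    family=sm.families.Binomial(),            # logistic link
    cov_struct=sm.cov_struct.Exchangeable(),  # assumed working correlation
).fit()

# Carry forward factors significant at P < 0.10 within their category
significant = [term for term, p in screen.pvalues.items()
               if p < 0.10 and term != "Intercept" and not term.startswith("C(site)")]
```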
When constructing the final model, factors that became non-significant at P > 0.05 were removed if their presence did not change the beta coefficient of any other factor by more than 20%. We derived a scoring system by multiplying each beta coefficient by ten and rounding to the nearest integer; the integer values from all applicable factors were then added together to estimate a total score for each patient. We subsequently obtained score-based predicted probabilities of readmission by entering each patient's risk score into a single-predictor logistic regression model and used the output from this model to determine score cutoffs for identifying patients within selected readmission risk levels (0–9%, 10–19%, 20–29%, and 30% or higher).
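The conversion from coefficients to an integer point score, and from total score to a predicted readmission probability, could look like the following sketch; the names final_fit and factor_matrix are hypothetical, carried over from the screening sketch above, and excluding the site terms from the score is an assumption:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Exclude the intercept and the site fixed-effect terms from scoring (assumption)
betas = final_fit.params.drop("Intercept")
betas = betas[~betas.index.str.startswith("C(site)")]

# Points: each beta coefficient multiplied by ten and rounded to the nearest integer
points = (betas * 10).round().astype(int)

# Total score per patient: sum of the points for each applicable factor
# (factor_matrix is a hypothetical DataFrame whose columns match the model terms)
derivation["risk_score"] = factor_matrix.mul(points, axis=1).sum(axis=1)

# Score-based predicted probability from a single-predictor logistic model
score_model = smf.logit("readmit_30d ~ risk_score", data=derivation).fit()
derivation["pred_prob"] = score_model.predict(derivation)

# Map predicted probabilities onto the reported risk levels
bins = [0.0, 0.10, 0.20, 0.30, 1.0]
labels = ["0-9%", "10-19%", "20-29%", "30%+"]
derivation["risk_level"] = pd.cut(derivation["pred_prob"], bins=bins,
                                  labels=labels, right=False)
```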
We tested the performance of our model using data from the validation cohort. We assessed goodness of fit using the Hosmer–Lemeshow chi-square test36 and model discrimination by measuring the C statistic, which is the area under the receiver operating characteristic (ROC) curve37. Because patients discharged to sub-acute or long-term care facilities are an important patient population but might have different predictors of readmission, we repeated our methodology in this population. We used SAS statistical software (Version 9.1; SAS Institute Inc., Cary, NC) to perform all analyses.
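The analyses were performed in SAS, but an equivalent validation check can be sketched in Python under the same hypothetical names; the Hosmer–Lemeshow statistic is hand-rolled over deciles of predicted risk, and the C statistic is the ROC area under the curve:

```python
import pandas as pd
from scipy.stats import chi2
from sklearn.metrics import roc_auc_score

def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Hosmer-Lemeshow chi-square over deciles of predicted risk (sketch)."""
    df = pd.DataFrame({"y": y_true, "p": y_prob})
    df["decile"] = pd.qcut(df["p"], n_groups, duplicates="drop")
    grouped = df.groupby("decile", observed=True)
    obs, exp, n = grouped["y"].sum(), grouped["p"].sum(), grouped.size()
    stat = (((obs - exp) ** 2) / (exp * (1 - exp / n))).sum()
    return stat, chi2.sf(stat, len(n) - 2)   # degrees of freedom = groups - 2

# Assumes validation["risk_score"] was derived exactly as in the derivation cohort
validation["pred_prob"] = score_model.predict(validation)

hl_stat, hl_p = hosmer_lemeshow(validation["readmit_30d"], validation["pred_prob"])
c_statistic = roc_auc_score(validation["readmit_30d"], validation["pred_prob"])
```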