Abstract
Hospital “report cards” reporting risk-adjusted health outcomes are increasingly used to benchmark quality of care. However, risk adjustment methods that do not fully account for the interrelationship between quality, risks and outcomes may lead to biased quality measures. This study aims to determine whether the current approach based on logistic regression and observed-to-expected outcome comparisons (O−E difference or O/E ratio) provides unbiased measures of quality. We first provided a conceptual framework to demonstrate that O−E difference or O/E ratio is inconsistently specified when estimates are based on logistic risk adjustment models. To examine the misspecification issue empirically, risk adjustment was performed based on coronary artery bypass graft (CABG) surgery data from New York’s Cardiac Surgery Reporting System, and quality indicators (QI) of different specifications were calculated for hospital profiling. Computer simulations further explored the issue of misspecified QIs. Results showed that risk-adjusted mortality rates (RAMR) calculated from different QIs identified the same hospital outliers based on 95% confidence intervals, but generated different rank orders for hospitals in both high-quality and low-quality tails of the quality distributions. Simulation results further showed that, compared to O−E and O/E, logistically transformed QIs were superior regarding their abilities to identify hospitals of true extreme rankings, especially when the outcome was less prevalent or the number of patients per hospital was small. Based on our findings, we recommend that analysts consider the use of logistically transformed QI prior to publicly releasing quality rankings using measures based on O−E or O/E.
Notes
We can only observe the patient’s actual outcome status \({O_{ij}}\). A hospital’s observed average outcome rate \({\bar {O}_j =\frac{1}{n_j}\sum_{i=1}^{n_j}{O_{ij}} \approx \frac{1}{n_j}\sum_{i=1}^{n_j}{p_{ij}}}\) when \({n_j}\) is large.
Here we assume that the chance effect is additive. If it is multiplicative, e.g., \({{\bf logit}(p_{ij})={\bf logit}(E_{ij} )\times Q_j \times \varepsilon _{ij}}\), the model is additive after a log-transformation. Further transformation upon the logit-function, however, would make model estimation intractable and is not discussed here.
The NYS model used for the quality report did not use robust variance estimates. Our analyses indicated that applying a hierarchical regression model to the CSRS data did not substantially change the results of hospital profiling.
Normality tests on simulated values of \({{\bf logit}(p_{ij})}\) showed that it is approximately normally distributed. We alternatively assumed a normal distribution for the error term \({\varepsilon _{ij}}\). The distribution of \({{\bf logit}(p_{ij})}\) and the results for the misspecified quality indicators remained unchanged.
The κ measures the level of agreement between two raters evaluating an event on a categorical scale (Landis and Koch 1977). In this study, we defined the event scale as 1 = high-quality outlier, 0 = non-quality outlier, and −1 = low-quality outlier.
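As an illustration of how such an agreement statistic can be computed, the sketch below implements an unweighted Cohen's κ on the three-level outlier scale. It is a minimal sketch only: the function name `cohen_kappa` and the ten hospital labels are hypothetical, and the choice of the unweighted form of κ is an assumption rather than something stated in this note.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of hospitals given the same label by both classifications
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the two marginal distributions
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both classifications use one identical category throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical outlier labels for ten hospitals under two different QIs
# (1 = high-quality outlier, 0 = non-outlier, -1 = low-quality outlier)
labels_qi_1 = [1, 0, 0, 0, -1, 0, 0, 1, 0, -1]
labels_qi_2 = [1, 0, 0, 0, -1, 0, 0, 0, 0, -1]
kappa = cohen_kappa(labels_qi_1, labels_qi_2)
```

Here the two classifications disagree on one hospital out of ten, so observed agreement is 0.9 against a chance agreement of 0.48.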
In the bootstrap simulation used to calculate the 95% CI of each QI, we modified the data for the two hospitals with no deaths. In each of these hospitals, we changed the outcome status to death for the patient with the highest predicted probability of death, to ensure that the bootstrapped CI could be defined. Because of this change, the number of identified quality outliers differs from that in the CABG report card based on the original data.
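The reason a hospital with no deaths is problematic can be seen in a small sketch. It assumes that patients are resampled with replacement within a hospital and that a simple percentile interval is used; both are assumptions made for illustration, and the function names and the 50-patient hospital are hypothetical.

```python
import random

def bootstrap_observed_rates(outcomes, n_boot=1000, seed=0):
    """Resample patients with replacement; return each replicate's observed death rate."""
    rng = random.Random(seed)
    n = len(outcomes)
    rates = []
    for _ in range(n_boot):
        sample = [outcomes[rng.randrange(n)] for _ in range(n)]
        rates.append(sum(sample) / n)
    return rates

def percentile_ci(values, level=0.95):
    """Simple percentile interval taken from the sorted bootstrap replicates."""
    ordered = sorted(values)
    lo_idx = int((1 - level) / 2 * (len(ordered) - 1))
    hi_idx = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical hospital with 50 patients and no deaths: every replicate has rate 0,
# so the interval collapses to a point and a logit of the rate is undefined.
no_deaths = [0] * 50
ci_no_deaths = percentile_ci(bootstrap_observed_rates(no_deaths))

# The same hospital after one patient is recoded as a death: replicates now vary.
one_death = [1] + [0] * 49
ci_one_death = percentile_ci(bootstrap_observed_rates(one_death))
```

With no deaths, every resample reproduces a zero observed rate, so no amount of resampling yields a usable interval; recoding a single outcome restores variation across replicates.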
References
Ash, A.S., Shwartz, M., Pekoz, E.A.: Comparing outcomes across providers. In: Iezzoni, L.I. (ed.) Risk Adjustment for Measuring Health Care Outcomes. Health Administration Press, Chicago, Illinois (2003)
Chassin, M.R., Hannan, E.L., DeBuono, B.A.: Benefits and hazards of reporting medical outcomes publicly. N. Engl. J. Med. 334, 394–398 (1996)
Christiansen, C.L., Morris, C.N.: Improving the statistical approach to health care provider profiling. Ann. Intern. Med. 127, 764–768 (1997)
Conrad, D.A., Christianson, J.B.: Penetrating the “black box”: financial incentives for enhancing the quality of physician services. Med. Care Res. Rev. 61, 37S–68S (2004)
Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. Chapman & Hall, New York (1993)
Gatsonis, C., Normand, S.L., Liu, C., Morris, C.: Geographic variation of procedure utilization. A hierarchical model approach. Med. Care 31, YS54–YS59 (1993)
Glance, L.G., Osler, T., Shinozaki, T.: Effect of varying the case mix on the standardized mortality ratio and W statistic: a simulation study. Chest 117, 1112–1117 (2000)
Glance, L.G., Dick, A., Osler, T.M., Li, Y., Mukamel, D.B.: Impact of changing the statistical methodology on hospital and surgeon ranking: the case of the New York State Cardiac Surgery Report Card. Med. Care 44, 311–319 (2006)
Gould, W., Sribney, W.: Maximum Likelihood Estimation with STATA. College Station, Texas (1999)
Greene, W.H.: Econometric Analysis. Prentice Hall, Upper Saddle River (2001)
Hanley, J.A., McNeil, B.J.: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36 (1982)
Hannan, E.L., Wu, C., DeLong, E.R., Raudenbush, S.W.: Predicting risk-adjusted mortality for CABG surgery: logistic versus hierarchical logistic models. Med. Care 43, 726–735 (2005)
Iezzoni, L.I.: The risks of risk adjustment. JAMA 278, 1600–1607 (1997)
Iezzoni, L.I. (ed.): Risk Adjustment for Measuring Health Care Outcomes. Health Administration Press, Chicago, Illinois (2003)
Iezzoni, L.I., Ash, A.S., Shwartz, M., Daley, J., Hughes, J.S., Mackiernan, Y.D.: Judging hospitals by severity-adjusted mortality rates: the influence of the severity-adjustment method. Am. J. Public Health 86, 1379–1387 (1996a)
Iezzoni, L.I., Shwartz, M., Ash, A.S., Hughes, J.S., Daley, J., Mackiernan, Y.D.: Severity measurement methods and judging hospital death rates for pneumonia. Med. Care 34, 11–28 (1996b)
Landis, J.R., Koch, G.G.: The measurement of observer agreement for categorical data. Biometrics 33, 159–174 (1977)
Mukamel, D.B., Mushlin, A.I.: The impact of quality report cards on choice of physicians, hospitals and HMOs: a midcourse evaluation. Jt. Comm. J. Qual. Improv. 27, 20 (2001)
Mukamel, D.B., Dick, A., Spector, W.D.: Specification issues in measurement of quality of medical care using risk adjusted outcomes. J. Econ. Soc. Meas. 26, 267–281 (2000)
Mukamel, D.B., Weimer, D.L., Zwanziger, J., Mushlin, A.I.: Quality of cardiac surgeons and managed care contracting practices. Health Services Res. 37, 1129 (2002)
Mukamel, D.B., Watson, N.M., Meng, H., Spector, W.D.: Development of a risk-adjusted urinary incontinence outcome measure of quality for nursing homes. Med. Care 41, 467–478 (2003)
Mukamel, D.B., Weimer, D.L., Zwanziger, J., Huang-Gorthy, S., Mushlin, A.I.: Quality report cards, selection of cardiac surgeons and racial disparities: a study of the publication of the NYS Cardiac Surgery Reports. Inquiry 41, 435–446 (2004/2005)
New York State Department of Health: Coronary artery bypass surgery in New York State, 2000–2002. Albany, NY (2004)
Pennsylvania Health Care Cost Containment Council: A Consumer’s Guide to Coronary Artery Bypass Surgery, Vol III. Harrisburg, PA, PH4C (1994)
Romano, P.S., Zach, A., Luft, H.S., Rainwater, J., Remy, L.L., Campa, D.: The California Hospital Outcomes Project: using administrative data to compare hospital performance. Jt. Comm. J. Qual. Improv. 21, 668–682 (1995)
Rosenthal, M.B., Frank, R.G., Li, Z., Epstein, A.M.: Early experience with pay-for-performance: from concept to practice. JAMA 294, 1788–1793 (2005)
Sacco, W.J., Copes, W.S., Staz, C.F., Smith, J.S., Jr., Buckman, R.F., Jr.: Status of trauma patient management as measured by survival/death outcomes: looking toward the 21st century. J. Trauma 36, 297–298 (1994)
Schuster, D.P.: Predicting outcome after ICU admission. The art and science of assessing risk. Chest 102, 1861–1870 (1992)
Shahian, D.M., Normand, S.L., Torchiana, D.F., Lewis, S.M., Pastore, J.O., Kuntz, R.E., Dreyer, P.I.: Cardiac surgery report cards: comprehensive review and statistical critique. Ann. Thorac. Surg. 72, 2155–2168 (2001)
White, H.: A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48, 817–830 (1980)
Zaslavsky, A.M.: Statistical issues in reporting quality data: small samples and casemix variation. Int. J. Qual. Health Care 13, 481–488 (2001)
Acknowledgements
This project was supported by a grant from the Agency for Healthcare Research and Quality (RO1 HS 13617, Dr. Laurent Glance).
The views presented in this manuscript are those of the authors and may not reflect those of the Agency for Healthcare Research and Quality, of the New York State Department of Health or of the Cardiac Advisory Committee.
Appendix
1.1 Appendix 1: Proof of consistent quality indicator specifications
The O−E difference is consistent with the additive linear probability function below:
\[p_{ij} = x_{ij}\beta' + Q_j + \varepsilon_{ij} \qquad (4)\]
in which \({x_{ij}}\) represents the set of observed risk factors of the patient; \({\beta}\) stands for the effect of risks on patient outcome; \({Q_j}\) is hospital j’s quality of care, which interacts additively with the patient’s observed risks and can be estimated as either a fixed or a random effect; \({\varepsilon _{ij}}\) represents the random (chance) effect of unknown factors on patient outcome; and \({E_{ij} =x_{ij} \beta'}\) is the patient’s predicted outcome rate. We assume that the expectation (average) of the chance effect over all patients in hospital j equals zero.
That is,
\[\frac{1}{n_j}\sum_{i=1}^{n_j}\varepsilon_{ij} = 0 \qquad (5)\]
If we rearrange Eq. 4 and take the expectation over hospital j on both sides, we obtain
\[Q_j^d = \frac{1}{n_j}\sum_{i=1}^{n_j}p_{ij} - \frac{1}{n_j}\sum_{i=1}^{n_j}E_{ij} = \bar{O}_j - \bar{E}_j\]
because of the assumption in Eq. 5, where \({Q_j^d}\) is hospital j’s risk-adjusted QI assuming the difference specification, and \({\bar{O}_j}\) and \({\bar{E}_j}\) are the hospital’s average observed and predicted outcome rates.
Similarly, the O/E ratio is consistent with the linear probability function below:
\[p_{ij} = x_{ij}\beta' \times Q_j + \varepsilon_{ij} \qquad (6)\]
in which hospital quality \({Q_j}\) interacts multiplicatively with a patient’s observed risks. If we assume that Eq. 5 also holds here and take the expectation over hospital j in Eq. 6, we obtain
\[Q_j^r = \frac{\frac{1}{n_j}\sum_{i=1}^{n_j}p_{ij}}{\frac{1}{n_j}\sum_{i=1}^{n_j}E_{ij}} = \frac{\bar{O}_j}{\bar{E}_j}\]
because of the assumption in Eq. 5, where \({Q_j^r}\) is hospital j’s risk-adjusted QI measured by the ratio specification.
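The two specifications above reduce to simple arithmetic on a hospital's observed outcomes and predicted rates. The following is a minimal sketch with hypothetical function names and made-up data for a single ten-patient hospital; it shows only the final calculation, not the risk-adjustment model that would produce the predicted rates.

```python
def difference_qi(observed, expected):
    """O - E: mean observed outcome rate minus mean predicted outcome rate."""
    n = len(observed)
    return sum(observed) / n - sum(expected) / n

def ratio_qi(observed, expected):
    """O / E: mean observed rate over mean predicted rate (the 1/n factors cancel)."""
    return sum(observed) / sum(expected)

# Hypothetical hospital: 2 deaths among 10 patients, predicted rates summing to 1.0
observed = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
expected = [0.05, 0.10, 0.30, 0.05, 0.10, 0.05, 0.10, 0.05, 0.10, 0.10]

q_d = difference_qi(observed, expected)  # about 0.2 - 0.1 = 0.1
q_r = ratio_qi(observed, expected)       # about 0.2 / 0.1 = 2.0
```

When the outcome is mortality, a positive difference or a ratio above one means more deaths were observed than the risk model predicted.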
1.2 Appendix 2
Before simulating the additive logistic specification (Eq. 2), we estimated an additive fixed-effects logistic regression on the empirical CABG data:
\[{\bf logit}(p_{ij}) = x_{ij}\beta' + q_j\]
in which \({x_{ij}}\) is the set of risk factors (Table 3) and \({\beta}\) is the coefficient vector to be estimated. The fixed effects \({q_j}\) were estimated as the coefficients of dummy variables for hospitals (hospital 1 was omitted, \({q_1 = 0}\)). Robust variance estimates (White 1980) were obtained to account for the clustering of patients in hospitals.
After model estimation, we calculated the expected mortality as
\[{\bf logit}(E_{ij}) = x_{ij}\hat{\beta}^{\prime} + \sum_{j}\frac{n_j}{N}\hat{q}_j\]
in which \({n_j}\) is the number of patients in hospital j and N = 16,120 is the total sample size. A patient’s expected outcome was therefore calculated as a linear function of patient risk (\({x_{ij} \hat{\beta}^{\prime}}\)) plus the weighted-average quality effect, that is, as if the patient had been treated in an average-quality hospital in New York. Finally, the consistent QI (\({Q_j^{ld}}\)) was constructed. Estimates from this model were then used for simulating the additive Eq. 2.
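The weighted-average step can be sketched as follows. The sketch assumes that the expected value is formed on the logit scale and then mapped back to a probability; the function names, the three hospital effects, and the hospital sizes are hypothetical and are not estimates from the CABG data.

```python
import math

def inv_logit(z):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def weighted_average_effect(effects, sizes):
    """Average of hospital effects weighted by each hospital's share of patients."""
    total = sum(sizes)
    return sum(n * q for q, n in zip(effects, sizes)) / total

def expected_mortality_additive(risk_score, effects, sizes):
    """Expected mortality: patient risk score plus the average hospital effect (logit scale)."""
    return inv_logit(risk_score + weighted_average_effect(effects, sizes))

# Hypothetical fixed effects (hospital 1 omitted, so its effect is 0) and patient counts
effects = [0.0, 0.5, -0.25]
sizes = [200, 300, 500]

avg_effect = weighted_average_effect(effects, sizes)                  # (0 + 150 - 125) / 1000
e_ij = expected_mortality_additive(-3.0, effects, sizes)             # patient with x*beta = -3.0
```

Because the same average effect is added for every patient, the expected rate no longer depends on which hospital actually treated the patient.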
Before simulating the multiplicative Eq. 3, we estimated a multiplicative fixed-effects logistic regression on the empirical data:
\[{\bf logit}(p_{ij}) = x_{ij}\beta' \times q_j\]
For the model to be identified, we normalized the effect of hospital 1 to unity (\({q_1 = 1}\)). Maximum likelihood estimation in STATA (Gould and Sribney 1999) was used to obtain robust variance estimates.
The expected mortality was calculated as
\[{\bf logit}(E_{ij}) = x_{ij}\hat{\beta}^{\prime} \times \sum_{j}\frac{n_j}{N}\hat{q}_j\]
in which the weighted-average quality effect takes a multiplicative form on patient risks. Finally, the consistent QI (\({Q_j^{lr}}\)) was constructed. Estimates based on this model were used for subsequent simulations.
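For the multiplicative case, a comparable sketch scales the patient's risk score by the weighted-average hospital effect instead of adding to it. As before, working on the logit scale is an assumption, and the function names, effects (hospital 1 normalized to 1), and hospital sizes are hypothetical.

```python
import math

def inv_logit(z):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def weighted_average_effect(effects, sizes):
    """Average of hospital effects weighted by each hospital's share of patients."""
    total = sum(sizes)
    return sum(n * q for q, n in zip(effects, sizes)) / total

def expected_mortality_multiplicative(risk_score, effects, sizes):
    """Expected mortality: patient risk score times the average hospital effect (logit scale)."""
    return inv_logit(risk_score * weighted_average_effect(effects, sizes))

# Hypothetical multiplicative effects with hospital 1 fixed at 1, and patient counts
mult_effects = [1.0, 1.2, 0.9]
mult_sizes = [200, 300, 500]

avg_mult_effect = weighted_average_effect(mult_effects, mult_sizes)   # (200 + 360 + 450) / 1000
e_ij_mult = expected_mortality_multiplicative(-3.0, mult_effects, mult_sizes)
```

An average effect of exactly 1 would leave the patient's risk score unchanged, which is the multiplicative counterpart of an average effect of 0 in the additive model.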
Cite this article
Li, Y., Dick, A.W., Glance, L.G. et al. Misspecification issues in risk adjustment and construction of outcome-based quality indicators. Health Serv Outcomes Res Method 7, 39–56 (2007). https://doi.org/10.1007/s10742-006-0014-z
DOI: https://doi.org/10.1007/s10742-006-0014-z