
Comparison of Outlier Identification Methods in Hospital Surgical Quality Improvement Programs

  • 2010 SSAT Plenary Presentation
  • Published in Journal of Gastrointestinal Surgery

Abstract

Background

Surgeons and hospitals are being increasingly assessed by third parties regarding surgical quality and outcomes, and much of this information is reported publicly. Our objective was to compare various methods used to classify hospitals as outliers in established surgical quality assessment programs by applying each approach to a single data set.

Methods

Using American College of Surgeons National Surgical Quality Improvement Program data (7/2008–6/2009), hospital risk-adjusted 30-day morbidity and mortality were assessed for general surgery at 231 hospitals (cases = 217,630) and for colorectal surgery at 109 hospitals (cases = 17,251). The number of outliers (poor performers) identified using different methods and criteria was compared.
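The abstract does not detail the modeling, but the general recipe for risk-adjusted hospital comparison can be sketched. Below is a minimal Python example, on simulated data, of one common construction: fit a patient-level logistic regression, sum each hospital’s predicted risks into an expected event count E, and compare it with the observed count O. The covariates, hospital counts, and coefficients are all illustrative assumptions, not the NSQIP specification.

```python
# Minimal risk-adjustment sketch (simulated data, illustrative assumptions).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, n_hosp = 5000, 20
X = rng.normal(size=(n, 3))                      # stand-in patient risk factors
hospital = rng.integers(0, n_hosp, n)            # which hospital treated each patient
true_logit = -2.5 + X @ np.array([0.8, 0.5, 0.3])
y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))  # 30-day event indicator

model = LogisticRegression().fit(X, y)
p = model.predict_proba(X)[:, 1]                 # patient-level predicted risk

# Observed (O) and expected (E) events per hospital; O/E > 1 suggests
# worse-than-expected performance after risk adjustment.
O = np.bincount(hospital, weights=y, minlength=n_hosp)
E = np.bincount(hospital, weights=p, minlength=n_hosp)
print(np.round(O / E, 2))
```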

Results

The overall morbidity was 10.3% for general surgery and 25.3% for colorectal surgery. The overall mortality was 1.6% for general surgery and 4.0% for colorectal surgery. Programs used different methods (logistic regression, hierarchical modeling, partitioning) and criteria (P < 0.01, P < 0.05, P < 0.10) to identify outliers. When each approach was applied to this single data set, the number of outliers identified ranged from 7 to 57 hospitals for general surgery morbidity, 1 to 57 for general surgery mortality, 4 to 27 for colorectal morbidity, and 0 to 27 for colorectal mortality, depending on the method and criterion employed.
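To make concrete how the significance criterion alone moves the outlier count, here is a hedged sketch on simulated observed and expected event counts. It flags a hospital as a high outlier when an exact Poisson confidence interval for its standardized ratio O/E lies entirely above 1; this is one of several possible constructions, not necessarily the one any particular program uses.

```python
# How the choice of significance criterion alone changes the outlier count
# (simulated counts; exact Poisson CI for the standardized ratio O/E).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
n_hosp = 231
expected = rng.uniform(20, 200, n_hosp)                  # risk-adjusted expected events
observed = rng.poisson(expected * rng.lognormal(0.0, 0.15, n_hosp))

def smr_ci(obs, exp, alpha):
    """Exact Poisson-based confidence interval for O/E."""
    lo = chi2.ppf(alpha / 2, 2 * obs) / (2 * exp) if obs > 0 else 0.0
    hi = chi2.ppf(1 - alpha / 2, 2 * (obs + 1)) / (2 * exp)
    return lo, hi

for alpha in (0.01, 0.05, 0.10):
    # High outlier: the whole CI for O/E sits above 1.
    n_out = sum(smr_ci(o, e, alpha)[0] > 1.0 for o, e in zip(observed, expected))
    print(f"P < {alpha:.2f}: {n_out} of {n_hosp} hospitals flagged")
```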

Conclusions

There was considerable variation in the number of outliers identified using different detection approaches. Quality programs appear to be utilizing outlier identification methods contrary to what might be expected; thus, they should justify their methodology based on the intent of the program (i.e., quality improvement vs. reimbursement). Surgeons and hospitals should be aware of the variability in methods used to assess their performance, as these outlier designations will likely have referral and reimbursement consequences.



Author information

Correspondence to Karl Y. Bilimoria.

Discussant

Dr. Thomas J. Watson (Rochester, NY): The issue of quality assessment, and how the data might be utilized by patients, payers, and regulatory agencies for directing care, as well as by hospitals for targeting their improvement initiatives, is certainly gaining a lot of attention among surgeons. Yet as the authors so nicely demonstrate, the manner in which quality outliers are identified varies widely, owing to a lack of uniformity in methodology and cut-off criteria. We are all quite indebted to the authors for bringing these inconsistencies to light.

The manuscript is likely to fuel a significant debate regarding which methods and boundaries are appropriate for different purposes. The results of such a debate could have significant impact on institutions that fall just above or just below established thresholds.

I have two questions for the authors.

Number one, is a certain methodology more suitable than others based upon the width or standard deviation of the outcomes’ distribution? As an example, ranking hospitals in quintiles may not make sense when the outcomes are clustered closely together. Perhaps setting a minimum threshold would be more appropriate in such a circumstance.

Number two, if you were appointed health care czar today, which methodology and cut-offs would you choose?

Closing Discussant

Dr. Karl Y. Bilimoria: I think that the method selected obviously depends on the measure. And certainly, if it’s something like beta-blocker use post-MI, where everybody is at 95% or better, the range is going to be narrow. So setting different criteria for that, a sort of basement threshold, would be better.

But the vast majority of measures that we see are like this, with wide variation. I think it depends entirely upon the intent: whether it’s for a quality improvement initiative or whether it’s to be publicly disseminated with referral and reimbursement consequences.

Similarly, it would depend on what I was using the measure for. But for NSQIP, I favor using quintiles or quartiles.
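As an illustration of the two schemes under discussion, the following sketch (with made-up O/E ratios) contrasts quintile ranking, which always labels the worst 20% as poor performers, with an absolute basement threshold, which may flag no one when outcomes cluster tightly.

```python
# Quintile ranking vs. an absolute "basement" threshold (made-up O/E ratios).
import numpy as np

rng = np.random.default_rng(2)
oe = rng.normal(1.0, 0.05, 100)          # tightly clustered O/E ratios

# Quintile ranking: the worst fifth is flagged no matter how tight the spread.
cuts = np.quantile(oe, [0.2, 0.4, 0.6, 0.8])
quintile = np.digitize(oe, cuts) + 1     # 1 = best ... 5 = worst
print("flagged by worst quintile:", int((quintile == 5).sum()))

# Basement threshold: flag only hospitals above an absolute O/E cutoff,
# which may be nobody when outcomes cluster closely together.
print("flagged by O/E > 1.25:", int((oe > 1.25).sum()))
```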

Discussant

Dr. Keith D. Lillemoe (Indianapolis, IN): I’m not going to make you the health care czar, but I’m going to make you the chair of a department of surgery. I get these kinds of numbers, and they are not made up. What would you recommend that I, your chair Dr. Soper, or any other surgical chair do with these data to institute quality improvement? Because this isn’t so much about persecuting the bad performers; it’s about trying to lift up the quality.

Regardless of the metric you look at, we are all going to have some underperformers or outliers. What’s the first step in instituting quality improvement?

Closing Discussant

Dr. Karl Y. Bilimoria: I think the first step will be bringing it to light: providing people with their data and making sure it is high-quality data. We have a lack of that right now. Although you may get some reports, providing detailed, high-quality data back to the individual performers is something that’s been lacking in general.

Also, it’s not about the absolute number or where you rank. It’s about showing which half of the group you are in. And if you are in the lower half, that’s something to act on.

Finally, actually demonstrating performance improvement or at least some activity toward improving performance is needed. In some of these measures, the numbers are very small, so demonstrating absolute improvement in outcomes would be difficult. But at least on process measures, those are more absolute, and maybe we can improve there in these circumstances.

Discussant

Dr. Sharon Weber (Madison, WI): I find this whole concept a little bit disturbing in this era of public reporting of outcomes. And I would be even more disturbed if the hospitals identified as low outliers changed when different methodologies were applied. Did you evaluate the specific hospitals identified at each end of the scale, and whether they changed position when different methodologies were applied, especially for the low outliers?

Closing Discussant

Dr. Karl Y. Bilimoria: For the most part, the really low-performing hospitals are the same across most of the models. When you get to the better-performing of the low performers, there is some variation in which hospitals are identified. So some do flip in and out of being an outlier.
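One simple way to see this flipping behavior, using the criterion rather than the model as the varying ingredient: in the hedged sketch below, simulated per-hospital z-scores stand in for risk-adjusted performance statistics, and tightening the criterion from P < 0.10 to P < 0.01 drops the borderline hospitals out of the flagged set.

```python
# Hospitals flipping in and out of outlier status as the criterion tightens
# (simulated z-scores standing in for risk-adjusted performance statistics).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
z = rng.normal(0.0, 1.2, 109)            # one z-score per hospital
flagged = {a: set(np.flatnonzero(z > norm.ppf(1 - a))) for a in (0.01, 0.10)}

print("flagged at P < 0.01:", len(flagged[0.01]))
print("flagged at P < 0.10:", len(flagged[0.10]))
print("borderline hospitals that flip:", len(flagged[0.10] - flagged[0.01]))
```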

About this article

Cite this article

Bilimoria, K.Y., Cohen, M.E., Merkow, R.P. et al. Comparison of Outlier Identification Methods in Hospital Surgical Quality Improvement Programs. J Gastrointest Surg 14, 1600–1607 (2010). https://doi.org/10.1007/s11605-010-1316-6