Comparing Outcomes and Costs of Medical Patients Treated at Major Teaching and Non-teaching Hospitals: A National Matched Analysis

  • Jeffrey H. Silber (corresponding author)
  • Paul R. Rosenbaum
  • Bijan A. Niknam
  • Richard N. Ross
  • Joseph G. Reiter
  • Alexander S. Hill
  • Lauren L. Hochman
  • Sydney E. Brown
  • Alexander F. Arriaga
  • Lee A. Fleisher
Original Research



Background

Teaching hospitals typically pioneer investment in new technology and cultivate workforce characteristics generally associated with better quality, but the value of this extra investment is unclear.


Objective

To compare outcomes and costs between major teaching and non-teaching hospitals by closely matching on patient characteristics.


Setting

Medicare patients at 339 major teaching hospitals (resident-to-bed (RTB) ratios ≥ 0.25) and matched patient controls from 2439 non-teaching hospitals (RTB ratios < 0.05).


Participants

A total of 43,990 pairs of patients (one from a major teaching hospital and one from a non-teaching hospital) admitted for acute myocardial infarction (AMI), 84,985 pairs admitted for heart failure (HF), and 74,947 pairs admitted for pneumonia (PNA).


Exposure

Treatment at major teaching hospitals versus non-teaching hospitals.

Main Measures

Thirty-day all-cause mortality, readmissions, ICU utilization, costs, payments, and value expressed as extra cost for a 1% improvement in survival.

Key Results

Thirty-day mortality was lower in teaching than in non-teaching hospitals (10.7% versus 12.0%, difference = −1.3%, P < 0.0001). The paired cost difference (teaching − non-teaching) was $273 (P < 0.0001), yielding $211 per 1% mortality improvement. For the quintile of pairs at highest risk on admission, the mortality difference was larger (24.6% versus 27.6%, difference = −3.0%, P < 0.0001) and the paired cost difference was $1289 (P < 0.0001), yielding $427 per 1% mortality improvement at 30 days. Readmissions and ICU utilization were lower in teaching hospitals (both P < 0.0001), but length of stay was longer (5.5 versus 5.1 days, P < 0.0001). Individual results for AMI, HF, and PNA were similar to the combined results.
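The value metric above can be sanity-checked from the aggregate figures alone. A minimal sketch (note: the abstract's $211 and $427 come from the authors' pair-level analysis, so this simple quotient of the quoted aggregates only approximates them):

```python
def cost_per_percent_survival(cost_diff_usd, mortality_diff_pct):
    """Extra cost (teaching minus non-teaching) per 1 percentage-point
    absolute reduction in 30-day mortality."""
    return cost_diff_usd / abs(mortality_diff_pct)

# Overall cohort: $273 extra cost, 1.3-point lower mortality
overall = cost_per_percent_survival(273, -1.3)

# Highest-risk quintile: $1289 extra cost, 3.0-point lower mortality
high_risk = cost_per_percent_survival(1289, -3.0)

print(round(overall), round(high_risk))  # roughly 210 and 430
```

The rough quotients ($210 and $430) bracket the published pair-level estimates ($211 and $427), illustrating how the value figure scales with admission risk.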

Conclusions and Relevance

Among Medicare patients admitted for common medical conditions, as admission risk of mortality increased, the absolute mortality benefit of treatment at teaching hospitals also increased, though accompanied by marginally higher cost. Major teaching hospitals appear to return good value for the extra resources used.


Keywords: Medicare; value; teaching hospitals; mortality; cost



Acknowledgments

We thank Traci Frank, AA; Kathryn Yucha, MSN, RN; and Sujatha Changolkar (Center for Outcomes Research, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA) for their assistance with this research.

Compliance with Ethical Standards

Conflict of Interest

This research was funded by a grant from the Association of American Medical Colleges (AAMC) to study differences between teaching and non-teaching hospitals in outcomes, costs, and value. The AAMC had no role in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; the preparation, review, and approval of the manuscript; or the decision to submit the manuscript for publication.

The authors declare that the Children’s Hospital of Philadelphia (CHOP) and the University of Pennsylvania (PENN) received a research grant from the Association of American Medical Colleges (AAMC) which, in turn, provided partial salary support for some investigators.

Supplementary material

ESM 1 (DOCX 416 kb).



Copyright information

© Society of General Internal Medicine 2019

Authors and Affiliations

  • Jeffrey H. Silber 1,2,3,4,5 (corresponding author)
  • Paul R. Rosenbaum 5,6
  • Bijan A. Niknam 1
  • Richard N. Ross 1
  • Joseph G. Reiter 1
  • Alexander S. Hill 1
  • Lauren L. Hochman 1
  • Sydney E. Brown 3
  • Alexander F. Arriaga 3,7,8
  • Lee A. Fleisher 3,5,8

  1. Center for Outcomes Research, Children’s Hospital of Philadelphia, Philadelphia, USA
  2. Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
  3. Department of Anesthesiology and Critical Care, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
  4. Department of Health Care Management, Wharton School, University of Pennsylvania, Philadelphia, USA
  5. Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, USA
  6. Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, USA
  7. Department of Anesthesiology, Perioperative, and Pain Medicine, Brigham and Women’s Hospital, Boston, USA
  8. Center for Perioperative Outcomes Research and Transformation, University of Pennsylvania, Philadelphia, USA
