Health Care Management Science, Volume 22, Issue 2, pp 364–375

Bayesian logistic regression approaches to predict incorrect DRG assignment

  • Mani Suleiman
  • Haydar Demirhan
  • Leanne Boyd
  • Federico Girosi
  • Vural Aksakalli


Episodes of care that involve similar diagnoses and treatments and require similar levels of resource utilisation are grouped into the same Diagnosis-Related Group (DRG). In jurisdictions that implement DRG-based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to checking that episodes have been coded to the correct DRG. Using statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that an episode requires a DRG revision, and compares these models with each other and with classical maximum likelihood estimation. All Bayesian approaches yielded more stable model parameters than maximum likelihood. The best-performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood and by 34% compared to random classification. We found that the original DRG, the coder and the day of coding all have a significant effect on the likelihood of DRG error. The Bayesian approaches improved both model parameter stability and classification accuracy. This method has already led to improved audit efficiency in operational use.
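The stabilising effect of a weakly informative prior described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the toy data, the single binary predictor (a hypothetical "coded by coder A" flag), the independent Cauchy(0, 2.5) prior on both coefficients, and the use of a posterior mode found by plain gradient ascent instead of full posterior sampling. In the toy data every episode with the flag set has a DRG error, so the maximum likelihood slope has no finite value, while the penalised estimate settles at a finite one.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, prior_scale=None, steps=5000, lr=0.5):
    """Gradient ascent for logistic regression coefficients.

    prior_scale=None maximises the log-likelihood (classical ML).
    Otherwise the log-posterior under independent Cauchy(0, prior_scale)
    priors is maximised (an assumed prior, chosen for illustration).
    """
    n_features = len(X[0])
    beta = [0.0] * n_features
    for _ in range(steps):
        grad = [0.0] * n_features
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * x for b, x in zip(beta, xi)))
            for j, x in enumerate(xi):
                grad[j] += (yi - p) * x
        if prior_scale is not None:
            for j, b in enumerate(beta):
                # Derivative of log Cauchy(b | 0, s) is -2b / (s^2 + b^2).
                grad[j] -= 2.0 * b / (prior_scale ** 2 + b ** 2)
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


# Hypothetical toy data: columns are [intercept, coder_A_flag];
# y = 1 means the episode needed a DRG revision.
X = [[1, 0]] * 6 + [[1, 1]] * 4
y = [0, 0, 0, 0, 1, 1] + [1, 1, 1, 1]

mle = fit_logistic(X, y)                       # slope keeps drifting upward
map_est = fit_logistic(X, y, prior_scale=2.5)  # slope settles at a finite value

print("ML estimate after 5000 steps:  ", [round(b, 2) for b in mle])
print("Posterior mode (Cauchy prior): ", [round(b, 2) for b in map_est])
```

Running the maximum likelihood fit for more steps keeps increasing its slope, whereas the posterior mode is essentially unchanged, which is one simple sense in which the prior makes the parameters more stable.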


DRGs, Bayesian Analysis, Health Informatics, Clinical Coding, Statistical Modelling



This research is funded by the Capital Markets Cooperative Research Centre (CMCRC) Limited. We would like to express our gratitude to the anonymous reviewers whose suggestions improved the quality and clarity of the manuscript.

Compliance with ethical standards

Conflicts of interest

None of the authors has any potential conflict of interest.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. RMIT University, Melbourne, Australia
  2. Capital Markets CRC Limited, Sydney, Australia
  3. RMIT University, Melbourne, Australia
  4. Cabrini Institute, Malvern, Australia
