Background

Documentation of diagnoses is essential for communication between healthcare providers, but it also serves as the basis for risk adjustment in quality measurement and database research. Documented comorbidities can even determine reimbursement in some settings. Past studies have found that Veterans’ comorbidities are more often documented in fee-for-service (FFS) Medicare records than in Veterans Health Administration (VHA) records [1,2,3,4].

Although some evidence suggested Veterans were sicker when they received care in non-VHA facilities paid by Medicare [3], differing incentives across systems have been proposed as potential reasons for observed coding discrepancies. Non-VHA hospitals bill public payers (i.e., VHA, Medicare and Medicaid) and private insurance companies for distinct inpatient claims. Diagnoses and procedures included in these claims determine the size of payments so providers in non-VHA hospitals have direct incentives to code comprehensively. While VHA may bill Veterans’ private health insurance for care provided for nonservice-connected conditions [5], VHA facilities are largely funded through federal appropriations on a capitated basis per patient.

Coding practices within VHA, however, may have changed over time. The introduction of the Medical Center Allocation System [6], increased attention to risk-adjusted performance reports [7], and expansion of Clinical Documentation Improvement (CDI) programs may have led to more comprehensive coding [8]. Outside of VHA, efforts to improve documentation and coding practices had been taking place for decades [9, 10], and in 2010, the Affordable Care Act created multiple programs linking risk-adjusted quality measures to payment [11]. In addition, enrollment in private managed care plans grew significantly [12, 13], and risk-adjusted payments from public payers to these plans also incentivize comprehensive documentation [14].

Historically, comparisons of care in VHA and non-VHA facilities have been conducted to ensure that Veterans receive high quality care. Efforts are now underway to compare utilization and quality of care in non-VHA facilities that is paid for and provided on behalf of VHA. The Veterans Access, Choice, and Accountability Act of 2014 (Choice) and more recent Maintaining Internal Systems and Strengthening Integrated Outside Networks Act of 2018 (MISSION) allow some Veterans to receive covered services in non-VHA facilities in the community if VHA care does not meet access or quality standards [15]. Community care accounted for over $28 billion (24%) of the budget for medical care in 2023, so researchers and VHA policymakers seek to evaluate the quality of this purchased care. For fair comparisons between VHA-delivered and VHA-purchased care, systematic differences in documentation of comorbidities need to be recognized and accounted for.

Using a unique dataset linking VHA records to all-payer discharge data from state health agencies, we sought to determine whether comorbidity scores and severity levels associated with Veterans’ admissions varied significantly across settings and payers (including non-VHA hospitals paid by VHA, Medicare, Medicaid, commercial insurance, and other sources) from 2012 to 2017, a period that covered the early expansion of VHA-purchased care.

Methods

Study design

This retrospective study of repeated cross-sectional data examined comorbidity scores and severity levels associated with Veteran enrollees’ admissions in VHA and non-VHA hospitals in seven states.

Study sample

Veteran enrollees’ all-payer discharge data were obtained from state health agencies and linked to inpatient records from the VA Corporate Data Warehouse Inpatient Encounter files using personal identifiers. The resulting sample included VHA and non-VHA hospitalizations in seven states (AZ, FL, IL, MA, MO, NY, SC). Medicare Severity Diagnosis Related Group (DRG) information was missing from MO all-payer discharge data in years 2016–2017 so these admissions were excluded. Data for all other states included years 2012–2017. Patient-level sociodemographic characteristics, including age, sex, race/ethnicity, marital status, priority group (reflecting military service, disability, and income), and state of residence were obtained from the Assistant Deputy Under Secretary for Health Enrollment Files and the VA Observational Medical Outcomes Partnership Files. Hospital characteristics including number of staffed beds, academic affiliation, and for-profit status were obtained from the Veterans Integrated Service Network Support Services Center and Centers for Medicare and Medicaid Services (CMS) hospital cost reports.

To minimize selection bias of sicker patients being admitted in either setting, we included only Veterans admitted to both VHA and non-VHA hospitals in the same year for the same major diagnostic category (MDC). We also excluded transfers and readmissions (i.e., admissions that occurred within 30 days of discharge from a prior hospitalization).
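The sample restrictions described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the field names (`patient`, `admit`, `discharge`, `mdc`, `vha`), the toy records, and the order of the two steps (readmissions dropped first, then the both-settings restriction) are all assumptions. Transfers would need a separate flag that is not modelled here.

```python
from datetime import date

def drop_readmissions(admissions, window_days=30):
    """Drop admissions that start within `window_days` of the patient's prior discharge."""
    kept, last_discharge = [], {}
    for adm in sorted(admissions, key=lambda a: (a["patient"], a["admit"])):
        prev = last_discharge.get(adm["patient"])
        if prev is None or (adm["admit"] - prev).days > window_days:
            kept.append(adm)
        # Track the latest discharge seen, even when the admission itself is dropped.
        last_discharge[adm["patient"]] = adm["discharge"] if prev is None else max(prev, adm["discharge"])
    return kept

def restrict_to_both_settings(admissions):
    """Keep admissions whose patient-year-MDC group appears in both VHA and non-VHA hospitals."""
    settings = {}
    for adm in admissions:
        key = (adm["patient"], adm["admit"].year, adm["mdc"])
        settings.setdefault(key, set()).add(adm["vha"])
    return [a for a in admissions
            if settings[(a["patient"], a["admit"].year, a["mdc"])] == {True, False}]

# Hypothetical toy records (not study data).
admissions = [
    {"patient": 1, "admit": date(2014, 1, 5), "discharge": date(2014, 1, 9), "mdc": 5, "vha": True},
    {"patient": 1, "admit": date(2014, 1, 20), "discharge": date(2014, 1, 22), "mdc": 5, "vha": False},
    {"patient": 1, "admit": date(2014, 6, 1), "discharge": date(2014, 6, 4), "mdc": 5, "vha": False},
    {"patient": 2, "admit": date(2014, 3, 1), "discharge": date(2014, 3, 3), "mdc": 4, "vha": True},
    {"patient": 2, "admit": date(2014, 9, 1), "discharge": date(2014, 9, 2), "mdc": 5, "vha": False},
]
sample = restrict_to_both_settings(drop_readmissions(admissions))
```

In this toy input, the second admission for patient 1 is dropped as a readmission, and patient 2 is dropped because the VHA and non-VHA admissions fall under different MDCs.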

Dependent variables

As products of clinical status and coding intensity, the Elixhauser-van Walraven (E-VW) comorbidity score and use of the highest severity level within a DRG family (top DRG) for each admission were the primary outcomes in this study [16, 17]. For each admission, all recorded primary and secondary diagnoses were used to calculate the E-VW score. The E-VW score ranges from -19 to 89, and individual diseases have weights ranging from -7 to +12. For DRG families with multiple severity levels (e.g., complication or comorbidity, major complication or comorbidity), we specified whether each admission used the highest possible severity level. Admissions in DRG families with only one possible severity level were excluded from the DRG analysis.
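As a minimal sketch of how these two outcomes can be derived, the Python below sums van Walraven weights over the Elixhauser comorbidity groups flagged for an admission and marks whether an admission used the highest severity level in its DRG family. The weights are transcribed from the commonly cited van Walraven point system and should be checked against reference [16]; they reproduce the -19 to 89 range stated above. The category names, the ordinal severity coding (0 = no CC, 1 = CC, 2 = MCC), and the function names are illustrative assumptions, and the mapping from diagnosis codes to comorbidity groups is not shown.

```python
# van Walraven weights for the 30 Elixhauser comorbidity groups (verify against [16]).
VW_WEIGHTS = {
    "congestive_heart_failure": 7, "cardiac_arrhythmias": 5, "valvular_disease": -1,
    "pulmonary_circulation_disorders": 4, "peripheral_vascular_disorders": 2,
    "hypertension": 0, "paralysis": 7, "other_neurological_disorders": 6,
    "chronic_pulmonary_disease": 3, "diabetes_uncomplicated": 0,
    "diabetes_complicated": 0, "hypothyroidism": 0, "renal_failure": 5,
    "liver_disease": 11, "peptic_ulcer_disease": 0, "aids_hiv": 0, "lymphoma": 9,
    "metastatic_cancer": 12, "solid_tumor_without_metastasis": 4,
    "rheumatoid_arthritis": 0, "coagulopathy": 3, "obesity": -4, "weight_loss": 6,
    "fluid_electrolyte_disorders": 5, "blood_loss_anemia": -2,
    "deficiency_anemia": -2, "alcohol_abuse": 0, "drug_abuse": -7,
    "psychoses": 0, "depression": -3,
}

def evw_score(comorbidities):
    """Sum the weights of the distinct comorbidity groups present on an admission."""
    return sum(VW_WEIGHTS[c] for c in set(comorbidities))

def top_drg_flag(severity_level, family_levels):
    """True/False if the admission used the top severity level of its DRG family.

    Returns None for families with a single severity level, which the
    text above excludes from the DRG analysis.
    """
    if len(set(family_levels)) < 2:
        return None
    return severity_level == max(family_levels)

# Hypothetical admission: heart failure, obesity, and depression recorded.
score = evw_score(["congestive_heart_failure", "obesity", "depression"])
# Hypothetical three-level family; this admission was coded at the CC level.
flag = top_drg_flag(severity_level=1, family_levels=[0, 1, 2])
```

Because some weights are negative, an admission with several documented conditions can still score at or below zero, as in the toy admission above.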

Independent variables

The primary predictor of interest was the setting/payer of each Veteran’s admission: 1. VHA hospital, 2. non-VHA hospital covered by Medicare, 3. non-VHA hospital covered by Medicaid, 4. non-VHA hospital covered by commercial insurance, 5. non-VHA hospital covered by VHA, or 6. non-VHA hospital covered by other payers. Calendar year was also a predictor of interest as we wanted to know whether coding practices changed over time. Age, sex, marital status, Veteran priority group, and state were included as patient-specific sociodemographic characteristics. Admission dates were used to generate a categorical admission sequence variable. For each DRG, the major diagnostic category and an indicator of surgery were also included. Finally, number of staffed beds, academic affiliation, and for-profit status were included as hospital-specific characteristics.

Statistical analysis

The unit of analysis was the hospital admission. Using generalized linear models, we included setting/payer, categorical calendar year, and the interaction between these indicators as primary predictors. A Gaussian distribution was used for the E-VW model and a binomial distribution was used for the DRG model. In these models, we adjusted for age, sex, marital status, Veteran priority group (reflecting military service, disability, and income), state, admission sequence, MDC, an indicator of surgery, categorical hospital size [18], academic affiliation, and for-profit status. Linear, quadratic, and cubic terms were included for age and admission sequence to allow for non-linearity. Because each Veteran had multiple hospitalizations, standard errors were adjusted for clustering within patient.
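Written out, the two models share the same form of linear predictor. The notation below is ours rather than the authors', and the link functions are assumptions (identity for the Gaussian model and logit for the binomial model), since the text specifies only the distributions. Each model has its own coefficients, and the sums run over the non-reference levels of setting/payer and year.

```latex
% i indexes Veterans and j indexes admissions.
% S_{ij}: setting/payer; T_{ij}: calendar year; x_{ij}: adjustment covariates.
% Y_{ij}: E-VW score; D_{ij}: indicator that the admission used the top DRG.
\begin{align*}
\eta_{ij} &= \beta_0
  + \sum_{s} \beta_s \,\mathbf{1}[S_{ij}=s]
  + \sum_{t} \gamma_t \,\mathbf{1}[T_{ij}=t]
  + \sum_{s}\sum_{t} \delta_{st} \,\mathbf{1}[S_{ij}=s]\,\mathbf{1}[T_{ij}=t]
  + \mathbf{x}_{ij}^{\top}\boldsymbol{\theta} \\
\text{E-VW model:}\quad & \mathrm{E}[Y_{ij}] = \eta_{ij} \\
\text{DRG model:}\quad & \operatorname{logit}\,\Pr(D_{ij}=1) = \eta_{ij}
\end{align*}
```

The interaction coefficients \delta_{st} capture whether temporal trends differ by setting/payer, and clustering on i reflects that admissions from the same Veteran are not independent.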

For missing covariates, we carried forward patients’ last observed values if recorded previously. We then carried backward observed values if characteristics were available in later encounters. Remaining observations with missing covariates were removed (1.8% of admissions and 1.3% of Veterans). Statistical analyses were conducted using Stata, version 17 (StataCorp).
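The two-step imputation described above (carry forward, then carry backward) can be sketched as follows. This is an illustrative Python version, not the Stata code used in the study; it assumes one covariate at a time, with a patient's values ordered by encounter date and `None` marking a missing value.

```python
def fill_covariate(values):
    """Carry the last observed value forward, then the next observed value backward.

    `values` is one patient's covariate history in encounter order.
    Entries remain None only if the patient never had an observed value.
    """
    filled = list(values)

    # Step 1: last observation carried forward.
    last = None
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = last
        else:
            last = v

    # Step 2: next observation carried backward for leading gaps.
    nxt = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]
    return filled

# Hypothetical marital-status history across five encounters.
history = [None, "married", None, "divorced", None]
completed = fill_covariate(history)
```

Admissions whose covariates are still `None` after both passes correspond to the observations removed from the analysis.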

For interpretation of the results, we computed the predicted mean E-VW scores and predicted probabilities of using the top DRG for each setting/payer over time. To visually compare temporal trends between VHA and non-VHA hospitals, we plotted the predicted mean E-VW scores and probability of using the top DRG with non-VHA hospitalizations grouped.
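One common way to obtain predictions of this kind is marginal standardization: set every admission's setting/payer to a given level, predict from the fitted model, and average. The text does not state which approach was used, so the Python sketch below is an assumption about the general technique. The `toy_predict` function and its coefficients are hypothetical stand-ins for a fitted model.

```python
def predicted_mean_by_setting(rows, predict, settings):
    """Average model predictions with every row's setting fixed at each level in turn."""
    means = {}
    for s in settings:
        preds = [predict({**row, "setting": s}) for row in rows]
        means[s] = sum(preds) / len(preds)
    return means

def toy_predict(row):
    """Hypothetical fitted model: made-up intercept, age slope, and setting effect."""
    base = 4.0 + 0.02 * (row["age"] - 65)
    return base + (1.0 if row["setting"] == "non_vha_medicare" else 0.0)

# Hypothetical admissions (not study data).
rows = [{"age": 60, "setting": "vha"}, {"age": 70, "setting": "non_vha_medicare"}]
means = predicted_mean_by_setting(rows, toy_predict, ["vha", "non_vha_medicare"])
```

Averaging over the same covariate distribution for every setting keeps the comparison from being driven by differences in case mix between settings.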

Sensitivity analysis

We randomly selected a pair of VHA and non-VHA admissions for each Veteran, as Veterans could have multiple admissions in either setting, and calculated within-patient differences in E-VW comorbidity scores. We then used a generalized linear model to adjust within-patient differences in E-VW scores for admission sequence, year, and MDC. Patient-level demographic characteristics were not included in the model as they were not significant predictors of within-patient differences. Because Veterans could have multiple patient-years in the sample, standard errors were adjusted for clustering within patient.
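The pairing step can be illustrated with a short Python sketch. Several details are assumptions rather than statements from the text: pairs are drawn within each patient-year, the difference is taken as VHA minus non-VHA, and the field names (`evw`, `admit_seq`) and the fixed seed are hypothetical.

```python
import random

def paired_differences(admissions, seed=0):
    """Pick one VHA and one non-VHA admission per patient-year and difference their E-VW scores."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    groups = {}
    for adm in admissions:
        key = (adm["patient"], adm["year"])
        side = "vha" if adm["vha"] else "non_vha"
        groups.setdefault(key, {"vha": [], "non_vha": []})[side].append(adm)

    pairs = []
    for key in sorted(groups):
        g = groups[key]
        if g["vha"] and g["non_vha"]:  # need at least one admission in each setting
            v, n = rng.choice(g["vha"]), rng.choice(g["non_vha"])
            pairs.append({
                "patient": key[0],
                "year": key[1],
                "diff": v["evw"] - n["evw"],  # negative when the VHA score is lower
                "vha_first": v["admit_seq"] < n["admit_seq"],
            })
    return pairs

# Hypothetical toy records (not study data).
admissions = [
    {"patient": 1, "year": 2014, "vha": True, "evw": 3, "admit_seq": 1},
    {"patient": 1, "year": 2014, "vha": True, "evw": 4, "admit_seq": 3},
    {"patient": 1, "year": 2014, "vha": False, "evw": 5, "admit_seq": 2},
    {"patient": 2, "year": 2015, "vha": True, "evw": 6, "admit_seq": 1},
]
pairs = paired_differences(admissions)
```

Patient 2 contributes no pair because there is no non-VHA admission in that patient-year. The resulting `diff` values would then serve as the outcome in the adjusted model described above.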

Results

The sample included 23,594 Veterans (95% male; mean age 64.7) with 60,942 admissions. Approximately half (51%) of admissions were in VHA hospitals, followed by non-VHA hospitals paid by Medicare (21%), VHA (14%), commercial insurance (6%), Medicaid (3%), and other sources (6%). Admissions in non-VHA hospitals paid for by VHA included medical emergencies and cases in which VHA could not provide care based on availability of services or certain access criteria, including lengthy distances to a VHA hospital or long waiting times for care. The six most common major diagnostic categories accounted for 80% of all admissions: Circulatory System (31%), Respiratory System (14%), Mental Diseases and Disorders (13%), Alcohol/Drug Use or Induced Mental Disorders (8%), Nervous System (8%), and Digestive System (6%). Most admissions were non-surgical (86%).

Setting/payer and year were significant predictors (p ≤ 0.001) for both E-VW score and use of the top DRG. Medicare was associated with the highest predicted mean E-VW score at 5.71 (95% CI 5.56–5.85) and the highest probability of using the top DRG at 35.3% (95% CI 34.2%-36.4%); in contrast, the VHA mean comorbidity score was 4.44 (95% CI 4.34–4.55) and the probability of using the top DRG was 22.1% (95% CI 21.4%-22.8%). VHA admissions were consistently associated with the lowest mean comorbidity scores and lowest probability of using the most severe DRG levels (Table 1).

Table 1 Predicted mean Elixhauser-van Walraven scores and probability of using the most severe DRG level by setting/payer

Temporal trends across settings/payers, however, were similar with non-significant interactions between setting/payer and year for both metrics (p > 0.1). Across all systems/payers, the mean comorbidity score increased from 4.79 (95% CI 4.63–4.95) in 2012 to 5.18 (95% CI 5.01–5.35) in 2017. The overall probability of using the top DRG increased from 23.8% (95% CI 22.8%-24.9%) in 2012 to 32.8% (95% CI 31.6%-34.0%) in 2017. Overall VHA versus non-VHA trends showed persistent relative undercoding (Fig. 1).

Fig. 1

VHA versus non-VHA comorbidity scores and use of top DRG over time. Solid lines reflect Elixhauser-van Walraven scores, and dashed lines reflect the probability of using top DRG (%). VHA, Veterans Health Administration; DRG, Medicare Severity Diagnosis Related Group

In the sensitivity analysis limited to randomly selected pairs of VHA and non-VHA admissions for each patient, the marginal predicted mean within-patient difference in E-VW scores was -0.96 (95% CI -1.05 to -0.87). Calendar year was not a significant predictor of within-patient differences (p = 0.39), which demonstrated persistence of lower E-VW scores over time (Fig. 2). Admission sequence was a significant predictor of within-patient differences (p < 0.001), which were larger when the VHA admission preceded the non-VHA admission. Patient characteristics associated with the admissions in this sample are shown in Table 2.

Fig. 2

Predicted within-patient differences in Elixhauser-van Walraven scores

Table 2 Patient characteristics associated with randomly selected paired VHA and non-VHA admissions

Discussion

Our goal was to determine whether variation in documentation of comorbidities in VHA and non-VHA admissions had changed over time, as the increased emphasis on performance reporting and growth of CDI programs in VHA may have influenced coding behaviors. Systematic differences across settings of care are important to understand because documented comorbidities play an important role in risk adjustment for comparisons of quality and performance. Existing studies of VHA care consistently find that quality and safety are as good or better than in other settings [19], but performance advantages may be underestimated if relative undercoding causes VHA-reliant Veterans to appear healthier than peers who use non-VHA care.

Previous studies of documentation of comorbidities have been limited to VHA versus traditional FFS Medicare. Our data are unique in that they captured utilization under both FFS Medicare and Medicare Advantage (MA) plans as well as utilization covered by VHA, Medicaid, commercial insurance, and other payers. Previous comparisons also focused on distinct diagnoses and risk scores, whereas these data include DRGs, which have a direct relationship to hospital reimbursement. A study utilizing the Healthcare Cost and Utilization Project National Inpatient Sample found that use of the highest severity level increased over time for 15 of the top 20 most reimbursed DRG families despite reductions in risk-adjusted mortality [20]. This observed change in coding was associated with $1.2 billion in increased payments. While we found a small increase in E-VW scores and notable changes in use of top DRG severity levels in VHA hospitals over time, differences between VHA and non-VHA hospitals were persistent.

VHA launched a national CDI program in 2013, and half of VHA medical centers had implemented a CDI program by 2016 [21]. VA internal audits in 2016, however, found that correct evaluation and management (E/M) codes were only used in 60% of encounters nationwide [21]. VA officials attributed errors to lack of provider training, lack of emphasis on the importance of accurate coding, and lack of time for careful coding. The establishment and growth of VHA CDI programs likely contributed to the observed increase in E-VW scores and use of the top DRG over time; however, the persistent differences in these metrics between VHA and non-VHA hospitals suggest that different strategies may be necessary to improve uptake of best practices. Physicians are ultimately responsible for entering diagnosis codes and providing supporting documentation, but they may see these activities as distractions from clinical care. Consequently, the importance of buy-in and rapport between physicians and CDI specialists has been highlighted as a crucial factor for success [10]. One study examined the benefit of in-person, verbal communication between CDI specialists and clinicians and found significant improvements in time to resolution [22].

It is also possible that efforts to improve coding behaviors within VHA will not be sufficient to offset the effects of financial incentives on coding intensity outside VHA. To account for well-established differences in coding intensity observed between MA organizations and FFS providers, CMS applies a coding pattern adjustment, reducing risk-based payments to MA plans by 5.9% [23]. This adjustment is the minimum amount required by the American Taxpayer Relief Act of 2012, however, and larger adjustments may be warranted [24]. Methods to estimate an appropriate coding pattern adjustment for MA plans have been proposed, but there is no consensus regarding the best method [25]. Future studies comparing VHA and non-VHA risk-adjusted quality and performance may consider developing risk-score adjustment methods that reduce bias from differences in coding intensity between VHA care and non-VHA care. Compared to other Veterans, those who use VHA have a higher prevalence of chronic physical and mental health conditions [26]. VHA-reliant Veterans also have lower self-reported health [27]. Consequently, such an adjustment would likely be conservative.

Limitations

Our study was limited to seven states with data spanning 2012 to 2017, so these findings may not be representative of current admissions on a national level. Although we adjusted for admission sequence to account for clinical progression, Veterans may receive non-VHA care when they are sicker. Of the Veterans admitted twice in the same year, 15–17% were admitted to both VHA and non-VHA hospitals. The study sample may not be generalizable to the general population of Veterans with hospitalizations; however, restricting the sample to Veterans admitted in both settings under the same major diagnostic category allowed us to remove some selection bias. Finally, we did not have access to outpatient diagnoses. Billing for outpatient E/M services, however, creates similar incentives for non-VHA clinicians to code diagnoses intensively, as the number and complexity of problems affect the level of service and reimbursement.

Conclusion

These data suggest that relative undercoding has persisted in VHA and highlight differences between VHA hospitals that are funded as part of an integrated delivery system and non-VHA hospitals that may document comorbidity for payment. While risk scores and DRG severity increased in VHA over time, the difference between VHA and non-VHA hospitals was persistent, suggesting that internal efforts to improve documentation may not be sufficient. Future studies may need to apply alternative strategies, including coding pattern adjustments, to ensure fair comparisons of quality and performance.