Understanding the biases to sepsis surveillance and quality assurance caused by inaccurate coding in administrative health data

Purpose Timely and accurate data on the epidemiology of sepsis are essential to inform policy decisions and research priorities. We aimed to investigate the validity of inpatient administrative health data (IAHD) for surveillance and quality assurance of sepsis care. Methods We conducted a retrospective validation study in a disproportional stratified random sample of 10,334 inpatient cases of age ≥ 15 years treated in 2015–2017 in ten German hospitals. The accuracy of coding of sepsis and risk factors for mortality in IAHD was assessed compared to reference standard diagnoses obtained by a chart review. Hospital-level risk-adjusted mortality of sepsis as calculated from IAHD information was compared to mortality calculated from chart review information. Results ICD-coding of sepsis in IAHD showed high positive predictive value (76.9–85.7% depending on sepsis definition), but low sensitivity (26.8–38%), which led to an underestimation of sepsis incidence (1.4% vs. 3.3% for severe sepsis-1). Not naming sepsis in the chart was strongly associated with under-coding of sepsis. The frequency of correctly naming sepsis and ICD-coding of sepsis varied strongly between hospitals (range of sensitivity of naming: 29–71.7%, of ICD-diagnosis: 10.7–58.5%). Risk-adjusted mortality of sepsis per hospital calculated from coding in IAHD showed no substantial correlation to reference standard risk-adjusted mortality (r = 0.09). Conclusion Due to the under-coding of sepsis in IAHD, previous epidemiological studies underestimated the burden of sepsis in Germany. There is a large variability between hospitals in accuracy of diagnosing and coding of sepsis. Therefore, IAHD alone is not suited to assess quality of sepsis care. Supplementary Information The online version contains supplementary material available at 10.1007/s15010-023-02091-y.

1 Supplemental Methods

Sampling strategy
A sample of 1,200 hospital episodes per hospital was drawn by the study's epidemiologist. Since the rate of sepsis among hospital cases was estimated at only about 2% [1][2][3], disproportional stratified sampling was used to increase the proportion of "true" sepsis cases in the sample. The strata were defined by the cross-tabulation of the following criteria: a) presence of a procedure code (Operationen- und Prozedurenschlüssel; OPS) for complex intensive care treatment (yes vs. no, OPS-code: 8-890); b) hospital length of stay (≤ 6 days vs. > 6 days); c) year of discharge (2015 to 2017). The same number of cases was sampled from each of the resulting 12 strata. The strata were chosen based on the experiences from a single-centre pilot study, where the rate of cases with sepsis-1 in a sample obtained by this method was 16% [2].
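The stratification described above can be illustrated with a minimal Python sketch. It enumerates the 12 strata, allocates the 1,200 episodes per hospital equally, and derives inverse-probability sampling weights. The function names and the population counts in the usage note are hypothetical and only serve as an illustration; the study's own implementation is not described here.

```python
from itertools import product

# Stratification criteria as named in the sampling description.
ICU_PROCEDURE_CODE = (True, False)   # OPS code for complex intensive care present?
LONG_STAY = (True, False)            # hospital length of stay > 6 days?
DISCHARGE_YEARS = (2015, 2016, 2017)


def build_strata():
    """Cross-tabulate the three criteria into 2 x 2 x 3 = 12 strata."""
    return list(product(ICU_PROCEDURE_CODE, LONG_STAY, DISCHARGE_YEARS))


def equal_allocation(n_total, strata):
    """Assign the same number of sampled cases to every stratum."""
    if n_total % len(strata) != 0:
        raise ValueError("n_total must be divisible by the number of strata")
    per_stratum = n_total // len(strata)
    return {stratum: per_stratum for stratum in strata}


def sampling_weights(population_counts, sample_counts):
    """Inverse-probability weight per stratum: population size / sample size."""
    return {s: population_counts[s] / sample_counts[s] for s in sample_counts}
```

For example, `equal_allocation(1200, build_strata())` yields 100 cases per stratum; with a hypothetical stratum of 500 eligible episodes, each sampled case in that stratum would carry a weight of 5.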

Sample size calculation
The sample size was calculated with regard to the primary endpoint of the study: the sensitivity of the coding of sepsis-1 with organ dysfunction (ICD-10-GM codes R65.1 and R57.2) in IAHD. In the pilot study, the sensitivity was estimated to be 0.39 [2]. To estimate sensitivity with a 95% confidence interval of width ±0.03, a sample of about 850 true sepsis cases with organ dysfunction according to the judgement of record data is necessary. In our pilot study, the rate of such cases in the sample was 8.6%, resulting in a necessary total sample of about 10,000 hospital episodes.
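The arithmetic behind these figures can be checked with a short Python sketch. The interval method used for the original calculation is not stated; the Wald (normal approximation) interval below is an assumption, under which 850 cases give a half-width of roughly 0.03. The function names are illustrative.

```python
import math


def wald_half_width(p, n, z=1.96):
    """Half-width of a Wald 95% confidence interval for a proportion.

    Assumption: normal approximation; the source does not name its method.
    """
    return z * math.sqrt(p * (1 - p) / n)


def total_sample_size(n_target_cases, case_rate):
    """Episodes needed so that the expected number of true cases reaches the target."""
    return math.ceil(n_target_cases / case_rate)


half_width = wald_half_width(p=0.39, n=850)                         # about 0.033
n_total = total_sample_size(n_target_cases=850, case_rate=0.086)    # 9884
```

Dividing the 850 required cases by the pilot rate of 8.6% gives 9,884 episodes, which matches the stated total of about 10,000.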

Linkage
IAHD were pseudonymized within the participating hospitals; hospitals kept a list linking internal case identification numbers with the pseudonym to identify the randomly selected charts. To assure correct linkage, patients' age, gender, and exact time of admission and discharge obtained from IAHD were provided in the list of selected cases. The pseudonym was used to identify the cases of the validation sample within the eCRF and thereby to link the data from medical records with the information obtained from the IAHD. The linkage was conducted by the study's epidemiologist at the Jena University Hospital. The quality of the linkage was evaluated by comparing demographic information between the IAHD and the eCRF data.
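A linkage check of this kind can be sketched in Python as follows. The record layout (dictionaries with the keys `pseudonym`, `age`, and `sex`) is a hypothetical simplification and not the study's actual data format.

```python
def check_linkage(iahd_rows, ecrf_rows, fields=("age", "sex")):
    """Link eCRF records to IAHD records by pseudonym and compare demographics.

    Returns three lists of pseudonyms: consistent links, eCRF records without
    an IAHD counterpart, and links whose demographic fields disagree.
    """
    iahd_by_id = {row["pseudonym"]: row for row in iahd_rows}
    matched, unmatched, mismatched = [], [], []
    for row in ecrf_rows:
        pid = row["pseudonym"]
        reference = iahd_by_id.get(pid)
        if reference is None:
            unmatched.append(pid)
        elif any(reference[f] != row[f] for f in fields):
            mismatched.append(pid)
        else:
            matched.append(pid)
    return matched, unmatched, mismatched
```

Cases listed as unmatched or mismatched would then be candidates for manual review.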

Training and assessment of interrater agreement
Cases with sepsis in the validation sample were identified by trained study physicians in a chart review conducted in the respective study centre. Study physicians worked in the respective hospitals and were either examined intensivists or supervised by an examined intensivist. To train study physicians to identify cases with sepsis from medical records, 40 cases were sampled per study centre, including 20 cases with coded sepsis with organ dysfunction or septic shock (ICD-10-GM codes R65.1 or R57.2), 10 cases with coding of any other infection, and 10 cases without any infection code. Based on a written working instruction and a training session with the coordinating study physician, all local study physicians (at least two were required) of the centres reviewed and discussed each of the 40 cases, of which five were monitored by the coordinating study physician. After the training, a second sample of 40 cases was provided to assess the objectivity of the chart review process. These cases were reviewed independently by two trained study physicians, and information on sepsis criteria was documented in an eCRF. Interrater agreement was calculated by Gwet's AC1, a robust alternative to Cohen's κ [4]. The target value for sufficiently good agreement was set to > 0.6 [5]. High agreement was found both for identification of sepsis-1 with organ dysfunction (AC1 = 0.89, 95% CI: 0.83–0.94) and for sepsis-3 (AC1 = 0.87, 95% CI: 0.82–0.93); the target value was surpassed in all study centres.
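For illustration, Gwet's AC1 for two raters can be computed as in the following Python sketch. It uses the standard two-rater formula, in which chance agreement is derived from the average marginal proportion of each category. It is not the software used in the study, and the ratings in the usage note are invented.

```python
def gwet_ac1(ratings_a, ratings_b, categories=(0, 1)):
    """Gwet's AC1 agreement coefficient for two raters.

    ratings_a, ratings_b: equally long sequences of category labels.
    categories: all possible labels (at least two).
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same, non-empty set of cases")
    if len(categories) < 2:
        raise ValueError("at least two categories are required")
    n = len(ratings_a)
    q = len(categories)
    # Observed agreement: share of cases with identical ratings.
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement based on the average marginal proportion per category.
    p_e = 0.0
    for k in categories:
        pi_k = (list(ratings_a).count(k) + list(ratings_b).count(k)) / (2 * n)
        p_e += pi_k * (1 - pi_k)
    p_e /= (q - 1)
    return (p_a - p_e) / (1 - p_e)
```

With the invented ratings `[1, 1, 0, 0, 1, 0, 0, 0, 0, 0]` and `[1, 1, 0, 0, 0, 0, 0, 0, 0, 0]`, observed agreement is 0.9, chance agreement is 0.375, and AC1 is 0.84.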

Data cleaning
Since the IAHD are used for billing, they are highly standardized and hospitals invest efforts to guarantee the correctness of the documented data. Pseudonymization was done within the participating hospitals using the 3M™ Cryptowizard, a standalone software that allows user-friendly pseudonymization of the IAHD in a point-and-click interface.
The eCRF included several methods to foster correct documentation of data: most items included a separate category to indicate missing information ("unknown"); conditional rules were used to implement nested items; and a data manager checked the completeness of documentation and managed queries together with the local study nurses. To guarantee correct documentation of sepsis criteria, active feedback was implemented in the eCRF: after the criteria were documented by the study physician, the eCRF presented which sepsis categories would apply to the case based on the documentation. The study physician then had to actively confirm this categorization or could correct the documentation of sepsis criteria if any inconsistencies were apparent. The technical aspects of the eCRF were extensively pre-tested before the main study was conducted. Cleaning and data preparation were conducted by the study's epidemiologist; "unknown" categories of variables were set to missing values.

Statistical analysis using survey methods
The R package survey was used to calculate relative frequencies and logistic regressions for complex data [6]. Missing values due to lacking information in medical records were handled by missing-data adjusted sampling weights to prevent bias by over- or underrepresentation of strata [7]. Classification trees were calculated using the R package rpart, which also allows sampling weights to be taken into account [8]. Weighted correlations between variables obtained from chart review and variables obtained from IAHD (comorbidity indices, predicted risk from risk models) were calculated using the R package jtools [9]. The bivariate relations between these variables were visualized by contour plots, which were created using two-dimensional weighted kernel density estimates via the package ks [10].
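To make the notion of a weighted correlation concrete, the following plain-Python sketch computes a weighted Pearson correlation from sampling weights. It illustrates the general formula only and is not the implementation used in the R package jtools; the data in the usage note are invented.

```python
import math


def weighted_mean(values, weights):
    """Mean of values, each weighted by its sampling weight."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_correlation(x, y, weights):
    """Weighted Pearson correlation between x and y."""
    mean_x = weighted_mean(x, weights)
    mean_y = weighted_mean(y, weights)
    cov = sum(w * (a - mean_x) * (b - mean_y) for a, b, w in zip(x, y, weights))
    var_x = sum(w * (a - mean_x) ** 2 for a, w in zip(x, weights))
    var_y = sum(w * (b - mean_y) ** 2 for b, w in zip(y, weights))
    return cov / math.sqrt(var_x * var_y)
```

With equal weights the result equals the ordinary Pearson correlation, for example 0.5 for `x = [1, 2, 3]` and `y = [1, 3, 2]`.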

Definition of sepsis for the chart review
In the following, we present the CRF items for the documentation of sepsis during the chart review as well as the working instruction. The complete CRF has been published along with the study protocol [11]. CRF items are given in sans serif font with red colour text presenting a filter criterion and blue colour text presenting a multiple-choice option.

3 clinically suspected (increased infection levels, fever)

Document the degree of confirmation for the infection. If more than one infection was present, refer to the infection that most probably caused sepsis. If it is not possible to decide which infection caused sepsis, then report the highest degree of confirmation.

Definition of infection
• An infection shall be regarded as microbiologically proven if a relevant infectious agent was proven within an appropriate timeframe (collection within 24 hours before and 24 hours after onset of infection or ICU admission) and has a causal link to the infection (allocation to a source of infection, no exogenous contamination of the sample, typical pathogen spectrum).
• Other confirmations of infection can be radiological findings with clinical symptoms, a surgical source control, abnormal urine status, or comparable findings.
• Infections are considered clinically suspected in patients when only nonspecific or indirect evidence is present, such as elevated infection levels and/or fever.

Please document data of infection-related SIRS criteria only for the simultaneous occurrence of at least 2 criteria within a 24-hour time window. They should only be checked if the SIRS criteria exist due to infection.

Definition of systemic inflammatory response syndrome (SIRS) according to sepsis-1 definition
• Tachycardia is defined as a heart rate ≥90/min that cannot be explained by other clinical causes (for example, tachycardia due to volume deficiency, which is regressive after volume substitution).
• Tachypnea is defined as a respiratory rate ≥20/min or an arterial paCO2 ≤4.3 kPa/33 mmHg. Ventilation includes any form of controlled or assisted ventilation. The only exceptions are the application of CPAP (continuous positive airway pressure) or NIV (noninvasive ventilation) for respiratory exercise.
• Leukocytosis/leukopenia/left shift means a leukocyte count ≥12000/µl or ≤4000/µl or more than 10% immature neutrophil granulocytes in the differential blood count.
• The core body temperature is to be used to indicate the temperature. Core temperature can be measured rectally, sublingually, via a central catheter, via a bladder catheter, or tympanically. When measuring an axillary temperature, 0.5°C is added to the measured value.
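The numeric thresholds above can be summarised in a minimal Python sketch. The temperature thresholds (≥38.0°C or ≤36.0°C) are not given in this section and are assumed from the common sepsis-1 definition. The sketch only checks values; the clinical judgement on whether a finding is infection-related or explained by other causes, and the 24-hour window, are not modelled. All names are illustrative.

```python
def core_temperature(measured_celsius, site):
    """Add 0.5 degrees Celsius to axillary measurements, as in the working instruction."""
    return measured_celsius + 0.5 if site == "axillary" else measured_celsius


def sirs_flags(heart_rate, resp_rate, leukocytes_per_ul, immature_pct,
               temp_celsius, temp_site="rectal", paco2_mmhg=None):
    """Return a dict of SIRS criteria fulfilled by the given values."""
    temp = core_temperature(temp_celsius, temp_site)
    hyperventilation = paco2_mmhg is not None and paco2_mmhg <= 33
    return {
        "tachycardia": heart_rate >= 90,
        "tachypnea": resp_rate >= 20 or hyperventilation,
        "leukocytes": (leukocytes_per_ul >= 12000 or leukocytes_per_ul <= 4000
                       or immature_pct > 10),
        # Assumption: thresholds from the common sepsis-1 definition.
        "temperature": temp >= 38.0 or temp <= 36.0,
    }


def sirs_fulfilled(flags):
    """At least 2 criteria must be present simultaneously."""
    return sum(flags.values()) >= 2
```

For example, a heart rate of 95/min, a leukocyte count of 13000/µl, and an axillary temperature of 37.6°C fulfil three criteria.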

if 1a = yes: infection-related organ dysfunction
a. After the onset of infection, did criteria referring to a new onset of infection-related organ dysfunction occur?

arterial hypotension (confirmation of infection and > 1 h systolic arterial BP ≤90 mmHg or MAP ≤70 mmHg or vasopressor administration to maintain target systolic BP of ≥90 mmHg or MAP ≥70 mmHg; despite adequate volume resuscitation and not explainable by other causes)
BP: blood pressure, MAP: mean arterial pressure

Please indicate any new onset of infection-related organ dysfunction or significant worsening of pre-existing organ dysfunction. All organ dysfunctions that already existed at the time of onset of infection, severe sepsis or septic shock and are attributable to another cause are not documented here (e.g. chronic kidney failure in diabetes mellitus or thrombocytopenia after trauma). Unknown is to be selected if values are unknown or have not been collected.

Acute encephalopathy
Impaired vigilance, disorientation, agitation or delirium as a result of infection must be documented. If the patient's vigilance or orientation is reduced due to other causes and/or if the patient is sedated, indicate "no" for this organ dysfunction.

Thrombocytopenia
If the platelet count is reduced due to an underlying disease, chemotherapy, or an immunological cause, or if it is a consequence of acute bleeding, this organ dysfunction is "no". If the platelet count is significantly worsened by the severe sepsis (30% reduction in platelet count), septic organ dysfunction is present.

Arterial hypoxemia
Manifest heart or lung disease must be excluded as a cause. The lowest oxygenation index (= Horovitz quotient, the worst PaO2/FiO2 ratio) is asked for. In case of continuous documentation of the oxygenation index (patient with ventilation), please indicate the lowest value. In case of automatic calculation or data transfer from blood gas analyzer and/or ventilator, ensure that only arterial (no venous) blood gas analyses are used and that the current FiO2 at the time of sampling is taken into account. If no documentation of the oxygenation index is available, please always use the values of an arterial (capillary) blood gas analysis and the FiO2 from the same point in time to calculate the oxygenation index.

Calculation of oxygenation index/Horovitz quotient:
To do this, one must determine the arterial partial pressure of oxygen (PaO2) in the blood by means of a blood gas analysis, for example, and divide this by the inspiratory oxygen concentration, i.e. the oxygen concentration of the inhaled air (FiO2).

$$\text{Oxygenation index} = \frac{\mathrm{PaO_2}\ [\text{mmHg}]}{\mathrm{FiO_2}}$$

Note that the PaO2 in kPa must first be converted to mmHg to determine the oxygenation index. Use the factor 7.5 for the conversion.
If arterial (capillary) blood gas analysis is not available, oxygen saturation SpO2 and oxygen delivery can be used to calculate the oxygenation index. Use the following two conversion tables to determine the calculated PaO2 and estimated FiO2.
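The calculation of the oxygenation index from a measured PaO2, including the conversion from kPa, can be sketched in Python as follows. The sketch assumes that FiO2 is given as a fraction between 0.21 and 1.0; the function and parameter names are illustrative. The SpO2-based estimation is not covered.

```python
KPA_TO_MMHG = 7.5  # conversion factor given in the working instruction


def oxygenation_index(pao2, fio2, unit="mmHg"):
    """Horovitz quotient: PaO2 in mmHg divided by FiO2.

    Assumption: fio2 is a fraction (0.21 for room air up to 1.0).
    """
    if not 0.21 <= fio2 <= 1.0:
        raise ValueError("fio2 must be a fraction between 0.21 and 1.0")
    if unit == "kPa":
        pao2_mmhg = pao2 * KPA_TO_MMHG
    elif unit == "mmHg":
        pao2_mmhg = pao2
    else:
        raise ValueError("unit must be 'mmHg' or 'kPa'")
    return pao2_mmhg / fio2
```

A PaO2 of 10 kPa corresponds to 75 mmHg; at an FiO2 of 0.5 this gives an oxygenation index of 150.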

Renal dysfunction
A diuresis of ≤0.5 ml/kg/h for at least 2 h despite adequate volume substitution and/or a more than twofold increase in serum creatinine above the locally usual reference range is considered as renal organ dysfunction. If creatinine is chronically elevated or the patient had a pre-existing dependence on renal replacement therapy, indicate "no" for this organ dysfunction unless there is a significant deterioration in renal function with a decrease in self-diuresis below the specified value of ≤0.5 ml/kg/h for at least 2 h despite adequate volume substitution.

Metabolic acidosis
Metabolic acidosis is defined by a base excess of ≤ -5 mmol/l or a lactate concentration that is >1.5 times above the locally usual reference range. If acidosis is present due to other respiratory or metabolic causes, indicate "no" for this organ dysfunction.

Arterial hypotension (septic shock according to sepsis-1 definition)
If arterial hypotension exists with systolic blood pressure ≤90 mmHg or mean arterial pressure of ≤70 mmHg for at least 1 hour or if administration of vasopressors (dopamine at least 5 µg/kg/min, epinephrine, norepinephrine, phenylephrine, or vasopressin in any dose) is required to maintain systolic blood pressure of at least 90 mmHg or mean arterial pressure of at least 70 mmHg, indicate "yes." Note that adequate hydration must have been provided and other causes of shock excluded.
To determine the estimated FiO2, the oxygen delivery is determined either via a nasal probe, nasopharyngeal catheter, or face mask in l/min and the estimated FiO2 is read in the table.

Presence of organ dysfunction according to sepsis-3 definition
(33-101) 6.0-11.92.0-5.9 (33-101) 6.0-11.9Check whether there was an infection-related increase in SOFA score of at least 2 points at any time during the stay after the onset of infection.Document the SOFA score for the time BEFORE the onset of the first infectionrelated SOFA increase by ≥2 points and also AFTER the first infection-related SOFA increase ≥ 2 points.For nervous system assessment, please use the Glasgow Coma Scale.

if 1a = yes: infection-related criteria for septic shock (sepsis-3)
a. After onset of infection: Were the septic shock criteria according to sepsis-3 present simultaneously?
(increase in serum lactate to > 2 mmol/l; persistent hypotension demanding vasopressor administration to maintain MAP ≥65 mmHg)
The criteria must have been present simultaneously (in a 24 h interval). For the increase in lactate, only an infection-related increase to > 2 mmol/l should be reported. When assessing hypotension with vasopressor therapy, it is important to note that adequate hydration occurred and other causes of shock were excluded. If information evaluating one or both criteria is not available in the record, indicate as "not measured/unknown." If one of the two values was normal (no hypotension or no increase in serum lactate) and the other value is missing, then the category "None of the criteria was present" should be checked (for explanation: if one criterion was certainly not present, then the criterion for shock is not fulfilled in any case; whether the information on the other criterion is missing does not matter).
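The handling of missing information described above amounts to a three-valued "and" of the two criteria. A minimal Python sketch, with `None` standing for "not measured/unknown", could look as follows. The function name and the return convention are illustrative and do not reproduce the eCRF's actual answer categories.

```python
from typing import Optional


def septic_shock_sepsis3(lactate_increase: Optional[bool],
                         vasopressor_hypotension: Optional[bool]) -> Optional[bool]:
    """Combine the two sepsis-3 shock criteria while respecting missing values.

    True:  both criteria were present.
    False: at least one criterion was certainly absent (the other may be unknown).
    None:  shock cannot be ruled in or out from the available information.
    """
    if lactate_increase is False or vasopressor_hypotension is False:
        return False
    if lactate_increase is None or vasopressor_hypotension is None:
        return None
    return True
```

For example, `septic_shock_sepsis3(False, None)` returns `False`, because one certainly absent criterion already rules out shock, whereas `septic_shock_sepsis3(True, None)` returns `None`.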

Definition of variables in administrative health data

3.1 Explanation
Analyses were based on data in the format according to §21 KHEntgG, which defines the format of data used for billing of hospitals in the German DRG system. These data provide information in different data tables in CSV files. The following data tables were used: a) "FALL": general information on the case; b) "ICD": information on codes according to the German Modification of the ICD-10; c) "OPS": information on the codes for surgeries and procedures in Germany ("Operationen- und Prozedurenschlüssel"). The following sections describe the definition of variables that were based on these different data tables, as well as variables that were calculated using previously defined variables.

Variables based on ICD-codes
The following variables were defined based on the ICD-codes in data

Variables based on OPS-codes
The following variables were defined based on the OPS-codes in data table "OPS". Each variable was defined as an indicator variable (0: condition not present, 1: condition present), if at least one of the listed OPS-codes was present. If OPS-codes do not present the full number of possible characters, all subordinate OPS-codes are included.

3.5 Variables calculated using previously defined variables
The following variables were calculated using previously defined variables given above. Definitions use the syntax of the statistical software R.
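The rule for OPS-based indicator variables described above (a listed code that is shorter than the full code length also covers all subordinate codes) can be sketched as a prefix match. The sketch is written in Python rather than in the R syntax of the original definitions, and the subordinate codes in the usage note are hypothetical examples.

```python
def indicator(case_codes, listed_codes):
    """Return 1 if any code of the case equals or is subordinate to a listed code, else 0.

    A listed code without the full number of characters matches every code
    that starts with it (prefix match).
    """
    normalized = [code.strip().upper() for code in case_codes]
    prefixes = [code.strip().upper() for code in listed_codes]
    return int(any(code.startswith(prefix)
                   for code in normalized
                   for prefix in prefixes))
```

With the listed code `8-890`, a case carrying the hypothetical subordinate code `8-890.1` receives the value 1, while a case carrying only `5-469` receives 0.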

Design
A retrospective observational study was conducted based on national IAHD to assess sepsis incidence in Germany for the year 2017.

Setting
In Germany, hospitals are reimbursed based on a diagnosis-related groups (DRG) system. Every year a standardized data set is transferred to the federal Institute of Hospital Reimbursement (Institut für das Entgeltsystem im Krankenhaus; InEK) by every hospital providing acute care (legal base: §21 KHEntgG). These data are passed to the Federal Bureau of Statistics and can be analysed for research purposes [12].

Sample
The same inclusion criteria as described for the validation study (inpatient cases, DRG-billing, age of at least 15 years) are applied to German national IAHD of the year 2017.

Procedure
National IAHD are hosted by the Federal Bureau of Statistics and can be accessed via a form of remote data processing. Based on completely anonymized sample data files, statistical syntaxes are written and sent to the Federal Bureau, where they are applied to the original data files. Output files are then transferred back to the researcher.

Statistical analysis
Sepsis incidence was calculated by obtaining the number of hospital episodes with ICD-10 coded severe sepsis-1 (ICD-10-GM codes R65.1 or R57.2) and dividing it by the size of the German population aged ≥ 15 years within the same year. The size of the population was obtained from the GENESIS database, which is also provided by the Federal Bureau of Statistics.

Results
Overall, 17,088,417 inpatient, DRG-billed hospital episodes of patients ≥ 15 years of age were documented in the national DRG statistics for 2017. Explicit ICD-10-GM codes for severe sepsis-1 were present in 148,288 (0.87%) of these cases. Based on the GENESIS database, Germany had 71.6 million inhabitants aged 15 years or older in 2017. This led to an estimate for the incidence of severe sepsis-1 of 207/100,000 inhabitants in this age spectrum. The hospital mortality of coded severe sepsis-1 was 40.3% (N = 59,792 deaths).
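The reported figures follow directly from the counts given above, as the following short Python check illustrates. The variable names are illustrative.

```python
# Counts reported for 2017.
episodes = 17_088_417        # inpatient, DRG-billed episodes, age >= 15 years
sepsis_cases = 148_288       # episodes with codes R65.1 or R57.2
deaths = 59_792              # in-hospital deaths among coded cases
population = 71_600_000      # inhabitants aged 15 years or older


def per_100k(cases, population_size):
    """Incidence proportion per 100,000 inhabitants."""
    return cases / population_size * 100_000


share_of_episodes_pct = sepsis_cases / episodes * 100   # about 0.87
incidence = per_100k(sepsis_cases, population)           # about 207
mortality_pct = deaths / sepsis_cases * 100              # about 40.3
```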

SFig. 1 Study flow chart.
IAHD: inpatient administrative health data. Explanation of the chart review procedure: N = 1,000 cases were to be documented per hospital; N = 1,200 cases were sampled to allow for unavailable charts; to assure representativeness and avoid bias by learning effects, the review of charts was conducted in random order.

SFig. 2 Predictive accuracy of explicit codes for infection and sepsis in inpatient administrative health data.
Estimates adjusted for sampling weights and clustering. P-values obtained by Rao-Scott Pearson χ2-test with Satterthwaite approximation. Estimates are presented as relative frequencies (%) along with their 95% confidence intervals and were calculated with adjustment for sampling weights and clustering.
STable 3 Accuracy of identification of cases with severe sepsis-1 by indirect coding abstraction strategies. Estimates are presented as relative frequencies (%) along with their 95% confidence intervals and were calculated with adjustment for sampling weights and clustering.
If 4a = yes: list the SOFA values referring to the respective organ systems for the time point AFTER the first infection-related SOFA-score increase of at least 2 pt.

4. if 1a = yes: infection-related increase in SOFA score ≥2 points
If 4a = yes: list the SOFA values referring to the respective organ systems for the time point PRIOR TO the first infection-related SOFA score increase of at least 2 pt.
National IAHD have been used previously to calculate yearly incidence proportions of sepsis [1,3]. For the year 2015, Fleischmann et al. identified 136,542 cases with ICD-codes for severe sepsis-1 among all hospitalizations represented in the national DRG statistics, corresponding to an incidence of 158 per 100,000 inhabitants [3].