Background

Healthcare-associated infections (HAIs) pose a major burden on the healthcare system and result in increased morbidity, mortality, prolonged hospital stay, and additional costs [1,2,3]. Each year, HAIs affect nearly four million patients in acute care hospitals in Europe, and surgical site infections (SSIs) account for around 18% of all HAIs, annually affecting more than 500,000 patients [3]. After colorectal surgery, up to 30% of patients or more develop an SSI [4].

Continuous surveillance with feedback to healthcare personnel and stakeholders is essential to allocate the required resources and to assess the effect of interventions to prevent HAIs. Traditional HAI surveillance is often based on time-consuming and resource-intensive manual review of patient records, which is also prone to subjective interpretation and surveillance bias [5,6,7]. Automated surveillance methods that re-use electronic health record (EHR) data are being developed and are considered an attractive alternative to manual surveillance, as they reduce workload and generate standardised and continuous surveillance results [6, 7]. However, surveillance algorithms need to be thoroughly validated before being implemented in a clinical setting, and their transferability to countries with EHR systems and data management practices that differ from those of the country of development needs to be assessed before implementation in new settings.

The aim of this study was to externally validate, in a Swedish setting, two existing semi-automated surveillance algorithms for deep SSI after colorectal surgery that were originally developed on Spanish and Dutch data [8, 9].

Methods

This retrospective study used EHR data from Karolinska University Hospital (KUH) stored in the 2SPARE (2020 started Stockholm/Sweden Proactive Adverse Events REsearch) database. KUH is a tertiary care academic center with 1,100 beds divided between two hospital sites (Huddinge and Solna), serving the population of Region Stockholm (2.3 million inhabitants). The study was approved by the Regional Ethical Review Board in Stockholm (no. 2018/1030-31).

With semi-automated surveillance, patients are divided into low- and high-probability cases: low-probability cases are automatically regarded as having no SSI, while high-probability cases undergo manual record review to determine SSI status [6]. Two existing semi-automated classification algorithms for deep SSI and/or organ/space SSI, hereafter jointly referred to as deep SSI, were validated (Fig. 1):

  1. Original classification algorithm of van Rooden et al., i.e., the probability of having a deep SSI based on (re)admission and discharge data, mortality, reoperations, radiology orders, antibiotic prescriptions, and microbiology cultures [8];

  2. Adapted classification algorithm according to Verberk et al., i.e., the probability of having a deep SSI based on the original classification algorithm without the microbiology component [9].

The algorithms’ performance was assessed in a validation cohort of 225 colorectal surgeries selected by simple random sampling from the 2,675 surgeries performed between January 1, 2015 and August 31, 2020 (Fig. 1). Patient and surgery characteristics were recorded. The outcome of interest and gold standard was deep SSI versus no deep SSI (no SSI or only superficial SSI) within 30 days after colorectal surgery, as annotated by two experienced infection control practitioners (ICPs) according to the European Centre for Disease Prevention and Control (ECDC) SSI definitions and guidelines [10]. Twenty cases were reviewed by both ICPs, resulting in almost perfect agreement (95%) with a Cohen’s kappa of 0.87 for SSI classification. Both ICPs were blinded to the algorithm results.
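For reference, Cohen’s kappa corrects the observed agreement p_o for the agreement expected by chance p_e. Assuming the reported 95% is the raw observed agreement, the reported kappa of 0.87 implies a chance-expected agreement of roughly 0.6:

$$\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad 0.87 = \frac{0.95 - p_e}{1 - p_e} \;\Rightarrow\; p_e \approx 0.62$$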

Fig. 1

Flow chart of study and flow diagram of classification algorithms for deep surgical site infection. (SSI: surgical site infection; ECDC: European Centre for Disease Prevention and Control. Admissions: length of stay ≥ 14 days, readmission to the original department, or in-hospital mortality within follow-up (FU) time (= 45 days after surgery). Reoperations: any reoperation by the original surgical specialty within FU time. Radiology: ≥ 1 order for a CT scan within FU time. Antibiotics: ≥ 3 consecutive days of antibiotics (ATC J01) within FU time, starting after day 1. Microbiology: ≥ 1 culture taken from relevant body sites within FU time, excluding cultures taken on any day prior to day 1. Original classification algorithm: figure originally published in van Rooden et al. [8], adapted and used with permission. Adapted classification algorithm: figure originally published in Verberk et al. [9], adapted and used with permission)
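To illustrate how such components could be derived from structured EHR data, a minimal Python sketch is given below. The data layout, variable names and the placeholder rule combining the component flags are assumptions made for illustration only; the published algorithms combine the components according to the decision flows shown in Fig. 1 [8, 9].

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

FOLLOW_UP_DAYS = 45  # follow-up window after surgery, as defined in Fig. 1


@dataclass
class SurgeryRecord:
    """Hypothetical per-surgery extract of structured EHR data (illustrative layout)."""
    surgery_date: date
    discharge_date: date
    readmitted: bool = False
    died_in_hospital: bool = False
    reoperation_dates: list = field(default_factory=list)
    ct_order_dates: list = field(default_factory=list)
    antibiotic_days: list = field(default_factory=list)  # calendar days with J01 antibiotics
    culture_dates: list = field(default_factory=list)    # cultures from relevant body sites


def component_flags(r: SurgeryRecord) -> dict:
    """Derive the five surveillance components for one surgery (illustration, simplified)."""
    def in_fu(d: date) -> bool:
        return r.surgery_date <= d <= r.surgery_date + timedelta(days=FOLLOW_UP_DAYS)

    # Admissions: length of stay >= 14 days, readmission, or in-hospital mortality
    admissions = ((r.discharge_date - r.surgery_date).days >= 14
                  or r.readmitted or r.died_in_hospital)
    reoperation = any(in_fu(d) for d in r.reoperation_dates)
    radiology = any(in_fu(d) for d in r.ct_order_dates)
    # Antibiotics: >= 3 consecutive calendar days of antibiotics; days up to day 1 ignored
    ab_days = sorted({d for d in r.antibiotic_days
                      if in_fu(d) and d > r.surgery_date + timedelta(days=1)})
    antibiotics = any(ab_days[i + 2] - ab_days[i] == timedelta(days=2)
                      for i in range(len(ab_days) - 2))
    # Microbiology: >= 1 relevant culture, excluding cultures taken before day 1
    microbiology = any(in_fu(d) and d >= r.surgery_date + timedelta(days=1)
                       for d in r.culture_dates)
    return {"admissions": admissions, "reoperation": reoperation, "radiology": radiology,
            "antibiotics": antibiotics, "microbiology": microbiology}


def needs_manual_review(flags: dict, use_microbiology: bool = True) -> bool:
    """Placeholder selection rule, not the published decision flow of Fig. 1:
    mark a surgery as high probability if any considered component is positive."""
    considered = dict(flags)
    if not use_microbiology:  # the adapted algorithm omits the microbiology component
        considered.pop("microbiology")
    return any(considered.values())
```

In a semi-automated workflow, only surgeries classified as high probability would be forwarded to the ICPs for manual record review; all other surgeries would automatically be counted as having no deep SSI.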

Data acquisition, management and analysis were performed using R statistical software (version 3.6.1) and Python (version 3.7), in accordance with current regulations concerning privacy and ethics. Algorithm performance was assessed in terms of sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). The 95% confidence intervals (CIs) for these estimates were calculated with the Wilson score method.
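For illustration, a minimal Python sketch of the Wilson score interval used for these proportions is shown below; the function name and interface are our own (statsmodels’ proportion_confint with method='wilson' offers an equivalent routine).

```python
from math import sqrt


def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score confidence interval for a binomial proportion (z = 1.96 for 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))
    return centre - half_width, centre + half_width


# Example: 34 of 41 deep SSIs detected, the original algorithm's sensitivity in the Results
print(wilson_ci(34, 41))  # approximately (0.687, 0.915), i.e. 68.7-91.5%
```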

Results

Within the validation cohort, the median age was 66 years (IQR 55–75), 48.9% (n = 110) of patients were female and the median body mass index was 25.6 (IQR 22.3–29.4). Most surgeries were primary (63.6%, n = 143), open procedures (77.3%, n = 174) and related to malignancy (76.9%, n = 173). The surgeries had a median duration of 316 min (IQR 206–427), and in 41.8% (n = 94) of surgeries a stoma was created. In 36.5% (n = 82) of surgeries the patient had an ASA class ≥ 3, and 24.0% (n = 54) of surgeries had a contaminated or dirty-infected wound classification. Within 30 days after surgery, 18.2% (n = 41) of patients developed a deep SSI and 2.2% (n = 5) died.

Both semi-automated algorithms were applied to the validation cohort. Ordered radiology (47.6%, n = 107) and antibiotic therapy for ≥ 3 consecutive days (41.3%, n = 93) were the most common components. In 38.7% (n = 87) of surgeries, a length of stay ≥ 14 days, readmission and/or mortality was present, and in 31.1% (n = 70) a microbiological culture was taken. A reoperation (17.8%, n = 40) was the least common component.

Of the 41 patients with a deep SSI, 34 were classified as high probability by the original algorithm (sensitivity 82.9%, 95% CI: 68.7–91.5) while 40 were classified as high probability by the adapted algorithm (sensitivity 97.6%, 95% CI: 87.4–99.6) (Table 1). The six deep SSI cases missed only by the original algorithm all scored 2–3 components but lacked the microbiology component (in other words, no cultures were obtained). The one deep SSI case missed by both algorithms had none of the algorithm components: this deep SSI was identified manually based on a clinical note describing pus from the rectal stump. The algorithms would reduce the workload of manual surveillance by 74% and 57%, respectively.

Table 1 Performance of the classification algorithms for deep surgical site infections after colorectal surgery according to ECDC definitions

Discussion

External validation of two existing semi-automated surveillance algorithms for deep SSI after colorectal surgery in EHR data of a Swedish academic hospital center showed that the adapted classification algorithm, which indicates high SSI probability based on (re)admission and discharge dates, mortality, reoperations, radiology orders and antibiotic prescriptions, outperformed the original classification algorithm, which additionally includes microbiology culture data. Only one mild case of deep SSI was missed; this case would be hard to detect using only structured data, as it was identified solely from a free-text clinical note.

The original algorithm, which uses microbiology culture data, was less useful in this specific clinical setting because microbiological culturing is not standard practice in suspected deep SSIs after colorectal surgery. This might, however, be different in other settings, and it highlights the importance of thorough validation before implementation and of pre-emptively investigating whether algorithms correspond with clinical practices [9]. It should be emphasised that periodically checking the algorithm against clinical practice after implementation also remains important [6, 7, 9].

Although the original semi-automated classification algorithm was developed on Spanish and Dutch data, and validated and adapted in the Netherlands, the sensitivity was also high in Sweden and comparable with previous results [8, 9]. These results confirm the potential for large-scale implementation of both algorithms, with the adapted algorithm, which omits microbiology data, in particular having demonstrated robustness within Europe.

Strengths of our study were the independent external validation and the extensive availability of EHR data, through which the performance of epidemiological surveillance using real-world, real-time data could be mimicked. Limitations were the use of data from only one center in Sweden, the focus of the algorithms on deep SSI only, and the absence of active post-discharge surveillance of SSI, which could result in missed SSIs.

In conclusion, the results from this study in Sweden, in conjunction with previous studies in the Netherlands and Spain, indicate that a classification algorithm based on (re)admission and discharge dates, mortality, reoperations, radiology orders and antibiotic prescriptions could be widely implemented for semi-automated surveillance of SSI after colorectal surgery.