Background

Surgical site infections (SSI) impose an important disease burden, being the third most common type of healthcare-associated infection (HAI) in the latest European Centre for Disease Prevention and Control (ECDC) Point Prevalence Study (PPS) (SSI: 16.6% EU/EEA and 17.5% Portugal) [1]. The latest report from the ECDC HAI-Net SSI program estimated 1.2% SSI in hip surgery and 0.6% SSI in knee surgery at EU/EEA level [2]. SSI after orthopaedic surgery is also an important cause of healthcare costs, mainly driven by the extra bed-days during readmission [3].

Surveillance and feedback are among the core components of infection prevention and control (IPC) programs [4, 5]. Nevertheless, in the 2017 ECDC PPS only 52% of the 125 Portuguese hospitals included were participating in a national SSI surveillance network [6]. HAI surveillance can be cumbersome and requires substantial resources, particularly when it is done fully manually, through chart review and ascertainment [7]. Results from the last ECDC PPS show that only 35.7% of EU/EEA hospitals use some type of surveillance automation [1].

Automated surveillance (AS) may reduce the manual workload of surveillance by using routine care data from electronic health records (EHR) to identify patients at high risk for HAI. Besides reducing the time and resources allocated to chart review and data collection, automated surveillance increases the reliability and reproducibility of data by decreasing inter-individual variability in case classification and surveillance bias, allows almost real-time analysis and reporting, and facilitates quality improvement [8,9,10,11,12,13]. In semi-automated surveillance, the involvement of clinicians in HAI confirmation may increase their acceptance of surveillance results when compared to fully automated surveillance [9,10,11,12,13].

AS can be particularly cost-effective in settings with low HAI incidence, such as SSI after orthopaedic surgery, since a high number of records must be reviewed to identify a small number of patients with SSI [13, 14]. Several authors have developed EHR-based algorithms for SSI surveillance in orthopaedic surgery, reporting sensitivity between 83.3% and 100% and manual workload reduction between 90.8% and 98% [14,15,16,17,18,19].

This study aimed to design and validate different semi-automated algorithms based on data from EHRs, estimate their sensitivity to identify patients at high risk for SSI after hip or knee arthroplasty and determine the workload reduction associated with their application in routine surveillance.

Methods

Study setting and design

Unidade Local de Saúde São João (ULSSJ) includes a tertiary hospital in the north of Portugal that performs around 600 knee and hip surgeries per year. Between 2015 and 2018, ULSSJ participated in the national project “STOP Infeção Hospitalar” (STOP hospital-acquired infection), which aimed to reduce HAI, including SSI [20]. During this period, SSI surveillance in hip and knee arthroplasty was fully manual and performed by orthopaedic surgeons. A paper form was completed for every surgical procedure, recording data about the procedure, the presence of SSI and its type (superficial, deep, organ/space), according to the 2012 ECDC HAI-Net SSI 1.02 protocol [21]. In a second phase, individual patient-level data were manually entered into a dedicated electronic case report form and stored in a central national database. Owing to the excessive workload associated with this manual process and the disruption of healthcare services imposed by the COVID-19 pandemic, SSI surveillance in ULSSJ has been on hold since 2018. This study used the retrospective data collected during the 2015–2018 manual surveillance period as the gold standard for algorithm validation.

Participants

All elective and urgent knee and hip arthroplasties performed between May 2015 and December 2017 were included. Surgical procedures were identified in the EHRs using the ICD-9 codes recommended by the ECDC HAI-Net SSI protocol: hip arthroplasty 00.70–00.73, 00.85–00.87, 81.51–81.53; knee arthroplasty 00.80–00.84, 81.54–81.55. Surgeries with these codes among the main procedures were included, regardless of whether they were primary or reintervention surgeries. Patients were followed up for 90 days after surgery.

Algorithms definition

Initially, six variables were selected based on an algorithm previously published and validated [14, 18, 19], in which some variable cut-offs were adjusted taking into consideration ULSSJ clinical practice. The variables included were:

  1. One or more positive microbiological results.

  2. Three or more microbiological samples collected (regardless of the results).

  3. Seven or more days of in-hospital antimicrobial therapy.

  4. Length of hospital stay ≥ 14 days during index surgery admission.

  5. Hospital readmission in the Orthopaedics department.

  6. Orthopaedic surgery reintervention.

In a first phase, these six variables were grouped into four categories, according to the algorithm previously published, where patients fulfilling criteria in at least three categories were considered at high risk of SSI (Fig. 1 – Algorithm A) [14, 18, 19]. In a second phase, adapted algorithms were tested, based on discrepancy analysis (Fig. 1 – Algorithms B, C and D). In a third phase, different combinations of seven variables (the six above plus emergency room attendance) were tested, where the presence of any of them in the 90 days after the surgical procedure was considered high risk for SSI.
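The first-phase rule (high risk when criteria are met in at least three of four categories) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the grouping of the six variables into categories shown here is hypothetical (the actual grouping is the one in Fig. 1), and all field and function names are invented for the example.

```python
from typing import Dict, List

# Hypothetical grouping of the six variables into four categories.
# Assumption: the real grouping is the one drawn in Fig. 1 (Algorithm A).
CATEGORIES: Dict[str, List[str]] = {
    "microbiology": ["positive_culture", "three_or_more_samples"],
    "antimicrobials": ["atb_7_days_or_more"],
    "admission": ["los_14_days_or_more", "ortho_readmission"],
    "reintervention": ["ortho_reintervention"],
}


def categories_met(flags: Dict[str, bool]) -> int:
    """Count the categories in which at least one variable is positive."""
    return sum(
        any(flags.get(name, False) for name in names)
        for names in CATEGORIES.values()
    )


def is_high_risk(flags: Dict[str, bool], min_categories: int = 3) -> bool:
    """Flag a surgery for manual review when enough categories are met."""
    return categories_met(flags) >= min_categories


# Hypothetical surgery: criteria met in three different categories.
example = {
    "positive_culture": True,
    "atb_7_days_or_more": True,
    "ortho_reintervention": True,
}
print(is_high_risk(example))
```

Keeping the grouping in a single mapping makes it straightforward to test adapted variants, such as a different threshold via `min_categories`.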

Fig. 1 Algorithms tested in the first and second phases (A, B, C and D). Acronyms: ATB – antimicrobial therapy; LOS – length of hospital stay; SSI – surgical site infection. Algorithm A was based on an algorithm previously published [14, 18, 19]

Data extraction from EHR and data linkage

Data for sample characterization and algorithm validation were extracted from EHRs by the Data Intelligence Service. Variables included were: demographic data (date of birth, sex), hospital admission (date of admission, department responsible for admission), surgical intervention (ICD-9 code of orthopaedic intervention, date of intervention, urgent intervention, orthopaedic reintervention), inpatient antibiotic therapy (duration), microbiological results (positive bacteriological tests and number of requests), admission to emergency department, Charlson comorbidity index and American Society of Anesthesiologists (ASA) physical status classification system.

Microbiological data were limited to samples typically collected in orthopaedic SSI, namely bacteriologic or mycologic samples from blood, joint fluid, and exudate. Antimicrobial therapy included in-hospital antibiotics and antifungals, oral or intravenous. When counting the duration of vancomycin and amikacin therapy, intermittent doses less than 48 h apart were considered consecutive, to avoid excluding patients with dose adjustments for renal function. Antibiotics for surgical prophylaxis were not considered.
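The 48 h rule for intermittent vancomycin and amikacin dosing can be illustrated with a short sketch. It assumes dose administration timestamps are available and that a course is measured in calendar days from its first to its last dose, inclusive; the study does not specify this detail, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List

MAX_GAP = timedelta(hours=48)  # doses less than 48 h apart form one course


def longest_course_days(dose_times: List[datetime]) -> int:
    """Length in calendar days (inclusive) of the longest consecutive course.

    Assumption: duration runs from the first to the last dose date of a course.
    """
    if not dose_times:
        return 0
    times = sorted(dose_times)
    best = 0
    start = prev = times[0]
    for t in times[1:]:
        if t - prev >= MAX_GAP:  # gap of 48 h or more closes the course
            best = max(best, (prev.date() - start.date()).days + 1)
            start = t
        prev = t
    return max(best, (prev.date() - start.date()).days + 1)


# Hypothetical renally adjusted schedule: one dose every 36 h.
doses = [datetime(2016, 3, 1, 8) + timedelta(hours=36 * i) for i in range(5)]
print(longest_course_days(doses) >= 7)  # does the course meet the ATB >= 7 days criterion?
```

Without the gap tolerance, a schedule like this one would be split into single-day courses and the patient would miss the ≥ 7 days criterion.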

From the manual surveillance database, the following variables were used: demographic data (date of birth, sex), hospital admission (date of admission), surgical intervention (ICD-9 code of orthopaedic intervention, date of intervention, urgent intervention), ASA physical status classification system and SSI diagnosis (presence and type of infection). Surgical procedures from EHR data extraction and the manual surveillance databases were linked using the patient’s unique hospital ID and date of surgery.
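A minimal sketch of this linkage step, assuming each record is a dictionary with hypothetical `patient_id`, `surgery_date` and `ssi` fields (the actual database schemas are not described here):

```python
from datetime import date
from typing import Dict, List, Tuple

Key = Tuple[str, date]  # (unique hospital ID, date of surgery)


def link_surgeries(ehr: List[dict], manual: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Link EHR surgeries to manual-surveillance records on the composite key."""
    manual_by_key: Dict[Key, dict] = {
        (m["patient_id"], m["surgery_date"]): m for m in manual
    }
    linked, ehr_only = [], []
    for rec in ehr:
        match = manual_by_key.get((rec["patient_id"], rec["surgery_date"]))
        if match is None:
            ehr_only.append(rec)  # surgery absent from manual surveillance
        else:
            linked.append({**rec, "ssi": match["ssi"]})  # attach gold-standard label
    return linked, ehr_only


# Hypothetical records for illustration only.
ehr_records = [
    {"patient_id": "A1", "surgery_date": date(2016, 5, 2)},
    {"patient_id": "B2", "surgery_date": date(2016, 6, 9)},
]
manual_records = [{"patient_id": "A1", "surgery_date": date(2016, 5, 2), "ssi": False}]
linked, ehr_only = link_surgeries(ehr_records, manual_records)
print(len(linked), len(ehr_only))
```

Returning the unmatched EHR surgeries separately is what allows them to be compared with the linked ones when assessing selection bias.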

Analysis

Frequencies and proportions were used to describe the sample characteristics, including missing data. Median and interquartile range were calculated to describe non-normally distributed data.

After dataset linkage, surgeries from the EHR data extraction that were not included in the manual surveillance database were compared with surgeries included in manual surveillance, to characterize the accuracy of the EHR data extraction procedure in identifying target surgeries and to evaluate the risk of selection bias in the manual surveillance.

The primary endpoint was all types of SSI (superficial, deep and organ/space), while the secondary endpoint restricted the analysis to deep or organ/space infections. Sensitivity, specificity, positive predictive value, and negative predictive value were estimated for each possible algorithm, using the SSI diagnosis from the manual surveillance database as the gold standard. A 95% confidence interval was calculated for each of these measures. Stata version 16.1 was used for the data analysis.
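The four measures can be computed from a 2×2 table of algorithm result versus gold standard, as in the sketch below. The confidence interval method is not stated in the text, so the Wilson score interval used here is an assumption, and the counts in the usage example are hypothetical.

```python
from math import sqrt
from typing import Dict, Tuple

Interval = Tuple[float, float]


def wilson_ci(successes: int, n: int, z: float = 1.96) -> Interval:
    """Wilson score interval for a proportion (assumed method, 95% by default)."""
    if n == 0:
        return (float("nan"), float("nan"))
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (centre - half, centre + half)


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Tuple[float, Interval]]:
    """Point estimate and interval for sensitivity, specificity, PPV and NPV."""
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (k / n if n else float("nan"), wilson_ci(k, n))
        for name, (k, n) in pairs.items()
    }


# Hypothetical counts, not taken from the study.
metrics = diagnostic_metrics(tp=18, fp=40, fn=2, tn=140)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.3f} ({low:.3f} to {high:.3f})")
```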

Workload reduction, defined as the proportion of medical records that no longer require manual review, was calculated with the following equation:

$$\text{Workload reduction} = 1 - \frac{n_{\text{high-risk surgeries}}}{n_{\text{surgeries performed}}}$$
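As a small worked illustration of this equation, the sketch below uses hypothetical counts and a hypothetical function name:

```python
def workload_reduction(n_high_risk: int, n_surgeries: int) -> float:
    """Share of surgeries that the algorithm spares from manual review."""
    if n_surgeries <= 0:
        raise ValueError("n_surgeries must be positive")
    return 1 - n_high_risk / n_surgeries


# Hypothetical example: 250 of 1000 surgeries flagged as high risk.
print(f"{workload_reduction(250, 1000):.1%}")
```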

Algorithm A false negatives and false positives were reviewed by an infectious diseases physician to identify possible algorithm gaps and misclassification in the manual surveillance.

Results

Sample characteristics

In the total sample, 67.5% (n = 1101) of patients were women, and median age was 69 years (IQR 62 to 77). Most surgeries were elective (92.5%; n = 1508) and just over half were hip arthroplasties (52.8%; n = 861). Global SSI incidence was 3.8% (n = 62), of which 64.5% were deep or organ/space infections (Table 1).

Table 1 Study sample characteristics

During the 90-day follow-up after surgery, 8.2% (n = 134) had at least one positive microbiological result, 8.3% (n = 136) had ≥ 3 microbiological requests, 9.3% (n = 152) received antimicrobial therapy (ATB) ≥ 7 days, 16.4% (n = 267) had length of hospital stay (LOS) ≥ 14 days, 14.1% (n = 230) were readmitted to the Orthopaedics department, 9% (n = 146) underwent orthopaedic reintervention and 17.7% (n = 288) attended the emergency department (Table 1).

Primary endpoint of all SSI types (superficial, deep and organ/space)

When tested individually, the variables with the highest sensitivity to detect all types of SSI were positive microbiological culture and orthopaedic reintervention (64.5% and 59.7%, respectively). At least 3 microbiological requests and ATB ≥ 7 days presented similar sensitivity (58.1%), while LOS ≥ 14 days was the criterion with the lowest sensitivity (29.0%) (Table 2).

Table 2 Sensitivity (Sens), specificity (Spec), positive predictive value (PPV) and negative predictive value (NPV) of each variable included in the algorithms

Of the algorithms described in Fig. 1, the one with the best performance in terms of sensitivity was algorithm C, with 85.5% sensitivity, 79.4% specificity, 99.3% NPV and 76.9% workload reduction. Algorithm A had a high workload reduction (91.7%), but low sensitivity to detect all types of infections (62.9%). Algorithm D presented the lowest sensitivity (32.3%) and the highest workload reduction (97.1%) (Table 3).

Table 3 Sensitivity (Sens), specificity (Spec), positive predictive value (PPV), negative predictive value (NPV) and workload reduction for algorithms A, B, C and D

When testing 136 different algorithms ranging from two to seven variables, in which at least one criterion needed to be present during follow-up, the highest sensitivity reached was 90.3%, present in 24 different algorithms. Workload reduction of these algorithms ranged from 59.7 to 67.7%, specificity from 61.6 to 70%, and negative predictive value from 99.4 to 99.5%. In all these options, emergency department visit was present in the model, to increase the sensitivity to detect superficial infections (Table 4).
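The search over "any-of" combinations can be sketched as below. The enumeration scheme (every subset of two or more variables), the ranking order and all field names are assumptions made for illustration; the sketch is not meant to reproduce the 136 combinations tested, and the toy data are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

VARIABLES = [
    "positive_culture", "three_or_more_requests", "atb_7_days_or_more",
    "los_14_days_or_more", "ortho_readmission", "ortho_reintervention",
    "emergency_visit",
]


def evaluate(combo: Sequence[str], surgeries: List[dict]) -> Tuple[float, float]:
    """Return (sensitivity, workload reduction) for an any-of rule."""
    flagged = [any(s[v] for v in combo) for s in surgeries]
    tp = sum(f and s["ssi"] for f, s in zip(flagged, surgeries))
    n_ssi = sum(s["ssi"] for s in surgeries)
    sensitivity = tp / n_ssi if n_ssi else float("nan")
    return sensitivity, 1 - sum(flagged) / len(surgeries)


def rank_combinations(surgeries: List[dict], min_size: int = 2) -> List[tuple]:
    """Evaluate every combination and rank by sensitivity, then workload reduction."""
    results = []
    for size in range(min_size, len(VARIABLES) + 1):
        for combo in combinations(VARIABLES, size):
            sens, wr = evaluate(combo, surgeries)
            results.append((combo, sens, wr))
    # Highest sensitivity first, then highest workload reduction, then fewest variables.
    return sorted(results, key=lambda r: (-r[1], -r[2], len(r[0])))


def make(ssi: bool, *positive: str) -> dict:
    """Build a hypothetical surgery record with the given positive variables."""
    return {**{v: v in positive for v in VARIABLES}, "ssi": ssi}


toy = [
    make(True, "positive_culture", "ortho_reintervention"),
    make(True, "emergency_visit"),
    make(False, "los_14_days_or_more"),
    make(False),
    make(False),
]
best_combo, best_sens, best_wr = rank_combinations(toy)[0]
print(best_combo, best_sens, best_wr)
```

Ranking on several criteria at once mirrors the trade-off described in the text between sensitivity, workload reduction and the number of variables needed.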

Table 4 Workload reduction, specificity (Spec), positive predictive value (PPV) and negative predictive value (NPV) for 24 algorithms with 90.3% sensitivity for the primary endpoint (all SSI types: superficial, deep, organ/space)

The algorithm combining ≥ 3 microbiological requests, LOS ≥ 14 days and emergency department visit (hereafter referred to as algorithm E) had the highest sensitivity and the best balance between workload reduction and feasibility of implementation to detect all types of SSI (sensitivity 90.3%, workload reduction 67.7%, three variables in the model) (Table 4). Even though algorithm E presented a lower workload reduction than algorithms A, B, C and D, it was the best option in terms of sensitivity and number of variables (Fig. 2).

Fig. 2 Comparison of sensitivity and workload reduction between algorithms A, B, C, D and E. Note: each bubble corresponds to a different algorithm and the size of the bubble reflects the number of variables

Secondary endpoint of deep and organ/space SSI

In the subgroup analysis of deep and organ/space infections, there was an overall increase in the sensitivity of all the variables when tested individually, except for emergency department attendance, whose sensitivity changed from 54.8 to 50.0%. Individually, positive culture, at least 3 microbiological requests, ATB ≥ 7 days and orthopaedic reintervention presented a sensitivity above 80%. LOS ≥ 14 days was the variable with the lowest sensitivity (37.5%) (Table 2).

Algorithm B detected deep and organ/space infections with 100% sensitivity, 70% specificity, 100% NPV and 68.8% workload reduction (Table 3). Algorithms A and C also presented high sensitivity and workload reduction (algorithm A: sensitivity 95% and workload reduction 91.7%; algorithm C: sensitivity 97.5% and workload reduction 76.9%) (Table 3). All algorithms in Table 4 had 100% sensitivity to detect deep and organ/space infections.

Selection bias assessment

During the study period, 1631 knee and hip surgeries were performed; however, 458 (28%) of them were not included in the manual surveillance database, probably due to selection bias. Patients excluded from the 2015–2017 manual surveillance were older (71 vs. 69 years old) and more frequently underwent hip surgery (63.1% vs. 48.8%) than patients included. Concerning the algorithm variables, excluded patients more frequently had positive microbiological results (17.2% vs. 4.7%), at least 3 microbiological requests (17.0% vs. 4.9%), ATB ≥ 7 days (19.0% vs. 5.5%), LOS ≥ 14 days (31.9% vs. 10.3%) and orthopaedic reintervention (16.8% vs. 5.9%) than patients included in the manual surveillance database (Supplementary material – Table 5).

Based on these findings, infectious diseases physicians reviewed the 458 surgeries that were missing from the manual surveillance database and classified the presence of SSI according to the ECDC criteria. The incidence of SSI was higher in the surgeries excluded from manual surveillance (5.7% vs. 3.1%), of which 46.1% were organ/space infections (2.8% of all SSI types identified in manual surveillance) (Supplementary material – Table 5).

False negatives and positives review

During the review of algorithm A false negatives (cases not identified as high-risk by the algorithm, but with SSI in manual surveillance), four SSI were reclassified as no infection (three superficial and one organ/space), and two SSI were reclassified as organ/space infections (one superficial and one deep infection) (Supplementary material – Table 6). The patient with organ/space infection who was reclassified as no infection did have an SSI, but it occurred a couple of days after the end of the 90-day follow-up period. Additionally, 61% of the algorithm A false negatives had at least one emergency department visit in the 90 days after surgery; as a consequence, this new component was added to the analysis.

The review of algorithm A false positives (cases identified as high-risk by the algorithm, but without infection in the manual surveillance) showed that six patients considered free of infection in manual surveillance did have an SSI (five deep and one organ/space infection) (Supplementary material – Table 6).

When re-testing the algorithms after false negatives and positives review, sensitivity slightly increased, particularly for detecting all SSI types (Supplementary material – Table 7).

Discussion

In our study, we tested four variations of an algorithm previously published and 136 new algorithm combinations of seven criteria to identify all SSI types after hip and knee arthroplasty. In this analysis, the highest sensitivity reached by the algorithms was 90.3%, attained by 24 algorithms in which at least one criterion needed to be fulfilled for a surgery to be considered at high risk of SSI. The workload reduction estimated with the application of any of these 24 algorithms ranged from 59.7 to 67.7%, and the number of variables to be analysed ranged from three to six. Aiming for the highest sensitivity while balancing workload reduction and feasibility of implementation, the most appealing algorithm seems to be algorithm E, which includes at least three microbiological requests, LOS ≥ 14 days, and emergency department attendance (sensitivity 90.3%; workload reduction 67.7%; three variables). Based on the data from our hospital, which performs around 600 hip and knee surgeries per year, and assuming it takes around 15 min to review the medical records from one suspected SSI, we estimate that this algorithm can save 102 h of IPC staff time per year.
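One way to arrive at this estimate, assuming the time saved corresponds to the records that no longer need manual review and that each of them would otherwise take about 15 min, is:

$$600 \times 0.677 \times \frac{15}{60}\ \text{h} = 101.55\ \text{h} \approx 102\ \text{h per year}$$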

Previous publications reported high sensitivity to detect deep and organ/space SSI in hip and knee arthroplasty, excluding superficial SSI. Sips et al. tested the algorithm in a Dutch hospital with data from 2004 to 2012 and reported 100% sensitivity with an estimated workload reduction of 95.4% [18]. Verberk et al. expanded its application to four Dutch hospitals with data from 2012 to 2018, reporting a sensitivity between 93.6 and 100% to detect deep SSI, with an expected workload reduction between 98% and 98.5% [14]. Van Rooden et al. also applied this algorithm in two European hospitals with data from 2017 to 2019, reporting a sensitivity that ranged from 83.3 to 100% and a workload reduction between 96.9 and 97.5% [19]. In our cohort, algorithm A, which is the algorithm published in these studies, yielded similar sensitivity with a slightly lower workload reduction for the secondary endpoint of deep and organ/space infections (sensitivity 95.0%; workload reduction 91.7%). The differences in workload reduction may be due to the use of broader criteria for microbiological requests and antimicrobial therapy in our study. However, in our data, the performance of algorithm A to detect all types of SSI, including superficial SSI, was insufficient (sensitivity 62.9%).

Perdiz et al. applied a slightly different algorithm in a Brazilian centre to identify patients with all types of SSI after hip and knee surgery, using data from 2009 to 2012 [17]. The combinations with the best performance (100% sensitivity) were: ATB ≥ 7 days or hospital readmission; and hospital readmission within 1 year after the surgical procedure [17]. ICD-10 diagnosis codes suggestive of SSI yielded a sensitivity of 87.5% and 100% in hip and knee surgery, respectively [17]. In our study we also tested the combination of ATB ≥ 7 days or hospital readmission, but its sensitivity to detect all types of SSI was lower (72.6%; data not presented). These differences may be because we restricted hospital readmission to the Orthopaedics department within 90 days after surgery. In fact, the single criterion of Orthopaedics readmission presented a low sensitivity (53.2%) in our study. Bolon et al. used the same variables as Perdiz et al., applied to five medical centres in the USA between 2002 and 2005, to identify all SSI types after hip and knee arthroplasty [15]. A combination of the three variables (ATB ≥ 7 days, SSI diagnostic code or readmission) was the algorithm with the overall best performance in both surgeries (hip surgery 93%, knee surgery 86%) [15].

Inacio et al. also tested the applicability of ICD-9 diagnostic codes extracted from EHR (inpatient and outpatient settings) to identify patients with total joint replacement SSI [16]. This study used data from 2006 to 2008 from a large health maintenance organization in California and reported 96.7% sensitivity to detect any type of SSI based on ICD-9 diagnostic codes, with an expected workload reduction of 90.8% [16]. In our study, we decided not to use ICD-10 diagnosis codes because they are usually reviewed by the hospital coding team months after patient discharge, so it would be impractical to apply an algorithm including this criterion in real-life practice.

We have also shown that semi-automated surveillance can be more accurate and less susceptible to selection bias than manual surveillance. When linking the data extracted from EHR with the retrospective data from manual surveillance, we found that 28% of the surgical procedures had been excluded from manual surveillance and that this subgroup of patients presented a higher incidence of SSI, particularly organ/space infections. The false positives review also identified six patients with deep and organ/space SSI who were considered free of infection in the manual surveillance, highlighting the advantages of using semi-automated algorithms to decrease inter-individual variability.

Our study has the advantage of providing different algorithm options with high sensitivity to detect all types of SSI and with an acceptable workload reduction. This can improve the flexibility and feasibility of real-life semi-automated surveillance implementation, allowing the use of algorithms tailored to clinical practice and adapted to EHR data availability, and their adjustment to future changes in clinical practice. Additionally, we were able to identify a crucial variable to increase sensitivity to superficial SSI in our hospital, which is an important feature in algorithms to detect SSI after hip and knee surgeries, where superficial infections are usually treated as deep SSI [22,23,24]. It would be interesting to understand if emergency department attendance behaves similarly in other hospitals. We do not expect superficial SSI incidence to be underestimated in our retrospective data due to loss to follow-up, because all patients who underwent hip and knee arthroplasty had post-surgery consultations, where the presence of SSI was registered by orthopaedic surgeons.

The main limitation of our study is that the database used for validation dates from 8 years ago. There may be differences in clinical practice, for example an earlier transition from intravenous to oral antibiotic therapy, which may decrease the LOS and in-hospital ATB days and might compromise the algorithm’s performance nowadays [25]. The pandemic may have reduced LOS in all hospital departments to increase available beds for COVID-19 patients. All algorithms previously published used pre-pandemic data for validation, so it would be interesting to understand if these algorithms maintain their performance in the post-pandemic era. Additionally, these algorithms were tested with data from a single hospital; their performance could be different when applied to data from other hospitals.

Conclusion

Various algorithm combinations with high sensitivity were identified that can be used in real-life implementation of semi-automated surveillance. Semi-automated surveillance should be designed and validated according to hospital practices. Our study has demonstrated that emergency department attendance can be an important variable to consider in algorithms to detect all types of SSI after hip and knee surgery.