Introduction

Recent decades have witnessed an explosion in the amount of clinical data collected during healthcare provision [1]. The increasing availability of digital health data contained in clinical information systems and databases has facilitated marked improvements in the efficiency of care, in the ability to track and measure quality and outcomes, and in the conduct of research. However, this plethora of data has often exceeded our ability to comprehensively integrate it and optimise its application into improved clinical care. Recent advancements in machine learning and artificial intelligence are increasing opportunities for enhanced use of these data for research, administrative, and patient care purposes [2, 3].

Severe infections, including those associated with sepsis and septic shock, are responsible for a major burden of illness in patients admitted to intensive care units (ICUs) [4, 5]. Infections may be invasive (i.e. an organism is isolated from a normally sterile body site such as blood, cerebrospinal fluid, or aspirates from deep tissue) and may affect any body system. These may manifest with features of severity that may include direct complications (e.g. brain injury due to meningitis), associated secondary organ failures (e.g. renal failure due to severe pneumonia with septic shock), and high case fatality. Septic shock is a syndrome that arises due to infection and is manifested by refractory hypotension and elevated serum lactate [6]. Severe infection therefore represents an entity that may encompass a range of definitions, treatment locations, and presence of septic shock.

The objective of this report was to review contemporary studies investigating the use of electronic data aimed at improving the understanding of the determinants, management, and outcomes of severe infections, with a focus on those conducted in ICUs.

Literature Search

We searched for novel studies that investigated the use of digital information to inform the epidemiology, management, and outcome of severe infections in adults. Ad hoc searches were performed in PubMed using terms including “infection”, “sepsis”, “septic shock”, “intensive care unit”, and “electronic”, with supplementary searches of relevant bibliographies. Searches were limited to studies in humans that were published from 2018 through to June 2023, had abstracts, and were reported in the English language.

Electronic Surveillance

Tracking the occurrence of an infectious disease or syndrome is important for quality improvement, research, and administrative purposes and to inform measures to reduce the burden of disease. The concept of electronic surveillance, where digital information sources are connected and manipulated to support efforts to track and manage infectious diseases, has evolved significantly over the past two decades [7]. While traditionally surveillance has been a manual process of individual case review, exploiting digital data has facilitated surveillance of a wider range of illnesses across larger populations. It is noteworthy that surveillance of severe infections is challenged by the availability of adequate definitions and practical means of their application. On one hand, surveillance for invasive infections may be defined by a positive culture from a normally sterile body site as identified by microbiology laboratories [8, 9]. On the other hand, surveillance for clinical syndromes such as sepsis is more complex, does not have a universal “gold standard”, and requires integration of clinical, microbiology, diagnostic, and treatment information [10].

Streefkerk et al. recently conducted a systematic review (inclusive to January 2018) of electronic surveillance systems in hospitals and identified 78 studies [11]. Although they were inclusive of hospital populations at large, more than one-half of the studies either included or focussed on ICU applications. Overall, they found that sensitivity of electronic surveillance systems was generally high (> 80%); however, specificity was highly variable. Studies were generally limited by lack of demonstration of efficiency, and few were comprehensive systems that looked at a range of infections and settings.

Jones and colleagues reported on a hospital surveillance program that examined non-ventilator hospital-acquired pneumonia (NV-HAP) by using electronic data from more than 6 million hospitalizations in 284 institutions in the USA [12]. A series of objective definitions using oxygenation, temperature, white blood cell count, chest imaging, and antimicrobial use were applied to ascertain the presence of NV-HAP patients from within electronic health records. They found that 32,797 cases had NV-HAP for an incidence of 0.54 per 100 admissions and 0.96 per 1000 patient days. Clinical audit of 250 cases confirmed NV-HAP by at least one reviewer in 81% of cases with low to moderate interobserver variability observed. Although only one-quarter of cases were related to ICU admissions, this study represents a major work to validate electronic surveillance data in a large population.
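The two incidence measures quoted above differ only in their denominator (admissions versus patient days). As a minimal sketch of that arithmetic, the following uses toy counts that are assumptions chosen for illustration and are not the denominators reported in [12]; the function name `incidence_per` is likewise hypothetical.

```python
def incidence_per(cases: int, denominator: float, per: int) -> float:
    """Return the number of cases per `per` units of the denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return cases / denominator * per


# Toy figures (assumed for illustration only; not taken from the study).
toy_cases = 54
toy_admissions = 10_000
toy_patient_days = 56_250

# Cumulative incidence: cases per 100 admissions.
per_100_admissions = incidence_per(toy_cases, toy_admissions, 100)

# Incidence density: cases per 1000 patient days at risk.
per_1000_patient_days = incidence_per(toy_cases, toy_patient_days, 1000)

print(round(per_100_admissions, 2), round(per_1000_patient_days, 2))
```

Reporting both measures is useful because the patient-day denominator accounts for differences in length of stay between institutions, whereas the per-admission figure does not.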

In another recent study, Schaumburg et al. reported on use of electronic surveillance for ICU-acquired primary bacteremia and compared results with traditional manual chart review [13]. They included patients admitted to five ICUs in Germany and found that approximately 75% of antibiotic-resistant infections and 85% of primary bacteremia/sepsis cases were classified concordantly. Although they examined agreement between the two approaches, they did not compare each to a composite gold standard such that a superior approach could be determined.

Gerver et al. reported on BSI surveillance results from 194 ICUs in England [14••]. They applied standardized definitions to data obtained by linkage across multiple datasets and found 433 ICU-associated BSIs that represented 4.9 per 1000 occupied overnight bed-days and a central venous catheter (CVC)-associated rate of 2.3 per 1000 ICU-CVC days among adults. This study emphasized the value of objectivity of electronic surveillance data to facilitate valid inter-site comparisons.

While many electronic surveillance systems emphasize the use of microbiology laboratory and other diagnostic results for detecting cases, a wealth of clinical information may be obtained in the text-based clinical records maintained in clinical information systems. Yan and colleagues conducted a systematic review (through to December 2020) of studies examining the use of natural language processing in electronic health records to identify cases of sepsis [15]. These authors included nine articles in their synthesis of which six were ICU-focussed. Although these studies were heterogeneous and precluded meta-analysis, overall, they showed that combining structured data with clinical text improved the identification and early detection of sepsis in patients admitted to ICUs [2, 16,17,18]. Additionally, Vermassen et al. used natural language processing of electronic health record data to identify cases of septic shock among 8911 admissions to a Belgian ICU [19]. They compared results from a prospectively collected infection database, hospital administrative coding, and two different automated search strategies based on written sections of the electronic health record. They observed that the automated combined strategy performed well overall with sensitivity and specificity of 73% and 93%, with higher sensitivity than hospital coding alone (56%).
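Sensitivity and specificity, as reported for the surveillance strategies above, are obtained by comparing each automated classification with a reference standard. The sketch below shows that calculation under stated assumptions: the function name and the ten paired labels are hypothetical and do not reproduce data from any cited study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    reference: Sequence[bool], flagged: Sequence[bool]
) -> Tuple[float, float]:
    """Compare automated flags against a reference standard, case by case."""
    if len(reference) != len(flagged):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(reference, flagged))
    tp = sum(1 for r, f in pairs if r and f)          # true positives
    fn = sum(1 for r, f in pairs if r and not f)      # missed cases
    tn = sum(1 for r, f in pairs if not r and not f)  # true negatives
    fp = sum(1 for r, f in pairs if not r and f)      # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (assumed): 4 reference cases and 6 non-cases.
reference = [True, True, True, True, False, False, False, False, False, False]
flagged = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(reference, flagged)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Where no universal gold standard exists, as noted for sepsis, the choice of reference standard determines what these figures mean, so the reference should always be reported alongside them.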

Electronic Data to Inform the Understanding of Determinants and Outcome of Severe Infections

Gaining knowledge of the epidemiology, including the occurrence and risk factors for developing and dying from severe infections, is important for health service planning and to develop new means to prevent or reduce the burden of severe infections. While clinical trials are considered the gold standard for evidence for interventions, observational studies are needed for defining the natural history of disease, investigating rare conditions, and providing insight into treatments where clinical trials are either not feasible or unethical. Because of the efficiencies realized, the use of electronic data to support observational studies allows much larger, more comprehensive investigations with less resource input as compared to traditional case-by-case review studies.

Large multicentred studies and critical care registries have played major roles in developing new knowledge related to severe infections across the globe [20,21,22,23,24,25]. However, in most cases, established ICU registries employ at least a component of manually entered data. The increasing use of electronic health records and application of machine learning techniques has further created opportunities to study the epidemiology of severe infections in large ICU cohorts including at the population-based level. Although recent topic areas have been influenced by the pandemic, there is a wide range of contemporary studies investigating severe infections.

COVID-19

Three large studies have been reported from the USA that have examined comorbidities and outcomes related to COVID-19. Fiore and colleagues utilized electronic health record data from more than 1 million COVID-19 patients across 21 US health systems [26••]. The analysis cohort comprised data from 104,590 adult hospitalized COVID-19 patients, in whom the authors observed decreasing rates of ICU admission and case fatality during surveillance and identified male sex, public insurance, obesity, and age as important determinants of adverse outcome. In another study from the USA, Rivera et al. examined statin usage as a determinant of COVID-19 severity among patients aged 40 years and older in Chicago [27]. Electronic health records were used to study a cohort of 15,524 individuals, and after controlling for confounding variables, statin use was associated with a modest reduction in ICU admission but no difference in case fatality. Kim et al. used a large dataset derived from electronic health records from more than 700 hospitals and 7000 clinics in the USA [28]. They found that among 27,639 patients with COVID-19, 18,460 had at least one past or current cancer diagnosis; cancer was associated with higher risk for hospitalization and death but not ICU admission as compared to those without this comorbidity.

Descamps et al. used linked electronic data from several databases in France to investigate the relationship between pre-existing psychiatric disorders and risk for development of COVID-19 in a cohort of 97,302 patients [29]. Although the risk for ICU admission was lower with psychiatric disease than for propensity-matched controls, the overall case fatality was higher.

Several large studies investigating COVID-19 epidemiology in England have recently been reported. Gao et al. used linked hospital, registry, and vital statistics data to investigate the role of obesity on the determinants and outcome of patients with severe COVID-19 in England [30]. The base cohort consisted of 7 million subjects. The authors found a linear increase in admission to an ICU across the whole body mass index (BMI) range and that a BMI > 23 kg/m² was associated with a linear increase in risk for admission to hospital and death. Aveyard and colleagues examined underlying lung disease as a risk factor for hospitalization and dying from COVID-19 [31]. They included more than 8 million subjects from 1205 general practices in England, with linkages to several other databases. Overall, 14,479 (0.2%) were admitted to hospital with COVID-19, 1542 (< 0.1%) were admitted to ICU, and 5956 (0.1%) died. They observed small but significant increased risks for hospitalization, ICU admission, and severe COVID-19 associated with several chronic lung diseases. In another English study inclusive of more than 17 million patients, some minority ethnic populations were found to have excess risks of SARS-CoV-2 positivity, ICU admission, and adverse outcome [32].

Other notable studies that have investigated COVID-19 epidemiology included a national electronic health record study among renal dialysis patients in Qatar that reported a case fatality rate of 15% overall and 50% for those admitted to ICU [33]; an electronic health record-based study of 10,454 patients from Galicia, Spain, that reported 284 (2.7%) were admitted to ICU [34]; an electronic health record study that identified comorbidities as significant risk factors for death among COVID-19 patients in Golestan province, Iran [35]; and a study that examined free text within 1.5 million electronic health records using natural language processing and machine learning to identify significant differences in demographic, symptomatology, investigation, hospitalization, and ICU admission between males and females among 4780 patients with COVID-19 [36].

Sepsis

Alrawashdeh et al. examined the determinants of outcome of 329,052 patients with community-onset sepsis admitted to 373 US hospitals and specifically investigated the effect of comorbidities on outcome [37]. They obtained electronic data from three independent datasets and applied Centers for Disease Control and Prevention Adult Sepsis Event surveillance criteria to identify cases. They found that while only a small proportion of cases (2.6%) had no comorbid medical illnesses, the case fatality rate was notably high (22.8%) in this group. Oud and Garza used statewide data from Texas, USA, to investigate the occurrence, determinants, and outcomes of sepsis among patients with multiple sclerosis (MS) admitted to ICU [38]. They identified nearly 20,000 ICU admissions among patients with MS of which 6244 had sepsis; presence of this syndrome was associated with a fourfold worsened outcome. Rhee et al. used detailed electronic health data from 136 hospitals in the USA during 2009–2015 to investigate hospital- versus community-onset sepsis [39]. Among nearly 100,000 sepsis admissions, they showed that ICU admission rates (61% vs. 44%) and case fatality (33% vs. 17%) were significantly higher with hospital- as compared to community-onset sepsis cases.

Automated Alerts and Sepsis Management

Electronic health records may be configured to screen for the presence or predictors of severe infection and provide prompts to clinicians to act. Warttig et al. conducted a systematic review of randomized controlled trials comparing automated sepsis-monitoring systems (i.e. automated alerts sent from system) to standard care (i.e. no automated alerts) among patients admitted to ICU with sepsis [40]. They screened articles through to September 2017 and only identified three studies for inclusion. These were deemed to be of low quality and had very limited reporting of specific details. Furthermore, it was not clear as to whether these were entirely independent or represented sub-studies of one larger one due to two being reported only in abstract form [41]. They concluded that there were inadequate data to address the question as to whether automated alerts improved outcome.

Zhang et al. conducted a more recent systematic review and included both interventional and observational studies [42•]. However, only six were ICU focussed, and of the ICU clinical trials, all were published prior to 2018 [41, 43, 44]. Furthermore, one of these was not an electronic information system alert per se, as it evaluated a wireless vital sign monitor [44]. Among the two clinical trials, Shimabukuro and colleagues found that use of a machine learning-based algorithm sepsis surveillance system reduced length of stay and improved outcome [43], whereas Hooper et al., while demonstrating feasibility, did not find a difference in outcome with the alerts [41].

Although there is a lack of contemporary randomized clinical trial data available on alerts, several non-concurrent (i.e. pre-post) related studies have been published. Jung et al. evaluated a bedside surveillance system integrated with the electronic medical record in a surgical ICU and found that as compared to pre-intervention historical controls, the use of the system reduced the time to antibiotic administration and was associated with shorter length of stay [45]. Burdick et al. [46] assessed the performance of a machine learning algorithm for severe sepsis prediction and detection using electronic health record data in nine diverse hospitals in the USA. A clinical outcome analysis was performed to evaluate the effect of the algorithm on in-hospital patient mortality, hospital length of stay, and 30-day readmissions. Following implementation, alerts were provided to clinicians. Pre-post analysis demonstrated decreased length of stay (4.8 vs. 3.3 days), readmission by 30 days (36.4% vs. 28.1%), and in-hospital case fatality (3.9% vs. 2.3%). Lipatov and colleagues evaluated an electronic health record-based sepsis alert and augmented clinical support in a pre-post implementation study of 1950 patients admitted to medical ICU in the USA [47]. Although a poor positive predictive value of 28% was found, a high negative predictive value of 97% was observed. However, there was no effect on outcomes [47].
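A low positive predictive value alongside a high negative predictive value is the expected pattern when an alert is applied to a population in which the target condition is uncommon. The following sketch applies Bayes' rule to illustrate this; the sensitivity, specificity, and prevalence values are assumptions chosen for illustration and are not the operating characteristics reported in [47].

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 < value < 1.0:
            raise ValueError("all inputs must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed, illustrative operating characteristics of a hypothetical alert.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.80, prevalence=0.10)
print(f"PPV={ppv:.3f}, NPV={npv:.3f}")
```

Because both values depend on prevalence, predictive values from one ICU population should not be assumed to transfer to a setting where sepsis is more or less common.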

Adams et al. conducted a cohort study among 590,746 patient encounters at five American hospitals and evaluated a sepsis alert system, the targeted real-time early warning system (TREWS). They found that among patients who had TREWS alerts (n = 6877), those whose alert was responded to by a provider within 3 h had a 3.3% reduction in in-hospital death as compared to those with later responses [48].

Roimi et al. developed machine learning algorithms to predict positivity of blood cultures in the diagnosis of ICU-acquired bloodstream infection using electronic health record data in hospitals in the USA and Israel [49]. Among a population of 7419 patients, 32% had at least one set of blood cultures drawn. The algorithm predicted positivity with a sensitivity of approximately 90% and specificity of 69–86%. While not tested in practice, this study demonstrated the use of machine learning techniques to potentially flag patients at higher risk for bloodstream infection and its complications.

Discussion

In this review, we identify and review three themes related to the exploitation of electronic data for surveillance and reporting, epidemiological analysis and understanding, and real-time monitoring and alerts related to severe infections. While the body of contemporary literature is limited, it is evident that major contributions to the literature have been made by studies harnessing electronic data related to severe infections and that we may reasonably expect these to increase if not become the standard in the coming years.

Traditionally, surveillance for infections has been a laborious process based on case-by-case review by trained individuals with manual entry of results into reporting databases [50]. As a result, surveillance is often limited by resources such that many programs may be limited to select types of infections (e.g. bloodstream or wound infection) and/or to certain settings (e.g. ICUs, surgical wards). Because electronic surveillance systems utilize existing information in electronic health records or clinical information systems, they can be broadly applied and are much more efficient. Indeed, many of the studies included in this review report on surveillance of hundreds of thousands or even millions of patient days of observation [12, 14, 39]. The vast resources required to perform such studies using traditional approaches would have precluded their conduct. In addition to efficiency, it is important to note that because electronic surveillance systems require application of objective, explicit definitions, they have the added benefit over traditional approaches of reducing bias and interobserver variability [51].

Reported rates of sepsis-related case fatality are decreasing across many jurisdictions globally [52]. While there is debate as to why this may be the case, the decreasing proportion of deaths among cases means that larger sample sizes are required to detect statistically significant differences in outcome among studies. Additionally, patients are becoming more complex with increasing comorbidities and other variables that may confound analyses. In response, very large sample sizes are required in many studies. The use of electronic data sources is practically requisite for efficient conduct of studies involving tens of thousands of patients or more.
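The link between falling case fatality and rising sample size can be made concrete with the standard normal-approximation formula for comparing two proportions. The sketch below is illustrative only: the function name, the 20% relative reduction, and the baseline case-fatality values are assumptions and are not drawn from the studies reviewed here.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(
    p_control: float, p_treated: float, alpha: float = 0.05, power: float = 0.80
) -> int:
    """Approximate patients per group needed to detect p_control vs p_treated."""
    if p_control == p_treated:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Same assumed 20% relative reduction at two hypothetical baselines.
n_high_baseline = n_per_group(0.30, 0.24)  # higher baseline case fatality
n_low_baseline = n_per_group(0.15, 0.12)   # lower baseline case fatality

print(n_high_baseline, n_low_baseline)
```

Under these assumptions, halving the baseline case fatality more than doubles the number of patients required per group, which illustrates why studies of this scale are difficult to conduct without electronic data sources.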

The SARS-CoV-2 pandemic has highlighted the importance of electronic data in response to a novel infectious disease threat. Indeed, the explosion of research publications that followed within weeks of pandemic onset was facilitated by use of digital data sources. Our review included several studies that provided insight into the epidemiology of COVID-19 and its associated complications [26,27,28,29,30,31,32]. Perhaps more important is that the determination of optimal therapy of patients with COVID-19 was rapidly established using novel research platform trials [53]. Registry-embedded trials that utilize electronic data collection platforms are increasingly being employed in critical care clinical trials [54•]. It is our contention that these designs will become the new standard for efficient conduct of trials not only with severe infections but broadly within critically ill populations.