Background

Sepsis is a significant cause of health loss worldwide. This is particularly true for neonates, as over three million cases of neonatal sepsis were reported worldwide [1]. It has been estimated that sepsis and meningitis account for 6.8% of newborn deaths globally, making them important causes of neonatal morbidity and mortality [2]. Sepsis and meningitis can be defined as early-onset (EOS) or late-onset (LOS), with different causative pathogens and risk factors associated with each type. EOS, defined as infection occurring in the first 72 h of life, is associated with group B streptococcus, Listeria, Enterococcus, and Escherichia coli, whereas LOS, defined as infection occurring after 72 h of life, is associated with Gram-positives (such as coagulase-negative staphylococci, Staphylococcus epidermidis and Staphylococcus aureus) and Gram-negatives (such as E. coli, Klebsiella pneumoniae and Pseudomonas) [3]. Low gestational birthweight, prematurity, low Apgar score, prolonged rupture of membranes (PROM) and chorioamnionitis are associated with a greater risk of both EOS and LOS; however, the use of central venous catheters, previous antimicrobial exposure and poor hand hygiene increase the risk of hospital-acquired LOS, which is associated with higher rates of antimicrobial resistance [3,4,5,6].

Antimicrobial resistance remains a global public health threat. It has been estimated that 70% of neonatal bloodstream infections are untreatable with ampicillin and gentamicin, which are recommended as first-line treatment by the World Health Organization (WHO) [7,8,9,10]. In the absence of accurate diagnosis and treatment of sepsis, those who survive risk delayed growth and development [11]. The clinical presentations of neonatal sepsis can be difficult to identify and consequently challenging to treat [12]. The gold standard for a definitive diagnosis of neonatal sepsis is the isolation of pathogenic bacteria from a blood culture, which should be performed before giving the first dose of antibiotic [3]. However, sepsis can present with non-specific clinical signs and rapidly progress to multisystem organ failure without appropriate treatment [3]. For this reason, it is recommended to initially prescribe empiric antibiotic therapy for suspected neonatal sepsis. The empiric therapy consists of ampicillin plus an aminoglycoside, such as gentamicin, or cefotaxime [3]. Different antibiotics may be prescribed if antibiogram data show bacterial resistance to the original prescription.

The current approach to diagnosing and treating neonatal sepsis is expensive and time consuming. Furthermore, many resource-limited settings may lack laboratory facilities or have understaffed microbiology departments [13]. The absence of culture susceptibility testing increases the likelihood of receiving an inappropriate or ineffective course of antibiotic treatment. Antimicrobial resistance will invariably be exacerbated by overprescribing and the incorrect use of empirical antibiotics [14,15,16]. High levels of resistance to ampicillin and gentamicin have been reported in Gram-negative pathogens such as K. pneumoniae and E. coli in low- and middle-income countries (LMICs) [7,8,9, 17]. The high resistance rate for these antibiotics may lead to the use of carbapenems and third-generation cephalosporins as first-line treatment for neonatal sepsis. Despite the WHO guidance, one study showed that only 20% of neonates received WHO-recommended first-line treatment for neonatal sepsis in 41 countries, and meropenem was predominantly prescribed as first-line treatment in LMICs [9].

In the past decade, electronic health records (EHRs) have been extensively utilised in health care, including for antibiotic stewardship research and for monitoring antibiotic use [18]. To handle the complexity of EHRs, machine learning (ML) techniques have gained popularity as a powerful tool to identify data patterns and perform accurate predictions. ML is an analytical tool derived from disciplines such as computer science, statistics, and mathematics for spotting patterns in data and exploiting these patterns to address a task [19]. ML aims to develop a model that best describes the available dataset, and the resulting model will improve automatically through the experience of encountering increasing amounts of data [19]. Studies have demonstrated the clinical value of applying ML to image processing to identify early signs of disease and to support cancer diagnosis [20, 21].

Numerous studies have used EHRs to create prediction models to improve diagnosis, develop new medicines, and improve care [22,23,24]. There is a plethora of information to be mined due to the increased volume of data. For data scientists and healthcare professionals wishing to harness vast and complex healthcare data, the development of machine learning technology is especially useful [25]. ML has been demonstrated to predict the likelihood of acquiring Clostridium difficile infections in hospitals and to identify patients at significant risk of developing septic shock in adult populations [26, 27]. ML technology has already been used to aid clinicians in determining the most appropriate empiric antibiotic to prescribe for different infectious diseases in adult populations [28,29,30].

Classification ML models such as the naïve Bayes classifier, k-nearest neighbours, support vector machines, decision tree methods (e.g. boosted decision trees and random forest), and neural networks (a deep learning technique) can handle categorical data well and may provide greater opportunities to address the diagnosis and treatment of neonatal sepsis [19, 31,32,33,34]. The aim of this scoping review was to summarise the application of ML to neonatal sepsis treatment, including the parameters that are required to perform ML and the common ML techniques applied, and to reflect on future directions.

Methods

Three databases (PubMed, Embase and Scopus) were searched to identify relevant literature published from inception to 26th November 2022. The Medical Subject Headings (MeSH) terms and keywords related to ML, neonatal sepsis, and antibiotics were searched. The search strategies were built around the following concepts: “sepsis”, “neonates”, “antibiotics”, and “machine learning”, with slight adjustments made as suitable to each database (Appendices 1–3). Articles applying ML techniques to guide diagnosis, antimicrobial resistance, or treatment of neonatal sepsis were included. We included studies which applied ML modelling to predict bloodstream infectious disease management (diagnosis and treatment) in neonates aged < 28 days. Studies published in languages other than English were excluded, along with review articles, letters, conference proceedings, and animal studies.

Three reviewers (DHT. Tsai, ICY. Wu, and C. O’Sullivan) independently conducted the initial screening of titles/abstracts for inclusion. Full-text article screening and data extraction were performed independently. Any disagreements were resolved by consensus.

Details of each included study were extracted, including number of patients, country where the study was conducted, ML techniques, and missing data handling (e.g. imputation method). We did not perform a risk of bias assessment in our review due to a lack of appropriate bias assessment tools for ML studies. Figure 1 shows the flowchart of the articles included in the review.

Fig. 1 PRISMA flow diagram of the numbers of studies identified in PubMed, Embase and Scopus

Results

From 88 records, 18 articles met the inclusion criteria and were included in the scoping review. A list of included studies is summarised in Table 1. Most studies were published within the past 5 years and were conducted in the United States of America (USA). The majority of studies developed models which predict the occurrence of neonatal sepsis in a timely manner while analysing the accuracy of certain biomarkers to predict neonatal sepsis [35,36,37,38,39]. One study focused on predicting mortality in hospitalised neonates with suspected sepsis [40]. Three studies focused on antibiotic treatment, of which one study focused on predicting treatment failure in antibiotic use [41]. Another developed a model to predict antimicrobial resistance in Enterobacterales isolates [42]. Oonsivilai et al. aimed to predict neonatal and paediatric bloodstream infections resistant to WHO-recommended antibiotics, a combination of ampicillin and gentamicin or ceftriaxone [43]. Generally, populations consisted of all infants admitted to the neonatal intensive care unit (NICU) with sepsis, although some studies focused on predicting EOS or LOS [37,38,39, 41]. Studies were mostly conducted using EHRs collected in a single centre, typically the NICU at a hospital located in the USA [37, 44,45,46]. One study collected data from 17 centres across multiple countries (The Netherlands, Canada, Czechia, and Switzerland) [39]. Thirteen studies used neonatal data from high income countries (HICs), and four studies used neonatal data from China and Cambodia. The scope of the review is limited as only studies published in English were included. It is possible that more studies report the use of machine learning in neonatal sepsis and were not included.

Table 1 Study characteristics of included studies of machine learning application on neonatal bloodstream infection

The size of the datasets used in the included studies ranged from 38 to 2,749 patients. Missing data are inevitable for various reasons in EHRs. However, not all included studies explicitly described how missing data were handled. A few studies performed complete case analysis, and variables with high levels of missing data were removed. Many neonates were excluded from analysis if their key variables (such as age, weight, temperature and white cell count) or blood culture results were not captured [35, 42, 46, 47]. Simple imputation was performed in two studies, in which missing data were replaced with either mean values or the last observation carried forward (LOCF) [44, 45].
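The two imputation strategies named above can be illustrated with a minimal sketch. It is not taken from any of the included studies: the function names `mean_impute` and `locf_impute` and the temperature readings are hypothetical, and missing values are assumed to be coded as `None`.

```python
from statistics import mean
from typing import List, Optional, Sequence


def mean_impute(values: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing entry (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def locf_impute(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Carry the last observed value forward; gaps before the first observation stay None."""
    result: List[Optional[float]] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        result.append(last)
    return result


# Hypothetical serial temperature readings (degrees Celsius) for one neonate
temperatures = [36.8, None, 37.4, None, None, 38.0]
print(mean_impute(temperatures))
print(locf_impute(temperatures))
```

Mean imputation ignores the order of measurements, whereas LOCF assumes the readings are time-ordered and that a value stays unchanged until the next observation, so the two can give quite different filled-in series for the same patient.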

Risk factors with a strong association with neonatal sepsis, such as low gestational birthweight, prematurity, low Apgar score, prolonged rupture of membranes (PROM) and chorioamnionitis, were included as parameters in most models in this review. However, the strongest predictors across most studies focusing on neonatal sepsis were gestational age, C-reactive protein levels, white cell count, platelet count, heart rate, and respiration [35, 36, 38, 39, 41, 45, 46]. Change in skin colour (pink was considered normal, but a neonate turning green/grey was considered a significant change) was the strongest predictor in the Neonatal Sepsis Diagnostic (NeoSeD) model in Greece for predicting neonates with suspected sepsis [35]. Machine learning has also proven useful in the early detection of neonatal sepsis with meningoencephalitis, a complication associated with severe sepsis. Serum creatine and pyruvic acid, two metabolites found in cerebrospinal fluid, are important parameters for differentiating neonates with septic shock from neonates with meningoencephalitis [36].

Blood glucose levels were found to be important for predicting antibiotic treatment failure [41]. Along with age in days and weight, the number of days from hospital admission to blood sampling was important for predicting neonatal and paediatric bloodstream infections resistant to WHO-recommended antibiotics [43]. Hospital-acquired infection was also an important variable for predicting ceftriaxone resistance and, remarkably, household size proved useful in predicting ampicillin and gentamicin resistance [43]. A strong association between overcrowded households and infectious diseases has been evidenced, with higher rates of infectious disease transmission, and in turn antimicrobial-resistant infections, being reported in households with more than eight persons [48, 49].

Blood pressure and temperature were important for predicting serious infections and urinalysis was important for predicting young febrile infants at risk of serious bacterial infections (including bloodstream) [47, 50, 51]. For predicting in-hospital neonatal mortality due to sepsis, the requirement of ventilator support at the onset of clinical suspected sepsis, feeding conditions and intravascular volume expansion were the strongest predictors [40].

Cross-validation was the most commonly employed method to validate an ML model [35, 38,39,40,41, 43,44,45,46]. All studies reported the area under the receiver operating characteristic curve (AUC), and/or sensitivity and specificity, as metrics to evaluate performance. The AUC for the best performing models for predicting sepsis in neonates ranged from 71.0–91.8%, and the best performing models were logistic regression, boosted decision trees and random forest [35, 45, 46]. Sensitivity ranged from 38.0% to 94.0% and specificity ranged from 20.0% to 88.0% [38, 44,45,46]. Notably, not all ML models which gave high AUC values also provided high values for sensitivity and specificity; therefore, it is important to include all these metrics to assess the full performance of the model [46]. Both logistic regression and decision trees performed well in predicting antibiotic treatment failure in neonates with sepsis, and random forest was the best performing model for predicting infections resistant to WHO-recommended antibiotics in neonates and children [41, 43]. Similarly, random forest had the highest specificity (74.9%) and highest sensitivity (98.6%) when predicting serious bacterial infections in febrile infants [47].
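To make the distinction between these metrics concrete, the following minimal sketch computes sensitivity, specificity and AUC from scratch. The labels, scores and the 0.5 cut-off are hypothetical values chosen for illustration and do not come from any included study; the AUC is computed with the rank-based definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case).

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for labels coded 1 = sepsis, 0 = no sepsis.

    Assumes both classes are present in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of positive/negative pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an arbitrary cut-off

sens, spec = sensitivity_specificity(labels, predictions)
print(f"AUC={auc(labels, scores):.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

AUC summarises ranking quality over all possible cut-offs, whereas sensitivity and specificity describe performance at one chosen cut-off, which is why a model can score well on one and poorly on the others.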

The highest performing model in predicting neonatal sepsis mortality was a neural network model (AUC: 92.3%) [40]. This model was highly accurate (95.6%) and specific (96.8%). However, this model differs from the models mentioned earlier, as a neural network learns from data in a manner loosely modelled on the human brain [19]. In Bosnia and Herzegovina, neural networks were used to diagnose neonatal sepsis and performed much better than the machine learning models mentioned earlier. The model was highly sensitive and specific (98.8% and 95.5%, respectively) using data regularly collected at hospitals (body temperature, C-reactive protein, white blood cell count and platelet count) [52]. Neural networks generally provide more accurate models, but this accuracy comes at the expense of transparency, as these models are typically more complex [19]. Figure 2 shows the parameters to include when applying machine learning to neonatal sepsis identification and treatment, and which models have been reported to perform best for each task.

Fig. 2 Schematic diagram of different applications of machine learning for neonatal sepsis including important parameters to include and which models are worth investigating for each task

Discussion

We performed this scoping review to summarise the published literature on ML application for neonatal sepsis diagnosis and treatment. From the included studies, the highest performing ML models in terms of AUC, sensitivity and specificity were the following: decision trees, random forest, and neural networks (a deep ML technique). Many included studies have failed to adequately handle missing data and did not explicitly describe if and how overfitting was prevented.

Typically, the performance of ML models improves when there are greater amounts of data. In Israel, a study by Yelin et al. developed ML models which could predict antimicrobial resistance for adults with urinary tract infections (UTIs) with great accuracy by linking 10-year longitudinal data of over 700,000 community-acquired UTIs with over 5,000,000 records of antibiotic treatment [29]. In comparison, the datasets of the studies included in this review were much smaller. This was expected due to the difficulties in acquiring neonatal data, and these smaller datasets may limit the performance of ML models [12]. To ensure an ML model provides sound and reliable results, it is important to validate the model. The recommended approach for developing ML models is to split the dataset into three sets: a training set, a validation set, and a testing set. Cross-validation was the most widely employed validation technique in ML studies of neonatal bloodstream infections [35, 38,39,40,41, 43,44,45,46].
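As a rough illustration of these two validation strategies, the sketch below partitions patient indices into a three-way split and into k folds. The function names, the 60/20/20 proportions, the choice of five folds and the fixed random seed are assumptions made for this example; the included studies did not necessarily use these settings. No model is fitted here, as the sketch only shows how the data would be divided.

```python
import random
from typing import List, Tuple


def three_way_split(
    n: int, train: float = 0.6, val: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 and split them into training, validation and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (training, held-out) index pairs; each index is held out exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for f, fold in enumerate(folds) if f != i for j in fold], folds[i])
        for i in range(k)
    ]


# 38 patients: the size of the smallest dataset reported in this review
train_idx, val_idx, test_idx = three_way_split(38)
print(len(train_idx), len(val_idx), len(test_idx))

for fold_number, (fit_idx, held_out_idx) in enumerate(k_fold_indices(38, k=5), start=1):
    print(f"fold {fold_number}: fit on {len(fit_idx)}, evaluate on {len(held_out_idx)}")
```

With small datasets, k-fold cross-validation lets every patient contribute to both fitting and evaluation, which may partly explain its popularity in the included studies.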

The best data for training an ML model for neonatal bloodstream infections include both clinical and laboratory data. All studies included in this review utilised demographic, clinical and laboratory data. The important clinical data to include were weight, age, days from admission to blood cultures being taken, and information related to respiration such as ventilation status. Depending on the purpose of an ML model, the relevant laboratory data to include may differ. When developing ML models to predict the occurrence of a bloodstream infection in a neonate, biomarkers such as C-reactive protein and white blood cell count were essential. Antibiogram data should be included when developing ML models to predict resistant bloodstream infections [43, 53].

Using EHRs from a hospital to build an ML model is an approach that has been widely applied, as observed in this review. However, the results from a single hospital cannot be generalised, meaning that the developed ML model is unlikely to give accurate predictions when it is applied to other hospitals, ultimately limiting its external validity [25]. Despite the significant threat antimicrobial resistance poses globally, only three studies identified in this review applied ML models to assist antibiotic treatment. ML approaches could be employed to predict resistance patterns in neonatal bloodstream infections and in turn could assist clinicians in prescribing appropriate empiric antibiotics.

In this scoping review, we limited our literature searches to three databases. We may therefore have missed some papers not indexed in these databases. Also, we only included papers published in the English language. Currently, there are two reporting guidelines which were developed to facilitate the risk of bias assessment for prediction models: the Prediction Model Risk of Bias Assessment Tool Artificial Intelligence (PROBAST-AI) and the Transparent Reporting of a Multivariable Prediction Model of Individual Prognosis or Diagnosis (TRIPOD) statement [54,55,56]. As both suggested guidelines to evaluate the quality of prediction models are under development, we did not perform a risk of bias assessment in this scoping review.

Future research

In the past few years, ML techniques have been used extensively to model EHRs and improve health care. Our review has shown that ML models hold promise in assisting neonatal bloodstream infection management, of which the random forest model was the best model to predict the treatment outcome. To accurately identify patterns and produce predictions, it is essential to have a large volume of data. However, data collection is generally labour-intensive, and it is even more challenging to collect data for neonates. This could be due to ethical complications or a lack of suitably trained staff, which results in smaller datasets being available [12]. Future studies should focus on developing and training models on bigger datasets and potentially validating these ML models via clinical trials, similar to other computer-based clinical tools [57].

A multimodal dataset is one approach to overcoming insufficient data. Linking data from multiple sources such as clinical and laboratory data, antibiogram data and genome sequencing data to create a multimodal dataset could improve disease detection, prognosis, and treatment for infectious diseases [12, 58]. However, genome sequencing is costly and may not be accessible in resource-limited regions. Another suggested approach would be to design an ML model which can predict laboratory results such as antibiogram results, reducing the need for labour-intensive laboratory data collection.
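The linkage step described above can be sketched in a few lines, assuming each source is already keyed by a shared patient identifier. The function name `link_records`, the identifiers and the field names are all hypothetical, and real linkage would also need to handle mismatched identifiers, conflicting fields and data governance.

```python
from typing import Any, Dict


def link_records(*sources: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge per-patient records from several sources keyed by a shared patient ID.

    If two sources supply the same field, the later source overwrites the earlier one.
    """
    linked: Dict[str, Dict[str, Any]] = {}
    for source in sources:
        for patient_id, fields in source.items():
            linked.setdefault(patient_id, {}).update(fields)
    return linked


# Hypothetical records from three sources
clinical = {"N001": {"weight_g": 1450, "age_days": 5}, "N002": {"weight_g": 3100, "age_days": 2}}
laboratory = {"N001": {"crp_mg_l": 48.0}, "N002": {"crp_mg_l": 3.5}}
antibiogram = {"N001": {"ampicillin": "resistant", "gentamicin": "susceptible"}}

multimodal = link_records(clinical, laboratory, antibiogram)
print(multimodal["N001"])
```

Patients missing from one source (here, N002 has no antibiogram) simply lack those fields after linkage, which ties back to the missing-data handling discussed in the Results.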

Conclusion

This scoping review demonstrates the application of ML for neonatal bloodstream infection treatment. Despite the difficulties and barriers to obtaining data in this population, ML techniques have shown potential in supporting accurate diagnosis and treatment. However, there remains a lack of studies focusing on applying ML techniques to guide antibiotic selection to treat neonates with infectious diseases. Utilising ML techniques to assist clinicians in prescribing appropriate antibiotics for infectious diseases is still in its infancy. This warrants establishing a national and/or international network to build standardised structural data acquisition. Such a network would bring ML application to a level where it can substantially reduce inappropriate antibiotic use in clinical practice.