Artificial intelligence (AI) is increasingly visible in our daily lives and ranges from voice recognition on smart speakers (e.g., Amazon’s Alexa), to discovering new music from streaming applications that predict new artists for the listener (e.g., Spotify), and computer detection of cancer in mammograms [1]. AI uses mathematical tools, “machine learning,” to iteratively learn patterns within training data, and when these patterns are found in new data, the AI translates this into a decision, for example, cancer versus not cancer (Table 1). In recent years, a subfield of AI, “deep learning,” has delivered a significant increase in accuracy by using new learning approaches, specialized hardware, and significantly larger datasets to find more complex and subtle patterns within the data.

Table 1 Definitions of artificial intelligence and subdomains

The potential of AI in clinical medicine is wide ranging and has been driven in recent years by the increased availability of large health datasets due to digitization of health records coupled with sharing of anonymized health data. AI for diagnosis using imaging data has potential in diverse fields such as the pathological diagnosis of cancer, diabetic retinopathy and glaucoma screening in primary care, and self-monitoring of skin lesions by patients. Other clinical applications include genomic/phenotypic profile tailored disease management and improved clinical event prediction to inform preventative programs from risk factor data or laboratory results [2•, 3••].

In infection prevention and control (IPC), AI applications offer huge potential for implementation of the World Health Organization (WHO) core components [4]. As healthcare IT systems produce vast quantities of data from disparate sources and become increasingly integrated, AI systems will be able to detect patterns in the data accelerating the detection of outbreaks and providing richer datasets for subsequent analysis. AI can support the case for system change by identifying the cost of inaction, modeling solutions by simulating the behavior of different types of agents within a complex system and supporting change by gathering data and producing analytics [5]. Social graph analysis can identify “influencers” of hand hygiene programs and explore patient safety culture [6, 7]. In IPC education and training programs, AI-based simulations can provide a bridge to authentic experience that does not compromise patient safety and provide the repeated cycles of objective evaluation and feedback that are key to the learning process [8].

In this paper, we explore the potential benefits of AI for IPC in three key areas highlighted by the WHO, namely surveillance of healthcare-associated infection (HAI), improved laboratory diagnosis to facilitate IPC interventions, and hand hygiene education and audit.

Surveillance of Healthcare-Associated Infection

The essence of an HAI surveillance program is to interpret databases generated from multiple data sources to prospectively monitor trends, identify clusters and outbreaks in a timely fashion, track the impact of quality improvement programs, and predict future trends. An HAI social network generated from electronic healthcare record (EHR) patient and caregiver contacts was used to simulate outbreaks of methicillin-resistant Staphylococcus aureus and influenza, and identify potentially mitigating interventions [9]. Machine learning applications have been used to predict the risk of nosocomial Clostridioides difficile infection (CDI) [10•,11,12]. Unlike traditional CDI risk stratification, machine learning is not limited to known risk factors but can consider a range of variables within the EHR to validate the application, and thereafter, models can be developed tailored to a particular healthcare facility or patient population. Machine learning applications can also more readily cope with the dynamic nature of healthcare than traditional surveillance models, whereby if a patient’s CDI risk changes during their inpatient stay, the IPC and clinical team could be alerted accordingly. While prospective studies are required to validate these publications and study other HAI, this tailored approach has the potential to transform HAI surveillance and IPC. Timely accurate identification of patients at high risk of CDI and those at high risk of progression to complicated CDI could facilitate customized IPC and antimicrobial stewardship strategies and anti-CDI therapies [10•]. This approach should also be beneficial for clinical trials of novel anti-CDI therapies whereby patients most at risk of CDI can be readily identified for recruitment.
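The idea of building a contact network from EHR location data can be illustrated with a minimal sketch. The records, identifiers, and function names below are hypothetical and are not taken from the cited study [9]; the sketch links people recorded on the same ward on the same day and then lists who lies within a given number of contact steps of an index case. It ignores the time ordering of contacts, which a real simulation would need to respect.

```python
from collections import defaultdict, deque

# Hypothetical location records: (person_id, ward, day). Not real data.
STAYS = [
    ("patient_A", "ward_1", 1),
    ("nurse_X", "ward_1", 1),
    ("patient_B", "ward_1", 1),
    ("patient_B", "ward_2", 2),
    ("patient_C", "ward_2", 2),
    ("nurse_Y", "ward_3", 2),
    ("patient_D", "ward_3", 2),
]


def build_contact_graph(stays):
    """Link any two people recorded on the same ward on the same day."""
    present = defaultdict(set)
    for person, ward, day in stays:
        present[(ward, day)].add(person)
    graph = defaultdict(set)
    for people in present.values():
        for person in people:
            graph[person] |= people - {person}
    return graph


def exposed_within(graph, index_case, max_hops):
    """Breadth-first search returning {person: hops} up to max_hops away."""
    hops = {index_case: 0}
    queue = deque([index_case])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the requested distance
        for neighbour in graph[current]:
            if neighbour not in hops:
                hops[neighbour] = hops[current] + 1
                queue.append(neighbour)
    del hops[index_case]  # report contacts only, not the index case
    return hops


graph = build_contact_graph(STAYS)
print(exposed_within(graph, "patient_A", 2))
```

In this toy dataset, patient_C is reached only through patient_B, who moved wards, while the people on ward_3 are never linked to the index case. Interventions such as cohorting could be explored in the same way by removing edges and repeating the search.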

In the clinical microbiology laboratory, AI data mining of routine microbiology laboratory results could be used to detect and predict clusters/outbreaks of multidrug-resistant organism colonization and/or infection events [13]. This type of analysis could also facilitate detection of potential sources of these events which is frequently a difficult and time-consuming aspect of epidemiological investigation. Next-generation sequencing (NGS) is being increasingly used for pathogen identification, antimicrobial resistance (AMR) detection, and strain typing [14, 15]. AI offers the opportunity for more complex analysis of NGS-generated data with the potential to integrate and analyze diverse HAI and AMR data across the healthcare system. This type of integration could help predict patients most at risk of HAI and AMR events and facilitate timely detection of outbreaks with molecular analysis of transmission events and interactions between patients, staff, and the clinical environment in real time. This will improve our understanding of how cross infection occurs, allow focused surveillance, earlier diagnosis, and enable targeted IPC interventions to be developed accordingly.
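As a point of reference for cluster detection from routine laboratory results, the following minimal sketch flags weeks in which the number of new isolates exceeds a rolling baseline. This is a simple statistical threshold rule, not the AI data mining approach of the cited work [13]; the weekly counts, the eight-week baseline, and the two standard deviation limit are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_unusual_weeks(weekly_counts, baseline_weeks=8, z_threshold=2.0):
    """Return indices of weeks whose count exceeds baseline mean + z * SD.

    The baseline for each week is the preceding `baseline_weeks` weeks.
    """
    flagged = []
    for week in range(baseline_weeks, len(weekly_counts)):
        baseline = weekly_counts[week - baseline_weeks:week]
        limit = mean(baseline) + z_threshold * stdev(baseline)
        if weekly_counts[week] > limit:
            flagged.append(week)
    return flagged


# Hypothetical weekly counts of new isolates of one resistant organism on one ward
counts = [1, 0, 2, 1, 1, 0, 1, 2, 1, 6, 5, 1]
print(flag_unusual_weeks(counts))
```

A known weakness of this kind of rule is that an outbreak week inflates the baseline for the weeks that follow, so a sustained cluster may be flagged only at its onset. Machine learning approaches aim to improve on such rules by considering many variables at once.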

One of the major challenges for AI in HAI surveillance is the requirement for a high-quality representative dataset to develop accurate models for each context in which they are used. A systematic review of machine learning in critical care noted that many studies use datasets that are too small to assess the full potential of AI applications and guidelines on methodology and validation of predictions are required to help translate findings into daily clinical practice [16]. In addition to size and completeness, preexisting databases may be inherently biased by clinical practice and healthcare delivery at that time which may compromise patient care if these biases were incorporated into a machine learning model. As with any publication or guideline, the generalizability of machine learning models for HAI surveillance developed from data in a single healthcare setting is likely to be limited. For example, in one CDI model, risk factors for CDI in one setting were protective in another institution, which may reflect local biases or differences in CDI pathways [12].
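The observation that a factor can appear harmful in one institution and protective in another can be made concrete with odds ratios from two 2x2 tables. The counts below are invented purely for illustration and do not come from the cited study [12]; the function name is also hypothetical.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from a 2x2 table: above 1 suggests risk, below 1 suggests protection.

    No zero-cell correction is applied, so every count must be non-zero.
    """
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical counts for the same exposure at two institutions
site_a = odds_ratio(30, 170, 20, 380)
site_b = odds_ratio(10, 190, 40, 360)
print(f"site A odds ratio: {site_a:.2f}")
print(f"site B odds ratio: {site_b:.2f}")
```

A model trained only on site A data would learn to treat the exposure as a risk factor and could misclassify patients at site B, which is why local validation is needed before a model is transferred between settings.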

More recently, the applications of AI in the novel coronavirus (COVID-19) outbreak have highlighted its potential for generation of near-real-time information for public health and IPC purposes. AI could facilitate decision-making when large amounts of data were emerging rapidly, by analyzing data from a variety of sources (e.g., government and national reports, social media, news outlets) to generate applications such as Healthmap. This can also speed up contact tracing by AI pattern recognition within the data, in addition to evaluation and optimization of IPC strategies to prevent further cross infection. Likewise, the AI platform Bluedot, which includes air travel data, uses natural language processing and machine learning to process vast unstructured text data, in multiple languages, to track outbreaks of over 100 different diseases. Bluedot first alerted on COVID-19 on December 31, 2019, almost a week ahead of national surveillance centers and the WHO. The major advantages of these applications in global outbreaks will also be true for IPC teams at local level, whereby AI data analytics enable IPC and public health experts to focus on strategies to minimize cross infection rather than data gathering and organization into reports.

Diagnosis of Infection with IPC Implication

Chest radiography is a fundamental component of tuberculosis (TB) screening and diagnosis programs in both community and hospital settings. Improvements in TB detection enable timely instigation of anti-TB therapy and appropriate IPC precautions. AI offers the opportunity to standardize and improve this process, especially in TB-prevalent regions with suboptimal access to radiologists. Deep learning with convolutional neural networks has been used to classify TB on chest radiography with discrepant results reviewed by a radiologist [17]. This type of process offers opportunities for the developing world and other regions that lack radiologist expertise, whereby AI could interpret the majority of investigations with radiologist review of equivocal cases only.

In the clinical microbiology laboratory, machine learning algorithms developed from population genomics could be used to predict infection risks from the genomic features of Staphylococcus epidermidis and potentially identify high-risk genotypes preoperatively to target pre and postoperative HAI preventative programs [18]. AI-enhanced laboratory microscopy could streamline the rapid diagnosis of patients with infection and assist AMR prevention programs by facilitating targeted antimicrobial management and IPC intervention. In one proof of concept study, a convolutional neural network (a type of AI used to analyze visual data) was trained to categorize bacteria in blood culture specimens at the gram stain stage with over 90% accuracy [19•]. Gram stain interpretation can be time-consuming, is strongly operator dependent, and requires a skilled laboratory scientist for interpretation. AI-assisted microscopy opens possibilities for areas without clinical microbiologist expertise with the potential to send images to a central facility for review and appropriate clinical liaison regarding patient management. Machine learning has also been employed in the clinical microbiology laboratory for molecular diagnosis of bacterial vaginosis and performed well against traditional gram stain testing [20]. The concept of AI-based microscopy could therefore be extended to other specimens that require gram stain interpretation (e.g., CSF of patients with presumed meningitis), other pathogens such as TB where microscopic diagnosis is an important element of the IPC pathway and to molecular diagnosis of AMR pathogens with IPC implications.

Hand Hygiene

Hand hygiene is a fundamental component of an IPC program [4] and AI applications for hand hygiene education and audit offer opportunities to improve compliance and streamline IPC processes. The SureWash system is a commercially available interactive kiosk that uses camera-based augmented reality and gamified learning to train and assess hand hygiene technique with resultant improvements in compliance [21]. The kiosk is mobile and when left in one area can be used independently by healthcare staff at a time that suits them, to obtain immediate and individualized performance feedback. More recently the system has been developed into a smartphone app with similar functionality. A pilot of an integrated hand hygiene digital framework which included the SureWash system along with a hand hygiene auditing tool and an activity monitoring system demonstrated the feasibility of using AI without impairing clinical workflow [22]. An integrated “risk status” metric based on live data was presented pictorially in a variety of formats to staff and included actions required to improve the score. Expansion of this dataset to include HAI surveillance data could potentially be incorporated into an AI application to predict future outbreaks and suggest IPC interventions. In an outpatient setting, while machine learning with feedback was associated with improvements in staff hand hygiene before first patient contact, concerns regarding accuracy, long-term sustainability, and user fatigue from repeated notifications were noted [23•].

Computer vision is a branch of AI that studies how to automatically understand the content of images and video in a human-like manner. In a simulated clinical environment using computer vision with depth images (where a person appears as an outline without distinguishing individual features) for hand hygiene auditing, the system was more successful in detecting alcohol hand rub dispensing and moment one than in detecting hand rubbing [24]. The importance of incorporating real-time feedback into AI applications to deliver behavior change has recently been highlighted [25••]. In this study on a surgical ward, automatic video auditing (sink vision monitors) with feedback resulted in improvement in the quality and quantity of handwashing. However, performance returned to baseline when feedback was removed. Care must therefore be taken when reminders are provided automatically, as staff can become dependent on them (cognitive offloading) [26], which explains why performance returns to baseline when hand hygiene reminders and measurement tools are removed [25••, 27]. Other issues for vision-based systems include privacy (limiting the visual data that can be used, though “edge AI,” where the images and identity information are processed locally by an AI on the device, may circumvent this), challenges around hospital infrastructure making trajectory prediction and data association difficult, and the unresolved issue of how best to take account of people traffic in real life rather than in the quieter simulation setting [28].

The potential health benefits of wearable technology are being increasingly described in healthcare, and in IPC, wearables offer the opportunity to develop machine learning applications to support healthcare staff IPC education, audits, and potentially behavior change. In one review of technological behavior monitoring systems, hand hygiene improvement varied widely from 6 to 54%, though study results were less clear-cut around sustainability of the improvement [29]. As wearable technologies require staff to continuously wear the technology, user attitude, device functionality, and usability are important factors to consider prior to further development of AI applications around them. For example, if a device is programmed to give the user an auditory or visual reminder, then the ability to override this in particular clinical situations, such as palliative care, is important. Likewise, device design, size, or weight may be perceived as a barrier to providing patient care, or indeed an IPC risk itself, given that most commercially available wearables are designed to be worn on the wrist.


AI presents many potential advantages for IPC including speed, consistency, and the capability of handling very large datasets; however, many challenges remain. Most studies to date assess performance retrospectively, so there is a need for prospective evaluation in the real-life, often chaotic, clinical setting. AI is highly dependent on data quality and completeness and on robust reference standards (which frequently do not exist in IPC), in addition to close collaboration with IPC experts to interpret outputs and ensure clinical relevance. Otherwise, errors that are introduced during the machine learning training process can result in false negatives, misclassification, or lack of applicability. IPC practitioners also need to understand the limitations of AI for a particular application and context. Depending on how data are collected and the learning algorithms are designed, machine learning models can fail to capture the underlying patterns and so classify both training and new data poorly (underfitting), or fit the training data so closely that they lose the ability to recognize similar patterns in new data (overfitting). They may also reflect the underlying bias in the training data.
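The difference between underfitting and overfitting can be seen in a toy example. The sketch below uses an invented one-feature dataset in which two training labels are deliberately wrong to mimic noisy records; the three models and all names are assumptions made for illustration and do not represent any system discussed in this paper.

```python
# Hypothetical dataset: the true rule is label 1 when x >= 5.5.
# Two training labels (x=3 and x=8) are deliberately wrong to mimic noisy data.
TRAIN = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 1),
         (7, 1), (8, 0), (9, 1), (10, 1), (11, 1)]
TEST = [(1.5, 0), (2.6, 0), (3.4, 0), (4.5, 0),
        (6.5, 1), (7.6, 1), (8.4, 1), (9.5, 1)]


def majority_model(train):
    """Underfitting: ignore the feature and always predict the commonest label."""
    labels = [y for _, y in train]
    commonest = max(set(labels), key=labels.count)
    return lambda x: commonest


def nearest_neighbour_model(train):
    """Overfitting: memorise every training point, noise included."""
    return lambda x: min(train, key=lambda point: abs(point[0] - x))[1]


def threshold_model(train):
    """A simple rule: pick the single cut-off with the fewest training errors."""
    xs = sorted(x for x, _ in train)
    cutoffs = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(cut):
        return sum((x >= cut) != bool(y) for x, y in train)

    best = min(cutoffs, key=errors)
    return lambda x: int(x >= best)


def accuracy(model, data):
    """Fraction of (x, label) pairs that the model classifies correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


for name, fit in [("majority", majority_model),
                  ("1-nearest-neighbour", nearest_neighbour_model),
                  ("threshold", threshold_model)]:
    model = fit(TRAIN)
    print(f"{name}: train={accuracy(model, TRAIN):.2f} test={accuracy(model, TEST):.2f}")
```

The memorising model is perfect on the training data yet does worse than the simple threshold rule on the new data, because it has learned the mislabelled points. The majority model does poorly on both sets because it ignores the feature altogether.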

At present, health data is held in a range of locations both in hospitals and community settings and on patient devices such as smartphones and wearables. In many cases, healthcare facilities and indeed individual departments have developed their own bespoke data infrastructure involving multiple vendors. Ideally, to achieve a comprehensive view, an AI application may require access to data from all or a variety of these “data silos” both within and across disparate healthcare organizations [30]. However, most publications on AI in healthcare do not work across this continuum, but rather focus on discrete, more manageable areas. Indeed, IPC and healthcare generally need to become less fragmented to access these technologies. This is essential to train AI applications appropriate for a particular context and to ensure that the algorithms perform consistently across patient cohorts, especially those who may not have been adequately represented in the training set.

Other concerns include data ownership, privacy, and data exploitation for commercial or political advantage [3••, 2•]. One proposed solution would be for patients themselves to control their own data and then provide consent for their data to be used to develop AI applications [3••]. Public discussion, guidelines, and potential regulation will be required to guide the safe development, use, and oversight of AI applications, ensuring that an individual’s privacy is considered alongside access by the healthcare system to guide public health and IPC interventions.


The potential for AI applications to improve IPC is huge; however, AI in itself will not improve IPC. Sustainable improvements in IPC require culture and behavior change supported by appropriate governance structures. The consideration that “correlation does not imply causation” is particularly relevant when the use of AIs in healthcare is considered. AIs are driven by “big data” to find the correlations that may indicate medically relevant conditions or to identify potential risk factors. However, AIs can sometimes overlook small clusters that may be clinically relevant and are currently unable to use deep knowledge of the underlying processes to reason about small datasets. Rather than focusing on the AI tools themselves, the focus should be on the IPC problem that needs to be addressed, with development of strategy, goals, and processes to support this, which may include AI. Organizations that have successfully led digital transformations have used this approach, understanding culture and drivers first before choosing appropriate technological tools. In addition, involving insiders who are familiar with the culture, appreciating that one size does not fit all settings, and adopting a flat hierarchy to support rapid iterative modifications are important considerations [31••]. IPC practitioners need to be aware of the limitations and biases within AI and the tendency of staff to offload tasks onto the AI and be overconfident in its abilities. Issues around privacy and data ownership require careful consideration; AI applications need to be tested and integrated into real-life clinical practice; and for most healthcare settings, significant investment in data infrastructure is required to truly realize its potential.