Artificial intelligence (AI) is expected to broadly reshape medicine; however, the vast majority of AI models developed for the intensive care unit (ICU) remain in the testing and prototyping phase [1,2,3]. AI, defined as a machine’s ability to mimic human-like capabilities such as reasoning, learning, planning, and creativity [4], faces several adoption challenges in clinical settings, including concerns over data privacy, sharing, transparency, and explainability [5]. These challenges are crucial to overcome because recent advances have produced sophisticated AI models capable of diverse tasks, ranging from text interpretation to image generation. For optimal performance, such advanced models, also known as foundation models, require training on extensive and diverse datasets. A well-known example of a foundation model is ChatGPT, released by OpenAI in 2022, which can generate human-like natural language responses to patient questions that are more empathetic than those of clinicians and can assist ICU clinicians with tasks such as summarizing unstructured medical notes and preparing accurate discharge summaries [6, 7]. This, however, can only be achieved when data are shared among healthcare providers and institutions to reach sufficient volume, are standardized, and are de-identified. Anonymization, while critical, is not foolproof against AI-driven attacks that can potentially reconstruct sensitive information; thus, traditional anonymization techniques may fall short of fully protecting patient privacy [8]. Acquiring such data is also a challenge because healthcare data are siloed within hospitals and data sharing is subject to ethical, organizational, and legal complexities.
Currently, the absence of a robust framework for cross-border data sharing in ICUs (and hospitals in general) hinders standardized data sharing and access, thereby impeding the effective training and implementation of foundation models.

Federated data access and federated learning

To address these challenges, federated data access and federated learning (FL) offer innovative solutions. Federated data access enables the analysis and extraction of insights from health data without the need for its physical transfer. It enables the combined analysis of diverse datasets from multiple sources while each dataset remains securely within its original location. Consequently, it not only safeguards privacy but also breaks down the barriers created by data silos. Building upon this, FL extends federated data access by enabling AI models to be trained and refined directly within this secure data infrastructure. Data analysis and model training occur at the data’s source, circumventing common data privacy concerns in healthcare, which makes FL a crucial tool for enhancing patient care and medical research. Collaborative data access and model training are orchestrated via a central server (e.g., a service provider). Hospitals have the flexibility to use their own infrastructure or a virtual private cloud for data storage and model training (Fig. 1) [9]. Imagine a network of ICUs across various hospitals, each accumulating vital patient data and together providing a large, diverse medical dataset. For a model to be trained with such data, all data across the federated data network need to be harmonized so that data elements have a similar structure and meaning. Common data models such as the Observational Medical Outcomes Partnership (OMOP) model offer a standardized framework for mapping raw data from various sources into a common structure. This makes it possible to train models on large and diverse patient datasets from multiple institutions, eventually improving the model [10].
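As a rough illustration of what harmonization involves, the sketch below maps measurements from two hypothetical hospitals, each using its own local codes and units, into one shared record layout. The site names, local codes, and the HarmonizedMeasurement layout are assumptions made for illustration only; they are a simplification and not the actual OMOP tables or vocabularies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarmonizedMeasurement:
    """Hypothetical shared record layout (not the real OMOP schema)."""
    patient_id: str
    concept: str   # shared concept name (placeholder vocabulary)
    value: float
    unit: str


# Hypothetical per-site mappings:
# local code -> (shared concept, shared unit, factor converting local unit to shared unit)
LOCAL_TO_SHARED = {
    "hospital_a": {
        "HR": ("heart_rate", "beats/min", 1.0),
        "UO_ML": ("urine_output", "mL", 1.0),
    },
    "hospital_b": {
        "PULSE": ("heart_rate", "beats/min", 1.0),
        "UO_L": ("urine_output", "mL", 1000.0),  # this site records litres
    },
}


def harmonize(site: str, record: dict) -> HarmonizedMeasurement:
    """Map one local record into the shared layout, converting units if needed."""
    concept, unit, factor = LOCAL_TO_SHARED[site][record["code"]]
    return HarmonizedMeasurement(
        patient_id=record["patient"],
        concept=concept,
        value=record["value"] * factor,
        unit=unit,
    )


if __name__ == "__main__":
    print(harmonize("hospital_a", {"patient": "a-01", "code": "HR", "value": 112.0}))
    print(harmonize("hospital_b", {"patient": "b-17", "code": "UO_L", "value": 1.5}))
```

In this sketch each site keeps its own mapping table, so the raw local records never need to leave the hospital; only the mapping logic has to be agreed upon across the network.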
In an FL framework, each data controller not only defines its own governance processes and associated privacy policies, but also controls data access and has the ability to revoke it. This ensures that local ethical standards, organizational policies, and legal requirements can be met without the need to fully harmonize these across many countries and institutions [11].
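The training loop described above can be sketched in a few lines. The example below assumes one common aggregation scheme, a weighted average of locally trained parameters (often called federated averaging); the text does not specify an algorithm, so this choice, the toy linear model, and all function names and data are illustrative assumptions rather than a description of any particular platform.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held locally by one hospital
Weights = Tuple[float, float]        # (slope, intercept) of a toy linear model


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 20) -> Weights:
    """Run gradient descent on one site's data; only the weights leave the site."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Central server step: average site weights, weighted by local sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train(sites: List[Dataset], rounds: int = 50) -> Weights:
    """Alternate local training at each site with aggregation on the server."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


if __name__ == "__main__":
    # Hypothetical, noise-free data following y = 2x + 1 at two sites.
    site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    site_b = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
    print(train([site_a, site_b]))
```

Note that the server only ever sees model parameters and sample counts, never the (x, y) records themselves. A site that revokes access would simply be left out of the list passed to the training loop in later rounds.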

Fig. 1

Schematic representation of federated data access in healthcare settings. The figure depicts the operational framework of federated data access, in which computational models are applied directly at the data’s points of origin (designated as nodes), typically healthcare institutions or data repositories. The model is improved collaboratively through these nodes while data confidentiality is maintained: raw data are not transferred or disclosed, in line with privacy regulations. Datasets are built according to international common data standards and are therefore standardized. This decentralized approach enhances disease surveillance and knowledge dissemination in ICU settings.

Federated learning and bedside support

Imagine a night shift in the ICU, where an alert is raised stating, “Caution: patient X has an increased risk of deteriorating towards septic shock within the next 24 h at 78% likelihood.” Further interaction with the bedside monitoring model is possible, with queries like, “[healthcare professional] why was this alert issued?” and the model responding, “[model] the patient’s respiration rate and heart rate have increased over the last hour, and blood cultures have just returned positive results.” This example demonstrates the advanced functionality of a foundation model. For such predictive functionality to be realized, the model must be trained on large, diverse medical datasets. In addition, a foundation model must be fine-tuned with validated prompts and ‘grounded’ to local data before verification, testing, and, ultimately, deployment to the ICU. Grounding involves taking the pre-trained foundation model and tailoring it to address specific real-world challenges and tasks. As such, AI models will offer bedside decision support by harnessing clinical expertise and delivering comprehensive textual explanations and data summaries.

Advantages of federated learning

One of the benefits of FL is its capability to enable the swift and real-time analysis of diverse, sensitive clinical data. This feature supports the local implementation of foundation models, allowing them to be continually updated with the most recent data from a variety of sources. Such dynamic updating enables the models to rapidly adapt to evolving clinical situations, providing more accurate and timely support for critical decision-making. This is particularly crucial in ICU settings. In broader healthcare scenarios like pandemics, different institutions possess unique knowledge, resources, or datasets that are vital for an effective response. FL allows these institutions to contribute their specialized expertise and data while retaining control over it. The decentralized nature of FL is instrumental in developing responsive AI models and strategies, thus improving our collective capacity to manage emergency health crises.

Challenges of federated learning

Despite its promising approach to data privacy, FL does not completely resolve data governance issues. Instead, robust data governance becomes a fundamental requirement for enabling effective FL. To begin with, data sharing and the adoption of common data models require all participating data providers to have well-defined data management policies. These policies should ensure data lineage, uphold high data quality, and establish clear responsibility and accountability for the data. Furthermore, robust agreements on the type of standardization across hospitals are essential. Institutions must reach consensus on a unified data model and on semantic standards for coding key concepts like diagnoses, medications, and clinical lab results. Additionally, technical limitations related to the availability of hardware and cloud resources for data storage, as well as computational power for model training, can be barriers to adopting FL, particularly for hospitals with limited IT infrastructure. Moreover, and perhaps more importantly, because their output reflects their training data, foundation models can perpetuate biases related to gender, race, and socio-economic status. Ensuring an adequate representation of hospitals from various regions worldwide could lead to more diverse and inclusive health datasets.

Data-driven ICU medicine

Currently, most ICU models are trained on small, narrowly scoped clinical datasets and are evaluated on tasks that do not provide meaningful insights into their usefulness to health systems [12]. A recent successful example of FL in the ICU is a collaborative effort across 20 institutes worldwide to predict clinical outcomes in patients with coronavirus disease 2019 [13]. Despite this, clinical examples are limited, and foundation models in the ICU will remain a vision until (cross-border) data access between institutions is achieved to enable proper model development. To facilitate such a network, the European Commission issued a tender to initiate a pan-European ICU data infrastructure [14]. We are currently bringing together different stakeholders to create a federated infrastructure for ICU data across Europe, with the European Society of Intensive Care Medicine as a key stakeholder, and welcome expressions of interest from professionals with diverse backgrounds and an interest in the ICU.

Take-home message

FL is a machine learning setting in which a model is collaboratively trained under the orchestration of a central server while the training data remain decentralized. It is key to the development of foundation models for healthcare. In the ICU field, this innovative approach ushers in a future marked by safer, more effective, and globally interconnected healthcare. This paradigm shift ensures that data are standardized, privacy is preserved, regulatory compliance is maintained, and healthcare institutions retain control over their invaluable patient data, while disease detection and knowledge sharing are enhanced.