We read with great interest the article by van Genderen et al. [1], which provides a contemporary and comprehensive overview of the potential of federating data access and data sharing in intensive care. Importantly, the authors list the perpetuation of biases encoded in clinical care practice as a major potential shortcoming. Furthermore, they suggest that this could be mitigated, noting that “ensuring an adequate representation of hospitals from various regions worldwide could lead to more diverse and inclusive health datasets.”

We agree that the use of diverse and inclusive health datasets should be promoted as a necessary first step toward building fair machine learning algorithms. However, we do not believe that this will be sufficient to overcome the deeply embedded biases in medicine that stem from a knowledge system designed around a majoritized few. Even with high-quality data from intensive care units across the world, the social patterning of the data generation process can still produce artificial intelligence (AI) that is bound to preserve, and even scale, existing disparities in care and the resulting inequities in patient outcomes. There are numerous examples of data issues that stem from the social patterning of the data capture and data generation process (Fig. 1). These include, but are certainly not limited to, (1) the differential performance of medical devices used to measure physiologic signals across patient populations, of which the pulse oximeter is just the tip of the iceberg [2]; (2) variation in the frequency of testing across patient populations that is not explained by clinical factors [3]; and (3) disparities in the performance of routine care that is typically assumed to be administered uniformly across patient populations [4]. These data issues are unlikely to be discovered even by teams of dozens, or even hundreds, of investigators, as doing so requires a level of cognitive diversity that is not leveraged in federated learning. It is unlikely that individual hospitals, especially those outside of large academic centers, will have the large interdisciplinary teams necessary to understand embedded bias in the electronic health records.

Fig. 1 Illustration of the effect of embedded biases in different hospitals and health systems on machine learning algorithms

Federated learning promises model development without data sharing in order to preserve patient privacy. This comes at a steep cost: undiscovered data issues lead to spurious associations that are learned by a model and incorporated into the resulting algorithm. We believe that no one group is smart enough to discover all the data issues that must be addressed to build fair models. If a group were to claim such a skill, it would be AI in action: arrogance and ignorance. The promise, and hype, of AI will only translate into huge dividends if the intensive care community works with computer scientists, social scientists, patients, and their caregivers to understand the backstories of the data and to design an equity-focused curation and analytics pipeline.
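As a purely illustrative sketch, and not a description of the authors’ methods or of any real dataset, the short Python simulation below shows how a consistent, subgroup-specific measurement offset (loosely inspired by pulse oximeter error) can survive a naive one-round federated averaging of site-level models. All variable names, effect sizes, and the averaging scheme are assumptions made for illustration only.

```python
# Minimal sketch: a systematic, subgroup-specific device offset survives naive
# federated averaging because it is consistent across sites rather than random.
# Every quantity here is an illustrative assumption.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_site(n, offset):
    """Simulate one hospital: the outcome depends on the *true* SpO2, but the
    model only sees the *recorded* SpO2, inflated by `offset` for one subgroup."""
    true_spo2 = rng.normal(92, 4, n)
    subgroup = rng.integers(0, 2, n)                 # 1 = group affected by device bias
    recorded = true_spo2 + offset * subgroup         # device overestimates for that group
    y = (true_spo2 + rng.normal(0, 2, n) < 90).astype(int)
    return recorded.reshape(-1, 1), y

# Two sites using the same biased device: adding more such sites does not average
# the error away, because it is socially patterned, not noise.
sites = [make_site(5000, offset=3.0) for _ in range(2)]

# One naive round of federated averaging: fit locally, then average parameters.
local = [LogisticRegression().fit(X, y) for X, y in sites]
global_model = LogisticRegression().fit(*sites[0])   # initialise fitted attributes
global_model.coef_ = np.mean([m.coef_ for m in local], axis=0)
global_model.intercept_ = np.mean([m.intercept_ for m in local], axis=0)

# At the same true SpO2 of 88%, the averaged model reports a lower risk for the
# subgroup whose recorded value is inflated, and no single site can see why.
for g in (0, 1):
    recorded_value = np.array([[88.0 + 3.0 * g]])
    risk = global_model.predict_proba(recorded_value)[0, 1]
    print(f"subgroup {g}: predicted deterioration risk {risk:.2f}")
```

Because each site holds only its own records, none of them can observe the offset locally; detecting it requires the kind of cross-disciplinary scrutiny of the data generation process that parameter averaging alone does not provide.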