Introduction

Medication regimens of critically ill patients in the intensive care unit (ICU) are complex and heterogeneous1,2. This heterogeneity of medication regimens has parallels to the common and lethal disease states of critical illness including sepsis and acute respiratory distress syndrome (ARDS)3,4. Managing the heterogeneity of critical illness is a nearly universally cited challenge for ICU clinicians and researchers5,6. Phenotyping has been proposed to identify patterns of diagnosis and treatment response among these complex heterogenous syndromes7,8,9. In particular, phenotyping via artificial intelligence (AI) and machine learning (ML) has demonstrated potential to be a powerful methodology to handle Big Data generated by critically ill patients for identification of novel patient subgroups and prediction of patient outcomes including sepsis, acute kidney injury, mechanical ventilation, ARDS, and more10,11,12,13,14,15,16,17. However, to date, this methodology has only been applied in a limited fashion to the highly complex and heterogenous nature of ICU medication regimens1.

Critically ill patients are often prescribed greater than 20 medications, with many deemed high-risk for patient harm by the Institute of Safe Medication Practices18,19,20,21. Further, it has been estimated that each day, a critically ill patient will suffer at least one medication related error. These medication related errors can lead to serious adverse drug events associated with a doubled risk of mortality20,21. Medication therapy optimization has significant potential to improve patient outcomes and reduce healthcare costs1,2. Thus, the development of novel prediction models with granular medication information to predict adverse events and direct resources is warranted1. However, identifying patterns associating medication therapy with patient outcomes within the vast amounts of data generated by ICU patients has remained a challenge, and to date, no AI/ML models have incorporated comprehensive ICU medication regimens into their analyses22.

We hypothesized that a similar approach as has been explored with other disease states of critical illness could be applied to ICU medications. Here, we sought to identify novel pharmacophenotypes using unsupervised machine learning to cluster medications used in the ICU and explore their relationship to patient-centered outcomes.

Methods

Study sample

Patients were drawn from the University of North Carolina Health System, an integrated healthcare delivery system where clinical care is managed via a comprehensive electronic health record (EHR). Patients were included if they were ≥ 18 years old with an ICU admission greater than 24 h between October 2015 and October 2020. ICUs included medical, surgical, neurosciences, and burn specialties. The hospitals varied including community hospitals and academic medical centers. Only the index ICU admission per each patient was considered in this analysis. The institutional review board at The University of Georgia approved this study and included waiver of consent (PROJECT00002652), and all methods were performed in accordance with the relevant guidelines and regulations.

The EHR was queried for patient demographics, medication information, and patient outcomes. Patient demographics included age, sex, admission diagnosis, ICU type, and Acute Physiology and Chronic Health Evaluation II. Medication information including drug, dose, route, duration, and timing of administration were recorded. Patient outcomes included mortality, hospital length of stay, development of delirium (defined by a CAM-ICU positive score), duration of mechanical ventilation, duration of vasopressor use, and acute kidney injury (defined by the presence of renal replacement therapy or a serum creatinine greater than 1.5 × baseline).

Feature extraction

Patient demographics

There were 30,550 given medication entries in the dataset from a total of 991 patients. Of these 30,550 administered medications, there were 440 unique medications when the filter of generic drug names was used and when dose and route information were excluded (e.g., cefepime 1gm and 2gm were counted under the feature of cefepime). Medication records from the raw dataset included a variety of medication administration record (MAR) actions including “given”, “missed”, “hold,” etc. To ensure this analysis only included records of medication that were administered to the patient (not just ordered), only the entries where the medication action label corresponded to "Given", "New Bag", "Restarted," or "Rate Change" were used for the analysis. Some entries contained "free-text" for ICU personnel communication purposes and were discarded. Additionally, duplicate and incomplete entries were filtered out. After cleaning the dataset, the data were transformed into a binary (boolean) vectored form where the 440 unique medications were assigned as the rows, and 991 patients were assigned as the columns. For each patient, a binary value of 1 was assigned to indicate whether the patient received a particular drug. For patient outcomes, the labels for categorical features were relabeled as numeric values. In the cases of unknown or missing entities, these were replaced with “negative” or “no.” The entire mapping of original labels to new labels is provided in Appendix Table 1.

Unsupervised learning approach

Medication clustering

After performing principal component analysis (PCA) on the large, binary medication dataset, the Restricted Boltzmann Machine (RBM) was used to further enrich the latent feature space, which was use as input to the hierarchical clustering algorithm to support the novel discovery of unique pharmacotherapy profiles23.

Principal component analysis

During PCA, each of the 440 unique medications was treated as an independent variable. PCA is a widely used dimensionality reduction technique to reduce the dimensionality of a dataset with p random variables to q, which is the desired number of variables24. The optimal number of principal components was selected after plotting the explained variance against the number of principal components (see Appendix Fig. 1). The number of principal components was selected as 150 to maintain sufficient variance (approximately 75%) in the data while significantly reducing the dimensionality.

Restricted Boltzmann Machine

RBM was used to learn unsupervised feature abstractions or ‘latent factors’ of the PCA reduced data25. RBM is a simple, two-layered neural network with one visible layer and one hidden layer. It is typically used for collaborative filtering as RBM is capable of learning internal representations of the input variables using unsupervised methods enabling complex relationships to be discovered in the process. For medication clustering purposes, we trained the RBM25,26 to learn the high dimensional and non-linear nature among medication assignments based on the co-occurrence of medications for each patient. The default hyperparameters for implementation were used based on the works of Chen26. From each patient’s binary assignment of medications, the RBM learned the weight coefficients to ultimately determine which nodes out of all nodes were activated or inactivated for each hidden unit. For clustering purposes, each medication is an independent node from the visible layer (440 units), and connections that are activated to a single hidden layer indicate cluster assignment (see Fig. 1). For example, if acetaminophen (from the visible layer) and Cluster 1 (from the hidden layer) connection was activated, acetaminophen would be assigned to Cluster 1. After assigning medications to each cluster from the created hidden layers, medications that were unassigned (never activated in the five hidden layers) were grouped as Cluster 6. Table 1 lists the medications assigned to Clusters 1–5, and Table 2 lists the unassigned medications in Cluster 6.

Figure 1
figure 1

Pharmacophenotype derivation workflow. (a) When medications are ordered by the clinician for ICU patients, all administered medications are recorded and stored in the electronic health record (EHR) system. (b) The medication data from the EHR was preprocessed to create a binary indicator matrix that contains all unique medications taken by a total of 991 patients. (c) Five medication clusters were created using unsupervised learning model (Restricted Boltzmann Machine). The layers that are not turned “on” (indicated in orange) to any hidden layers are grouped as an extra sixth cluster. (d) For each patient, the frequency of each medication cluster was counted and normalized by the total medications taken by each patient during their stay. (e) The normalized medication cluster distribution of each patient is used as a feature to agglomerative hierarchical clustering to develop novel pharmacophenotypes of critically ill patients. (f) These novel pharmacophenotypes can be used to predict clinical outcomes of new patients based on their medication regimens.

Table 1 Medication clusters assigned by restricted boltzman machine.
Table 2 Cluster 6—medications unassigned through restricted boltzman machine.

II. Patient clustering

For each patient, the frequency of each medication cluster was counted (see Fig. 1). To obtain a normalized medication cluster distribution for each patient, the frequency table was normalized by the total number of medications taken by each patient. This normalized medication cluster distribution was used as a derived feature for patient clustering.

Hierarchical agglomerate clustering

The normalized medication cluster distribution was used to cluster patients using Hierarchical Agglomerative Clustering, which builds a tree to represent data with successor nodes27. The optimal number of clusters (n = 5) was identified through the use of the unsupervised pipeline, including visual inspection of the dendrogram (see Fig. 1) and silhouette scores analysis, which was used to identify cluster numbers that provided an equal width among clusters where all clusters are found to have an above average silhouette score (see Appendix Fig. 2). Table 3 describes relevant demographic and outcomes information for each cluster. For implementation, scikit-learn 1.0.2 python library was used to obtain a total of five cluster labels.

Table 3 Demographic characteristics by patient cluster.
Validation of clusters

Upon selection of the optimal number of clusters, the validity of these clusters as clinically meaningful subgroups was assessed via surrogate validation conducted by comparing patient outcomes with medication data to see if clinically relevant characteristics were distinguishable.

Wilcoxon rank sum and signed rank tests were performed for continuous characteristics. Fisher's Exact tests were performed for categorical characteristics. Holm’s adjustment of p-values was applied to the comparisons within each outcome to control the familywise error rates. Permutation multivariate analysis of variance (MANOVA) was also used to confirm if the clusters were significantly different considering all clinical outcomes simultaneously28. Significance was assessed at p-value < 0.05.

Results

From the original 1000 patients, a total of 991 patients were included in the analysis with nine excluded due to being repeat ICU admissions. Demographic features are summarized in Table 4 with additional information about the health system provided in Appendix Table 2. The average was 61.2 years old (SD 17.5) with 43% female sex. The patients were managed in the medical ICU 40.7% of the time followed by 9.8% in the surgical and 9.4% in the neurosciences ICU. The mean APACHE II score at 24 h was 14.2 (SD 6.3). The frequency of use for each medication in the analysis is provided in Appendix Table 3, with the top ten medications used including sodium chloride, acetaminophen, potassium chloride, heparin, fentanyl, magnesium sulfate, insulin, furosemide, pantoprazole, and vancomycin.

Table 4 Summary of patient population.

Comparison of patient and medication clusters

Five patient clusters were identified through the use of the unsupervised pipeline. Additionally, the silhouette scores analysis plot further suggested a cluster number of 5 provides an equal width between clusters with all clusters having an above average silhouette score (see Appendix Fig. 2). Table 3 describes relevant demographic and outcomes information for each cluster, and Fig. 1 provides a visualization of the distribution of patient clusters by medication clusters and patient outcomes, with lower mean values indicating less severe outcomes. Patient Cluster 1 had a well-rounded distribution overall when compared to other patient clusters and did not have any distinctive distribution for a particular medication cluster. In contrast, Patient Cluster 4 had a high distribution in Medication Cluster 6. Figure 2 summarizes the mean medication cluster distribution for each patient cluster, with the mean medication cluster distribution for each patient cluster provided in Appendix Table 4.

Figure 2
figure 2

Radial plot distributions in each patient cluster. (a) Radial plot of the mean medication cluster distribution in each patient cluster. Patient Cluster 1 has a well-rounded distribution overall when compared to other patient clusters without any outstanding distribution of a particular medication cluster comparably. In contrast, Patient Cluster 4 notably has a high distribution in Medication Cluster 6. (b) Radial plot of the mean clinical outcomes in each patient cluster. The lower the mean value, the less severe the outcome was for each clinical outcome category. Thus, Patient Cluster 3 and 5 can be interpreted to have the least serious outcomes while Patient Cluster 2 and 4 generally had worse outcomes.

Comparison of patient clusters by clinical outcomes

Patient Cluster 3 and 5 had the least serious outcomes while Patient Cluster 2 and 4 generally had worse patient outcomes. Box plots of outcomes by patient clusters are presented in Fig. 3. For medication clustering purposes, we trained the RBM25,26 to learn the high dimensional and non-linear nature among medication assignments based on the co-occurrence of medications for each patient. The default hyperparameters for implementation were used based on the works of Chen26. A notable finding was that Patient Clusters 2 and 4 had no statistically significant difference in ICU length of stay, duration of mechanical ventilation, or duration of vasopressor use, but their mortality differed significantly (9.0% vs. 21.9%, p < 0.0018). Patient Cluster 4 had a mortality rate of 21.9% compared with the rest of the clusters ranging between 2.5 and 9% (see Fig. 4). Patient Cluster 4 also had the highest number of outliers (see Appendix Fig. 3). The difference of ICU duration between Patient Clusters 1 and 5 and Patient Clusters 2 and 4 were statistically insignificant. Significance of the differences between patient clusters are summarized in Table 5. Permutation MANOVA further confirmed these differences (p < 0.001) (see Appendix Table 5).

Figure 3
figure 3

Boxplots of MRC-ICU, APACHE II, and patient outcomes by patient cluster. (a) MRC-ICU score evaluated at 24 h. (b) APACHE score evaluated at 24 h. (c) Total days of vasopressor support patient received during admission. (d) Total days patient was on mechanical ventilation. (e) total days in the ICU. For panel d and e, outliers have been removed to improve visibility of the distribution (full box plots are available in the Appendix).

Figure 4
figure 4

Stacked bar plots showing proportion of patient outcome (categorical) by patient cluster. Any patients with unknown or unreported outcome were removed for analysis.

Table 5 Pair-wise comparison of differences in patient outcomes by patient cluster.

Discussion

In the first unsupervised machine learning analysis of critically ill patients and their medication regimens, five unique patient clusters were identified with significant differences in severity of illness and outcomes. Six pharmacophenotypes were identified, and each patient cluster displayed a unique distribution of these six pharmacophenotypes. This study is the first to apply AI to the complete medication list of ICU patients and demonstrates the ability to appropriately categorize patients with their outcomes, which lays the groundwork for future investigations.

Unsupervised machine learning methods have been previously explored for the derivation of distinct clinical phenotypes and biological endotypes29,30,31. Prior approaches have frequently used methods such as Latent Class Analysis (LCA) to identify clusters that are separatable by the input data. Latent Class Analysis is a set of Finite Mixture Models, which utilize a probablistic model-based clustering approach, in which each cluster are characterized on a probabilistic distribution rather than their centroid-based distance (such as with k-Means). Thus, each cluster has a probability of association, rather than a clear membership assignment. Due to the probabilistic nature of the class assignment, it may be difficult to derive instance-level associations, thus a single instance may belong marginally to multiple classes32,33. Alternatively, k-means allows for a characterization of clusters driven by centroid-based distances, allowing for a quantitative estimate of the membership34. Due to the heterogeneity of the input data, our goal in this work was to distinguish between a finite set of classes and better understand their distance-based profile when medications are utilized in the derivation rather than a probabilistic model of their likelihood.

Critically ill patients are medically complex with requisitely complex medication regimens. The significant challenges to characterizing complex, heterogeneous ICU medications in a meaningful way to drive clinical decision making parallel the challenges of managing and researching complex ICU syndromes like ARDS and sepsis. Indeed, it was reported that 62 of 76 randomized-controlled trials evaluating mortality showed no significant difference and just three of those positive studies have been accepted into practice35. Similar findings have been paralleled in ARDS36. Thoughtful editorials on this statistically unlikely preponderance of negative results have been published, and although common reasons for negative ICU studies likely account for some of these negative trials (e.g., underpowered studies, need for the use of a more conservative p-value cut-off), these statistical explanations ignore the potentially biological ones, wherein the target of an intervention is absent due to limitations in specificity of diagnosis, animal models of disease, or understanding of underlying pathophysiology37,38,39. Additionally, we would like to propose another relevant driver of patient outcomes that is generally unaccounted for in both RCTs and predictive modeling studies: the complete ICU medication regimen. Traditionally, ICU medications are often thought to be direct results of critical illness (e.g., a septic patient with a high lactate is prescribed broad-spectrum antibiotics and vasopressors). However, this simplified pathway does not incorporate that ICU medications are also independent risk factors for ICU complications that worsen patient outcomes (e.g., this septic patient develops acute kidney injury, which may be due to the shock state or the use of nephrotoxic medications or the combination of disease plus medication). Thus, when making medication-related decisions, medications must be thought of as both treatments and causes of outcomes (see Fig. 5). Aside from overt medication errors, ICU medications are also associated with significant ICU complications that increase risk of mortality and length of stay including ICU delirium, fluid overload, acute kidney injury, etc40,41,42,43. Ultimately, the benefits to medications used to manage critical illness must be balanced by mitigating the harms of those same treatments. Because medications in the ICU are always used in combination with other medications and interventions, identifying which medication and which medication combinations confer less risk for ICU complications has the potential to guide safer medication use. However, the dynamic relationships among patients, medications, and outcomes have been difficult to delineate given the inherent complexities and largess of ICU patient care and the data generated in that process. Given that medications are independent risk factors, a future direction for this type of clustering analysis is to generate a dataset capable of matching patients by demographic and clinical presentation variables and then compare outcomes of those with similar vs. different pharmacophenotypes. Moreover, this type of analysis will be aided by a multi-center design that improves external generalizability given that medication regimens may reflect local practices or other healthcare team related origins, instead of purely patient pathophysiology.

Figure 5
figure 5

Patient–Treatment–Outcome Pathway. The unique interactions of medication interventions with patient disease must be accounted for when predicting or studying patient-centered outcomes.

Phenotyping, especially when conducted through artificial intelligence methods, has significant potential to overcome challenges related to heterogeneity and non-linear relationships present in critically ill populations. When Calfee et al. used biomarker-based phenotyping in a re-analysis of a large randomized-controlled trial evaluating simvastatin (a trial that notably had previously shown negative results), significantly different treatment response wherein one phenotype showed mortality benefit from simvastatin was observed44. Moreover, these ARDS phenotypes also showed differential treatment response to fluid management strategy45. Similarly, AI methods have demonstrated the presence of unique clusters in shock, sepsis, and fluid overload46,47. Notably, Seymour et al. demonstrated that the results of three major randomized controlled trials were sensitive to the sepsis phenotypes they derived via unsupervised machine learning methods. Another series of shock sub-phenotypes was characterized by features associated with common ICU interventions (e.g., “well resuscitated” or “still hypovolemic”) that upon appropriate validation could yield highly relevant insights for bedside decision-making47. Our cluster pipeline driven by unsupervised feature learning using RBM and hierarchical clustering categorized medications into five unique clusters, with the remaining medications creating a sixth category. Of the patient clusters, Clusters 2 and 4 had the highest acuity, as measured by APACHE II. This high acuity was accompanied by significantly worse outcomes, including length of stay, ICU length of stay, presence of delirium and fluid overload, and need for mechanical ventilation. Interestingly, despite being similar, Patient Cluster 4 had a mortality rate over twice as high as Cluster 2. When evaluating the distribution of pharmacophenotypes, Cluster 4 had the highest density of Medication Cluster 6 and limited representation among the other five clusters. This particular pharmacophenotype contains many of the medications classically associated with ICU care including vasopressors and broad-spectrum antibiotics. Conversely, Cluster 3 had the lowest severity of illness and best outcomes and also had the lowest density of all the pharmacophenotypes. This suggests a possibility of non-linear relationships between medication regimen complexity and outcomes seen in other analyses48. Medication regimen complexity, as measured by the MRC-ICU, has been previously incorporated into ML prediction models along with other relevant patient characteristics and resulted in improved mortality prediction in a small cohort of patients49. In this study, medication regimen complexity was highest in Patient Clusters 2 and 4, which is in line with previous investigations of MRC-ICU that used traditional inferential statistics to demonstrate a relationship between increasing medication regimen complexity and increased mortality, length of stay, and fluid overload as well as increased need for critical care pharmacist interventions to optimize the medication regimens50,51,52,53,54,55. Taken together, the methodologies in this study appear to be able to appropriately group degree of critical illness (i.e., severity) with degree of intervention intensity (e.g., mechanical ventilation, medications) with patient outcomes (e.g., mortality). This congruence sets the foundation for future investigations to predict ICU complications based on unique medication combinations that deleteriously affect patient outcomes.

Overall, this evaluation was a proof-of-concept investigation to explore how unsupervised clustering methods may be applied to ICU medications, and while it has novel implications, future evaluations to address certain limitations are warranted that include comparative approaches, larger datasets, and more granular medication information. Comparative evaluations may include matrix factorization or other robust forms of RBM (e.g., Gaussian-Bernoulli RBM)23. Only generic drug name was used to describe the medications with dose, route, and other formulation information excluded. Establishing uniform means of describing and comparing ICU medication dosing strategies (e.g., a common data model) and validating these approaches in external datasets remains an area of future work. We assumed homogeneity across medication regimens; however, in practice this may be a highly complex and noisy interaction: therefore, in future work, we seek to utilize Trust Discover platforms to generalize pharmacotherapy profiles that are normalized independent of clinician and institutional bias56. Causal inference cannot be assessed by the current study, so it is unknown whether the high mortality observed in Patient Cluster 4 was partly caused by the unique distribution of pharmacophenotypes versus other factors (although notably, Cluster 4 shared similarities among groups). Even with these limitations, this analysis marks the first time the complete medication profile has been incorporated into outcomes analysis for ICU patients. Future analyses with more granular pharmacophenotype groupings or more programmed directives incorporating data from a myriad of ICUs and centers may improve the face validity form the viewpoint of the clinician for these pharmacophenotypes.

Conclusion

The medication regimens of critically ill patients have unique pharmacophenotypes. Given the significant role of medication therapy in patient outcomes, delineating the complex relationships among patients, medications, and outcomes using artificial intelligence warrants future investigation.