In traumatic brain injury (TBI), the inability of the Glasgow Coma Scale (GCS) to capture the inherent heterogeneity of the disease may partly explain why randomised trials of biologically plausible therapies have largely failed. Such considerations provide strong motivation for the development of precision medicine approaches in this domain to improve outcome [1,2,3]. Subgroups of patients with distinct pathophysiological or pathobiological mechanisms (so-called endotypes) can be sought as a step towards identifying individualised treatments. One way to do this is with data-driven approaches such as unsupervised clustering algorithms.

Two of the most important contributions to identifying endotypes in the intensive care unit (ICU) population are by Calfee and Seymour. Calfee identified subgroups of patients with acute respiratory distress syndrome (ARDS) showing distinct inflammatory profiles by performing latent class analysis on patient data from two previous studies with negative results. These subgroups were found to respond differently to positive end-expiratory pressure (PEEP) [4]. Seymour performed a similar analysis of patients with sepsis using consensus k-means clustering, identifying distinct subgroups, defined by the inflammatory response, that benefited from different fluid management strategies, suggesting a substrate for individualised care [5].

Current suggestions for TBI phenotypes

To date, more than twenty-five suggested endotypes and phenotypes (patients who share clinical traits irrespective of whether these relate to underlying mechanistic similarities) in TBI have been published [6], but none has been tested for treatment responses. Most of these focus on the milder end of the disease spectrum and post-concussion symptoms, with only a few including patients with severe TBI in the acute phase. The latter are discussed in more detail below and are summarised in Table 1.

Table 1 Summary of currently proposed phenotypes in traumatic brain injury

Folweiler et al. [7] have elegantly identified three candidate endotypes across all severities of TBI. The phenotypes, defined by haematological and coagulation factors (platelet count, haemoglobin, prothrombin time, international normalised ratio (INR), and haematocrit) together with glucose, were identified by applying partitioning around medoids to a dataset containing more than sixty baseline variables. When these patients were stratified by admission GCS, the clear separation pattern disappeared, suggesting that these factors identify important disease mechanisms not captured by GCS. Yuh et al. [8] performed hierarchical clustering of computed tomography (CT) features in patients with mild TBI, identifying three clusters of intracranial lesions: epidural haematoma (EDH); subdural haematoma (SDH), contusion and subarachnoid haemorrhage (SAH); and intracranial haemorrhage (ICH) and petechial haemorrhage. As 36% of patients admitted to the ICU in an international multicentre study presented with mild TBI, these results may be relevant to the ICU cohort [9].
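To make the idea of partitioning around medoids more concrete, the following is a minimal sketch of a swap-based k-medoids search in the spirit of that method. It is not the implementation used in the cited study: the toy values, the Manhattan distance, the random starting medoids and the function names are all assumptions made for illustration. The point it shows is that each cluster is represented by an actual patient (the medoid) rather than by an average.

```python
import random


def manhattan(a, b):
    """Manhattan distance between two equal-length feature tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids, dist):
    """Sum of every point's distance to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def k_medoids(points, k, dist=manhattan, seed=0):
    """Greedy swap search: accept any medoid/non-medoid swap that lowers the cost."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # indices of the starting medoids
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial, dist)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    # Assign each point to the position (0..k-1) of its nearest medoid.
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]]))
              for p in points]
    return medoids, labels


# Hypothetical, already standardised values for two features in eight patients.
points = [(0.1, 0.2), (0.0, -0.1), (-0.2, 0.1), (0.2, 0.0),
          (5.0, 5.1), (5.2, 4.9), (4.8, 5.0), (5.1, 5.2)]

medoids, labels = k_medoids(points, k=2)
print("medoid patients:", medoids)
print("cluster labels: ", labels)
```

Because the medoids are indices into the patient list, each cluster can be described by a representative patient, which is one reason medoid-based methods are attractive for mixed clinical data.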

Within the Collaborative European Neurotrauma Effectiveness Research in TBI (CENTER-TBI) ICU cohort, Åkerlund et al. [10] identified six candidate endotypes based on admission physiology and biochemical markers, using a probabilistic clustering model. These endotypes were distinguished by GCS score and degree of metabolic derangement, including glucose, core temperature, pH, lactate, base excess, arterial partial pressure of carbon dioxide, oxygen saturation, and creatinine. Notably, two different pictures of metabolic derangement emerged, where one was characterised by a general stress response, while the other was associated with extracranial injuries.

By focusing solely on intracranial pressure (ICP) trajectories over time, Jha et al. [11] identified six phenotypic temporal profiles by applying the longitudinal clustering method group-based trajectory modelling (GBTM). Not only were the trajectories with high ICP associated with unfavourable outcomes, but so were two trajectories with low ICP levels. Furthermore, the expression of the gene ABCC8 (coding for sulfonylurea receptor-1, a known oedema regulator) differed between the identified groups.

The identified candidate endotypes of TBI all go beyond a description by GCS at presentation, indicating important underlying pathobiological mechanisms. Data-driven unsupervised clustering methods have been used to identify and describe the proposed endotypes, and although no information on outcome was used in the models, all described phenotypes were informative about outcome. However, the analyses are not hypothesis-free: the results depend largely on the features included in the models. In addition, different methods have been used, which may further explain the differences in presentation.

The proposed subclassifications of TBI patients represent biologically plausible traits on which to individualise treatment. The intracranial lesion phenotypes and the haematological endotypes have been validated in external datasets, showing good generalisability. Inclusion of brain biomarkers such as glial fibrillary acidic protein (GFAP) and ubiquitin carboxy-terminal hydrolase L1 (UCH-L1) may further refine the suggested phenotypes and increase the probability of clinical relevance, as these features have been shown to improve outcome predictions [12].

Methodological considerations

The phenotypes described above have all been developed using clustering algorithms, which group patients with similar characteristics, without information on outcome, in what are often complex, high-dimensional data. Clustering algorithms are methodologically challenging, and identifying a principled number of robust clusters requires substantial effort to do well.
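As an illustration of what choosing a principled number of clusters can involve, the sketch below scores several candidate cluster counts with the mean silhouette width and keeps the best one. It is only one of many possible criteria and is not taken from any of the cited studies. The toy data (three tight groups), the plain k-means with a deterministic farthest-point start, and all function names are assumptions for this example.

```python
import math


def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from all chosen."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def k_means(points, k, n_iter=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def mean_silhouette(points, labels):
    """Average of (b - a) / max(a, b); singleton clusters score 0 by convention."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical standardised data: three tight groups of six "patients" each.
offsets = [(0, 0), (0.5, 0), (0, 0.5), (-0.5, 0), (0, -0.5), (0.3, 0.3)]
centres = [(0, 0), (10, 0), (0, 10)]
points = [(cx + dx, cy + dy) for cx, cy in centres for dx, dy in offsets]

scores = {k: mean_silhouette(points, k_means(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores)
print("cluster count with the highest mean silhouette:", best_k)
```

On real clinical data the silhouette curve is rarely this clear-cut, and other criteria may point to a different number of clusters, which is part of why this step deserves explicit reporting.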

First, one has to decide which clustering algorithm should be used. Relatedly, there are a variety of choices for what metric is to be used to define ‘similarity’ between two data points. Whilst to some extent the chosen method may be dictated by the data type, it is important to note that there is no guarantee that two methods will produce the same result. Parameter selection, choice of an ‘optimal’ number of clusters and ensuring that the clustering obtained is robust are steps which often receive relatively little attention; yet any inferences are critically dependent on these being done well, and sensitivity analysis is vital. Demonstrating robustness to modelling assumptions and external validation are crucial steps. After external validation, study cohorts from previous interventional studies should be restratified according to the proposed subgroups, to investigate treatment effects in the hypothesised groups.
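One simple way to quantify whether two methods, or two runs of the same method under different modelling assumptions, have produced the same grouping is an agreement index between the two sets of labels. The sketch below implements the adjusted Rand index, which is 1 for identical partitions (regardless of how the clusters are numbered) and close to 0 for agreement at chance level. The label vectors are invented for illustration and do not come from any of the studies discussed.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(u, v):
    """Chance-corrected agreement between two labelings of the same patients."""
    if len(u) != len(v):
        raise ValueError("both labelings must cover the same patients")
    n = len(u)
    # Pairs of patients placed together in both, in the first, and in the second labeling.
    together_both = sum(comb(c, 2) for c in Counter(zip(u, v)).values())
    together_u = sum(comb(c, 2) for c in Counter(u).values())
    together_v = sum(comb(c, 2) for c in Counter(v).values())
    expected = together_u * together_v / comb(n, 2)
    maximum = (together_u + together_v) / 2
    if maximum == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical cluster labels for nine patients from two different methods.
labels_method_a = [0, 0, 0, 1, 1, 1, 2, 2, 2]
labels_method_b = [1, 1, 1, 0, 0, 2, 2, 2, 2]

same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
between_methods = adjusted_rand_index(labels_method_a, labels_method_b)
print(same_partition, crossed_partition, between_methods)
```

The same index can be used in a sensitivity analysis, for example by re-clustering resampled subsets of the cohort and checking how closely each result agrees with the original grouping.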

Take-home messages

Clustering has been shown to be a promising technique for discovering endotypes that are biologically both plausible and interesting. However, clustering is methodologically challenging and both authors and peer reviewers should be cognizant of this. Whether data-driven endotypes are clinically useful will depend on whether patients in different clusters respond differently to different treatment choices, as is the case for ARDS and sepsis. This requires prospective assessment (or retrospective evaluation in a prospectively collected interventional dataset) and this has not yet taken place. The endotypes give interesting biological insights, but further studies are needed before we can know whether they can help to improve TBI outcomes.