Introduction

A randomised, double-blind, placebo-controlled, multicentre phase 3 trial in mechanically ventilated patients with coronavirus disease 2019 (COVID-19) (PANAMO trial, NCT04333420) showed that vilobelimab, a monoclonal antibody which specifically binds complement 5a (C5a), reduced all-cause mortality at day 28 from 40 to 31% [1]. The pathophysiology of COVID-19 is an interaction between immune disturbances, endothelial dysfunction, and thromboembolic complications. This complex interaction varies between patients [2], and several clinicians and researchers advocate for a more personalized approach in the treatment of COVID-19 patients [3]. The large differences in host response are reflected through heterogeneity of treatment effect (HTE) of immunomodulatory agents [2].

Using four clinical subtypes previously described in patients presenting to the emergency department (ED) with sepsis [4], a survival benefit of dexamethasone in COVID-19 patients was seen only in the subtype with the highest inflammation, also called the δ subtype [5]. A similar mortality benefit was observed in critically ill COVID-19 patients treated with corticosteroids and belonging to a different hyperinflammatory subtype [6]. Other immunomodulatory agents, such as imatinib, tocilizumab and anakinra, also showed HTE in COVID-19 [7,8,9]. At present, however, subphenotypes have limited use in clinical practice, partly because assigning them is complex. Using routinely available clinical variables reduces this problem. Future studies investigating immunomodulatory treatment in COVID-19 might benefit from patient enrichment to identify those with a better treatment response. Moreover, patient enrichment can identify subgroups that may experience adverse effects and can reduce costs by avoiding prescription in patients unlikely to benefit from treatment.

In this post hoc analysis of a randomised controlled trial, we aimed to investigate heterogeneity in the treatment effect and adverse events of vilobelimab in critically ill COVID-19 patients. Routinely measured clinical data were used to identify classes and to assign patients to known subtypes. We postulated that clusters and subtypes would exhibit differential treatment effects based on differences in inflammation between them.

Methods

Study design and patient selection

This study was conducted as a secondary analysis of the PANAMO trial (NCT04333420) [1]. From October 1, 2020 to October 4, 2021, 369 critically ill COVID-19 patients were included from 46 hospitals in Europe, Africa, and North and South America. Inclusion criteria were an age of 18 years or older, invasive mechanical ventilation within 48 h before the first infusion of study medication, a PaO2/FiO2 ratio (PF-ratio) of 60–200 mmHg and a confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in the past 14 days. The complete exclusion criteria can be found in the original report [1]. For the current analysis, 368 patients were included because one patient had been randomised in error. A treatment-emergent adverse event (TEAE) was defined as any adverse event (AE) that occurred or worsened at or after the first infusion, with an AE defined as any untoward medical occurrence in a patient or clinical study patient, temporally associated with the use of the investigational medicinal product (IMP), whether or not considered related to the IMP [1].

Clustering techniques

All available clinical variables, such as vital signs and hematology, coagulation and chemistry laboratory measurements, were collected at baseline and used as input for unsupervised learning (Supplementary Table 1). The three techniques used were: (1) latent class analysis (LCA), (2) Ward’s hierarchical clustering (HC) and (3) adjudication to clinical subtypes previously described in patients presenting to the ED with sepsis (SENECA subtypes, Supplementary Table 6) [4]. LCA was conducted using the R package ‘flexmix’. Model design and LCA followed the steps and considerations outlined by Sinha et al. [10]; see the supplementary methods for more details. Next, Ward’s HC was performed using Monte Carlo reference-based consensus clustering (M3C) [11]. M3C computes a Monte Carlo p-value and a Beta distribution p-value to test against the null hypothesis that the cohort is homogeneous and thus contains no (statistical) clusters. The Relative Cluster Stability Index served as an additional criterion. If significant classes were found, HTE was analyzed. Lastly, clinical variables were used to identify the subtypes alpha (α), beta (β), gamma (γ), and delta (δ) using the SENECA approach [4] as previously described [12]. In short, all available variables (16 of the original 29 variables, Supplementary Table 6) were log-transformed (if needed), scaled and centered, and each patient was assigned to the subtype at the smallest Euclidean distance. For variables that were not completely missing, missing values were imputed using multivariate imputation by chained equations (MICE) for all three techniques (Supplementary methods).
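The subtype adjudication step can be illustrated with a minimal sketch. It assumes a nearest-centroid rule on standardised variables, restricted to the variables available for a patient. The variable names, reference means and standard deviations, and centroid values below are hypothetical placeholders, not the SENECA parameters or trial data.

```python
import math

def standardise(value, mean, sd, log_transform=False):
    """Optionally log-transform, then centre and scale one measurement.
    When log_transform is True, mean and sd are assumed to be on the log scale."""
    if log_transform:
        value = math.log(value)
    return (value - mean) / sd

def assign_subtype(patient, reference, centroids):
    """Return (nearest subtype, distances) using Euclidean distance.

    patient:   {variable: raw value}; unavailable variables are simply absent
    reference: {variable: (mean, sd, log_transform)}  (hypothetical values)
    centroids: {subtype: {variable: standardised centroid}}  (hypothetical values)
    """
    shared = [v for v in reference if v in patient]
    z = {v: standardise(patient[v], *reference[v]) for v in shared}
    distances = {
        subtype: math.sqrt(sum((z[v] - centre[v]) ** 2 for v in shared))
        for subtype, centre in centroids.items()
    }
    return min(distances, key=distances.get), distances

# Hypothetical two-variable illustration (not the 16 variables used in the study).
reference = {"crp": (4.0, 1.0, True), "sbp": (120.0, 20.0, False)}
centroids = {
    "alpha": {"crp": -0.5, "sbp": 0.3},
    "delta": {"crp": 1.0, "sbp": -0.8},
}
patient = {"crp": 200.0, "sbp": 100.0}
subtype, distances = assign_subtype(patient, reference, centroids)
```

Restricting the distance to the shared variables mirrors the situation in which only part of the original variable set is available, at the cost of a coarser assignment.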

Statistical analysis

Patient characteristics and outcomes were compared between clusters using a t-test or one-way ANOVA for parametric data, a Mann–Whitney U test or Kruskal–Wallis test for nonparametric data, and a Chi-square test for categorical data. The primary outcome was all-cause mortality at 28 days; the secondary outcome was all-cause mortality at 60 days. To analyse the association between vilobelimab and mortality across the different clinical clusters, patients were categorized by randomization arm (vilobelimab or placebo). Heterogeneity of treatment effect was assessed by testing the interaction term between treatment arm and cluster in a model of mortality, using Cox regression if the proportional hazards assumption was met and logistic regression otherwise. To adjust for confounding, age and sex were included in the analysis. Survival was visualized using Kaplan–Meier curves. A p value of < 0.05 was considered statistically significant.
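For reference, the logistic form of the interaction analysis can be written as follows. This is a sketch under the assumption that the interaction is between treatment arm and cluster membership; the symbols are introduced here for illustration and do not appear in the original report. $Y_i$ denotes death by day 28 (or day 60), $T_i$ the treatment arm (1 = vilobelimab), and $C_{ik}$ an indicator for membership of cluster $k$, with cluster 1 as the reference.

```latex
\operatorname{logit}\,\Pr(Y_i = 1)
  = \beta_0
  + \beta_1 T_i
  + \sum_{k=2}^{K} \gamma_k C_{ik}
  + \sum_{k=2}^{K} \theta_k \,(T_i \times C_{ik})
  + \beta_2\,\mathrm{age}_i
  + \beta_3\,\mathrm{sex}_i ,
\qquad
H_0:\ \theta_2 = \dots = \theta_K = 0 .
```

Under this formulation, rejecting $H_0$ indicates that the treatment effect differs between clusters, and $\exp(\beta_1 + \theta_k)$ is the adjusted odds ratio for vilobelimab versus placebo within cluster $k$.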

Results

After randomization, 368 patients received vilobelimab (n = 177) or placebo (n = 191). The median age was 58 years (IQR 47–68) and 252 (68%) were male. All-cause mortality at 28 days was 31% in the vilobelimab group and 40% in the placebo group (HR 0.73 [0.50–1.06], p = 0.094) [1].

Latent class analysis

After excluding correlated variables (hematocrit, neutrophils and red blood cell count, Supplementary Fig. 1), 20 variables were used as input for LCA. Based on multiple indices, a 2-class latent model was deemed most suitable (Supplementary Table 2). In the 2-class LCA model, 82 (22%) patients were assigned to class 1 and 286 (78%) to class 2 (Table 1). Class assignment was highly consistent across imputation sets (94.6–98.1% agreement). Class 1 comprised more severely ill patients, reflected by, among other variables, higher creatinine (120 vs. 77 μmol/L, p < 0.001) and bilirubin (12 vs. 8 μmol/L, p < 0.001) and lower systolic blood pressure (113 vs. 120 mmHg, p = 0.003) and PF-ratio (78 vs. 113, p = 0.001). Mortality was significantly higher in class 1 than in class 2 (28-day mortality 50 vs. 32%, p = 0.003, Table 1). Because the proportional hazards assumption was not met, logistic regression adjusted for age and sex was used; no HTE between classes was observed for 28-day mortality (p = 0.998, Fig. 1B) or 60-day mortality (p = 0.853). When comparing related TEAE, any or severe, no significant differences were found between the two classes (p = 1.000 and p = 1.000). The 2-class LCA model showed no clear overlap with the SENECA subtypes, except for a higher proportion of patients with the δ subtype in class 1 (Supplementary Table 3). The 4-class latent model was also assessed because of its higher entropy and significant LMR-LRT p-value, but it showed no HTE (p = 0.128–0.135) and no overlap with the SENECA subtypes (p = 0.411, Supplementary Table 4).
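The entropy criterion mentioned for model selection is commonly computed as a normalised entropy from the posterior class-membership probabilities. The sketch below assumes that conventional definition, which the text does not state explicitly; the function name and the example probabilities are illustrative only.

```python
import math

def relative_entropy(posteriors):
    """Normalised entropy of a latent class model.

    posteriors: one list per patient with that patient's posterior probability
                of belonging to each class (each row sums to 1).
    Returns a value in [0, 1]; values near 1 indicate well-separated classes.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # the limit of p * log(p) as p -> 0 is 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Illustrative posteriors for three patients in a 2-class model (not trial data).
example = [[0.95, 0.05], [0.10, 0.90], [0.60, 0.40]]
separation = relative_entropy(example)
```

Patients with near-equal posterior probabilities, such as the third row in the example, pull the index down.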

Table 1 Baseline characteristics and outcome of LCA classes and clinical sepsis phenotypes
Fig. 1 Heterogeneity of treatment effect using different cluster techniques. A Profile plot of the two classes identified by LCA using clinical data. All variables used are plotted on the x-axis, with the y-axis displaying standardized mean differences. B Kaplan–Meier curves for 28-day mortality of the two classes identified by LCA per treatment group. C Heterogeneity of treatment effect of the previously identified clinical sepsis phenotypes, adjusted for age and sex, showing a larger effect of vilobelimab compared to placebo in phenotype δ. Abbreviations: ALT, alanine transaminase; AST, aspartate aminotransferase; BMI, body mass index; CRP, C-reactive protein; LDH, lactate dehydrogenase; MCV, mean corpuscular volume; PF ratio, PaO2/FiO2 ratio; PT, prothrombin time; SBP, systolic blood pressure; SOC, standard of care; WBC, white blood cell count

Ward’s hierarchical clustering

HC, which uses a different algorithm, did not yield significant classes (Supplementary Table 5); therefore, no further analyses were performed.

Adjudication of previously identified clinical subtypes

Using the SENECA subtypes, 41 patients (11%) were adjudicated to subtype α, 17 (5%) to β, 198 (54%) to γ and 112 (30%) to δ (Table 1, Supplementary Table 6). In line with previous reports, the δ subtype was most severely ill, with the highest aspartate aminotransferase (AST) (62, vs α 51, β 35 and γ 39, p < 0.001) and C-reactive protein (CRP) (115, vs α 35, β 109 and γ 107, p < 0.001, Table 1). Neither 28-day nor 60-day mortality differed between the four subtypes (p = 0.485 and p = 0.836). Logistic regression, adjusted for age and sex, detected HTE for 28-day mortality (p = 0.001, Fig. 1C) and 60-day mortality (p = 0.006). Treatment with vilobelimab in the δ subtype was associated with lower 28-day mortality (OR 0.17 (95% CI 0.07–0.40); p < 0.001) and 60-day mortality (OR 0.21 (95% CI 0.09–0.48); p < 0.001). Of note, no signal for harm or benefit of vilobelimab was seen in any other clinical subtype (p = 0.115–0.790). When comparing related TEAE, any or severe, no significant differences were found between the four subtypes (p = 0.685 and p = 0.796).
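To show how an odds ratio and its 95% confidence interval are read, the sketch below computes an unadjusted odds ratio from a 2×2 table using the standard Wald (Woolf) interval on the log scale. The ORs reported above come from logistic regression adjusted for age and sex, so this unadjusted calculation is only an approximation of that approach, and the counts are hypothetical, not trial data.

```python
import math

def odds_ratio_ci(deaths_trt, survivors_trt, deaths_ctl, survivors_ctl, z=1.96):
    """Unadjusted odds ratio of death (treatment vs control) with a Wald CI.

    All four cell counts must be greater than zero.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    odds_ratio = (deaths_trt * survivors_ctl) / (survivors_trt * deaths_ctl)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / deaths_trt + 1 / survivors_trt + 1 / deaths_ctl + 1 / survivors_ctl
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 10 deaths / 40 survivors on treatment,
# 20 deaths / 30 survivors on control.
or_hat, ci_low, ci_high = odds_ratio_ci(10, 40, 20, 30)
```

An interval that lies entirely below 1, as in this hypothetical example, corresponds to lower odds of death in the treatment group at the 5% significance level.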

Discussion

In this secondary analysis of a phase 3 randomized trial, the treatment effect of vilobelimab was consistent across different classes and subtypes in critically ill COVID-19 patients, except for a strong effect in the δ subtype. These data suggest a potential benefit for the most severely ill patients, with no signal of more adverse events from vilobelimab in any subtype of critically ill COVID-19 patients.

Our results extend the pre-specified subgroup analysis in the PANAMO trial [1], which stratified patients by World Health Organisation (WHO) severity score. The treatment effect was most apparent in the most severely ill patients: those with WHO severity score 7 and those in the δ subtype. In similar work evaluating HTE of immunomodulation in COVID-19 patients, blood immune endotypes derived from whole-blood mRNA also showed HTE for anakinra [13]. Not all approaches are the same, however, as no HTE was observed in newly developed subphenotypes using LCA or hierarchical clustering in our analysis. The results of this study show that, based on clinical variables, these patients are quite homogeneous. This is in line with previous results based on plasma biomarkers [14]. Surprisingly, HTE was present in phenotypes derived before the existence of COVID-19. First, this means that studies may miss HTE when deriving clusters from their own data, possibly because the data do not contain enough information. Second, it highlights that using previously described phenotypes can be helpful and that phenotypes can be identified in diseases or syndromes other than that of the original population. Overall, this emphasizes that personalized medicine is important in COVID-19 patients, but that cluster analysis and the development of new phenotypes are not always necessary in an already quite homogeneous group of patients.

This study has several strengths and limitations. As for strengths, first, the use of randomised group allocation minimises selection bias. Second, the use of two different forms of unsupervised learning makes our findings more robust. As for limitations, first, the sample size in some of the classes and subtypes was too small to make reliable inferences. Second, 13 of the 29 variables needed for the adjudication of the previously identified clinical sepsis subtypes were not available in this cohort; however, the distribution of the subtypes was in line with a previous study applying these subtypes in COVID-19 patients with similar variables [5]. Third, the LCA model did not reach an entropy of 0.8, indicating modest class separation. Fourth, no analysis of functional outcomes or of long-term mortality beyond 60 days was performed.

In conclusion, the treatment effect of vilobelimab was consistent across different classes and phenotypes in critically ill COVID-19 patients, except for the δ subtype, where benefit may be present for the most severely ill patients.