Dear Editor,

In their recent article in Advances in Therapy, Dabbous et al. [1] reported the results of a number needed to treat (NNT) analysis comparing the efficacy of nusinersen (an antisense oligonucleotide therapy) and AVXS-101 (onasemnogene abeparvovec-xioi; a viral vector-mediated gene replacement therapy) in studies conducted in infants with spinal muscular atrophy (SMA) type 1. SMA is a progressive neuromuscular disease with an incidence of approximately 1 in 10,000 live births that results in muscle weakness and atrophy [2,3,4]. Type 1 is its most severe form, with an age of onset of 6 months or less and a median life expectancy of about 2 years without respiratory support [5,6,7]. Nusinersen was approved by the Food and Drug Administration (FDA) in December 2016 for the treatment of SMA in pediatric and adult patients [8]; AVXS-101 was more recently approved by the FDA for the treatment of pediatric patients less than 2 years of age with SMA [9]. In their paper, Dabbous et al. used NNT to compare the efficacy of nusinersen and AVXS-101 on several treatment outcomes using data from separate trials, ENDEAR (NCT02193074) [10] and CL-101 (NCT02122952) [11], respectively. They suggested that AVXS-101 has an efficacy advantage over nusinersen because they found that fewer patients would need to be treated with AVXS-101 than with nusinersen to show clinical benefit. However, as outlined below, several major shortcomings of this study limit the conclusions that can be drawn. We write to express our concern that misleading conclusions may affect the treatment decisions made by healthcare professionals, payers, and patients with SMA and their families.

NNT is commonly and appropriately used to describe the benefit of a particular treatment in a specific clinical trial [12, 13]. However, NNT can only give a valid comparison of two treatments in separate trials if the trial populations have similar baseline risk and if the same outcomes are assessed over the same time period. Dabbous et al. conducted an unanchored comparison (i.e., one in which there is no common treatment group) and did not adjust for confounders or for differences in study design.
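To illustrate why baseline risk matters, the following minimal sketch computes NNT as the reciprocal of the absolute risk reduction for two hypothetical populations. The event rates and the 50% relative risk are illustrative assumptions only; they are not taken from ENDEAR, CL-101, or Dabbous et al., and the function name nnt is our own.

```python
def nnt(control_event_rate: float, treated_event_rate: float) -> float:
    """Return the number needed to treat: 1 / absolute risk reduction.

    Rates are proportions of patients experiencing the unfavorable outcome.
    """
    absolute_risk_reduction = control_event_rate - treated_event_rate
    if absolute_risk_reduction <= 0:
        raise ValueError("NNT is undefined without an absolute risk reduction")
    return 1.0 / absolute_risk_reduction


# Hypothetical assumption: the same treatment halves the event rate
# in both populations (identical relative effect).
RELATIVE_RISK = 0.5

# Hypothetical baseline (untreated) event rates for two trial populations.
for baseline_risk in (0.8, 0.4):
    treated_risk = baseline_risk * RELATIVE_RISK
    print(f"baseline risk {baseline_risk:.0%}: "
          f"NNT = {nnt(baseline_risk, treated_risk):.1f}")
```

Under these assumed numbers, an identical relative treatment effect yields an NNT of 2.5 in the higher-risk population and 5.0 in the lower-risk population. A difference in NNT between separate trials can therefore reflect a difference in the populations enrolled rather than a difference between the treatments.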

ENDEAR was a multinational, multisite, randomized, double-blind, sham-controlled study (not a case-controlled study as stated in Table 1 of Dabbous et al.) that enrolled 121 patients (80 patients treated with nusinersen), whereas CL-101 was a single-site, open-label study that enrolled 15 patients; 12 patients in a high-dose cohort were used in the NNT analysis. Both trials enrolled infants most likely to develop SMA type 1; however, Dabbous et al. themselves point out several differences between CL-101 and ENDEAR in study population and trial design that should be considered when assessing their results and that, taken together, could have a significant effect on the validity of their conclusions. Several of these differences were previously highlighted in a Letter to the Editor of The New England Journal of Medicine describing the challenges of comparing separately conducted clinical trials [14]. For example, both ENDEAR [10] and CL-101 [11] demonstrated that patients treated at a younger age have significantly better clinical outcomes than those treated at a later age, and participants in CL-101 (mean age at first dose, 3.4 months) were younger at first dose than those in ENDEAR (mean age at first dose, 5.4 months). In fact, 10 of 12 CL-101 patients (83.3%) were younger at first dose than the ENDEAR mean age at first dose. Similarly, patients in ENDEAR had a longer mean disease duration (3.6 months) than those in CL-101 (estimated to be 2.0 months), and both ENDEAR and CL-101 showed that patients with shorter disease duration generally had better clinical outcomes than those with longer disease duration. In addition, the mean motor function score at baseline (as assessed by CHOP-INTEND) in ENDEAR was lower than that in CL-101, suggesting that the ENDEAR patients were weaker at baseline, and 2 of 12 patients in CL-101 had baseline scores above the range expected for symptomatic type 1 patients [15], giving those patients more opportunity to respond.

The difference in outcomes between the two trials was also very likely influenced by the greater disease burden of the nusinersen-treated patients, exemplified by the higher proportion of patients enrolled in ENDEAR who required respiratory support at baseline (26%) compared to those in CL-101 (17%). In addition, patients in CL-101, but not in ENDEAR, who were found to have swallowing difficulties at screening were required to receive surgery for placement of a gastrostomy or nasogastric tube. Therefore, patients in ENDEAR were more likely to have swallowing difficulties resulting in aspiration, possibly affecting study outcomes.

Taken together, we propose that these differences invalidate the use of NNT by Dabbous et al. to compare the effectiveness of nusinersen and AVXS-101 in SMA type 1. A similar conclusion was recently reached by The Institute for Clinical and Economic Review (ICER): “Differences in trial populations related to age at treatment initiation and disease duration limit our ability to adequately distinguish the net health benefit of investigational AVXS-101 versus Spinraza for infantile-onset SMA. We therefore rate the evidence to be insufficient” [16].

It is well established that randomized, head-to-head clinical trials are the gold standard for obtaining estimates of comparative efficacy. However, such trials are seldom performed, and healthcare decision-makers must often rely on indirect comparisons of clinical trials. Using an unanchored indirect analysis of NNT to compare the efficacy or effectiveness of two therapies will lead to biased conclusions unless appropriate adjustments are made for differences in study design and patient characteristics. A valid comparison of the efficacy of nusinersen and AVXS-101 across trials in SMA patients therefore requires statistical adjustment for patient and study differences. Most importantly, we owe it to healthcare professionals, payers, patients, and caregivers to provide valid, high-quality, unbiased treatment comparisons to enable them to make well-informed treatment decisions.