Key Summary Points

Why carry out this study?

Understanding the potential differences in the long-term efficacy and safety of disease-modifying treatments for spinal muscular atrophy (SMA) is important for treatment decision-making.

In the absence of head-to-head trials, indirect treatment comparisons (ITCs) may provide information on the relative efficacy and safety of SMA treatments.

This study conducted a matching-adjusted indirect comparison (MAIC) with assessment schedule matching to evaluate survival, motor function, and safety outcomes of risdiplam (FIREFISH) compared with nusinersen (SHINE-ENDEAR) in children with type 1 SMA over 36 months of treatment.

What was learned from the study?

Results suggested that children treated with risdiplam had improved survival compared with those treated with nusinersen, with a 78% reduction in the rate of death and an 81% reduction in the rate of death or permanent ventilation. These results also suggested that children treated with risdiplam had greater improvements in motor function, with higher rates of achieving responses on the HINE-2 and the CHOP-INTEND scales (45% and 186% higher rates, respectively), and a longer time to the first serious adverse event compared with children treated with nusinersen.

These findings suggest that children with type 1 SMA treated with risdiplam may see greater improvements compared with children treated with nusinersen over at least 36 months of follow-up.

The results of these analyses support risdiplam as a superior alternative to nusinersen in children with type 1 SMA.

Digital Features

This article is published with digital features, including a video to facilitate understanding of the article. To view digital features for this article, go to https://doi.org/10.6084/m9.figshare.25398454.

Introduction

Spinal muscular atrophy (SMA) is a severe genetic neuromuscular disease characterized by a loss of motor neurons [1]. Motor neuron degeneration in SMA is caused by insufficient levels of the survival of motor neuron (SMN) protein, due to homozygous deletion of or loss of function mutations within the SMN1 gene [2]. Although a paralogous gene, SMN2, produces low levels of SMN protein [3], these are insufficient to compensate for the loss of SMN1, leading to progressive muscle weakness that affects respiratory function, swallowing, and motor function [1]. Without treatment, children with type 1 SMA fail to achieve major motor milestones, and never achieve the ability to sit independently [4]. As the disease rapidly progresses, these children require increasing levels of ventilatory and bulbar/feeding support [1, 5,6,7,8,9]. Without intervention and lacking ventilatory support, children with type 1 SMA typically succumb to the condition before the age of 2 years [10,11,12,13]. In recent years, the approval of disease-modifying treatments (DMTs) and advances in supportive care practices have changed the natural course of the disease in untreated children with type 1 SMA [13,14,15]; children with type 1 SMA are surviving longer and achieving motor milestones never before seen in natural history cohorts [7, 13].

There are currently three DMTs approved for the treatment of type 1 SMA: risdiplam (EVRYSDI®), an oral SMN2 splicing modifier; nusinersen (SPINRAZA®), an intrathecally administered SMN2-targeting antisense oligonucleotide; and onasemnogene abeparvovec (ZOLGENSMA®), an intravenously administered gene therapy [16,17,18,19,20,21]. Although efficacy and safety have been demonstrated in type 1 SMA for these DMTs independently [22,23,24,25,26], there are currently no head-to-head trials comparing available SMA treatments. Indirect treatment comparisons (ITCs), which compare treatments across individual clinical trials, are therefore needed to provide information on the relative efficacy and safety of SMA treatments for healthcare decision-making [27].

Many factors can affect clinical trial outcomes, including baseline characteristics such as age, genetic factors, and disease severity [28], which may differ between trials as a result of variations in recruitment criteria. There may also be differences in the timings of assessments. In ITCs, cross-trial differences may lead to bias if analyses are left unadjusted [27]. Consequently, ITCs require the use of population-adjustment methodologies that account for cross-trial differences in population baseline characteristics to reduce potential bias in relative effect estimates. One such adjustment methodology is matching-adjusted indirect comparison (MAIC) [29,30,31].

Previous ITCs based on 12- and 24-month clinical trial data indicated more favorable results for efficacy and safety outcomes with risdiplam compared with nusinersen in type 1 SMA [32, 33]. MAIC was not feasible in a comparison of risdiplam and nusinersen at 12 months in patients with types 2 and 3 SMA as a result of limited overlap between patient populations [32].

The aim of this study was to conduct an updated ITC comparing treatment outcomes of risdiplam versus nusinersen in type 1 SMA after at least 36 months of follow-up.

Methods

Study Populations and Data Sources

A systematic literature review (SLR) was conducted across several bibliographic databases (Embase®, MEDLINE®, and Cochrane Central) from database inception (prior to 1966) to June 8, 2021, using a previously reported search strategy and inclusion criteria [32]. The SLR was reported according to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (Fig. S1). The identified clinical trials on type 1 SMA are indicated in Table S1. At the time of performing these analyses, no additional clinical trials in patients with type 1 SMA had been published since the original SLR for the ITC conducted at 12 months [32].

Data for the comparison of risdiplam and nusinersen in type 1 SMA were obtained from three clinical trials (FIREFISH, ENDEAR, and SHINE). FIREFISH (NCT02913482) was a phase 2/3, single-arm, open-label, two-part study of risdiplam in children aged 1–7 months at enrollment [34]. FIREFISH Part 1 was a dose-finding study; Part 2 was the confirmatory study using the dose selected in Part 1. FIREFISH Parts 1 and 2 had distinct patient populations; patients in Part 1 did not participate in Part 2. This ITC included data from 17 patients from the high-dose (pivotal dose) cohort from FIREFISH Part 1 and all 41 patients from Part 2. ENDEAR (NCT02193074) was a phase 3, randomized, double-blind, multicenter, sham procedure-controlled study in children aged 1–7 months at enrollment [35]. SHINE (NCT02594124) was an open-label extension study that enrolled patients from ENDEAR (and from CS3A [NCT01839656], CS12 [NCT02052791], CHERISH [NCT02292537], and EMBRACE [NCT02462759]) [36]. Data were extracted only from children in SHINE who participated in ENDEAR.

Individual patient data (IPD) from children treated with risdiplam were obtained from the FIREFISH trial (provided by the sponsor) [34]. Aggregate comparator data from children treated with nusinersen in SHINE-ENDEAR were obtained from the publicly available submission dossier to the Federal Joint Committee (Gemeinsamer Bundesausschuss; the highest decision-making body of physicians, dentists, hospitals, and health insurance funds) in Germany [37]. Pseudo IPD for each time-to-event outcome were generated by digitizing the Kaplan–Meier curves available in the submission dossier [38].

Endpoints

The following survival outcomes were included in the analyses: overall survival, defined as time to death, and event-free survival, defined as time to death or need for permanent ventilation, whichever occurred first. In FIREFISH, permanent ventilation was defined as ≥ 16 h of non-invasive ventilation/day or intubation for > 21 consecutive days in the absence of, or following the resolution of, an acute reversible event or tracheostomy. In SHINE-ENDEAR, permanent ventilation was defined as either tracheostomy or ≥ 16 h of ventilation/day continuously for > 21 days in the absence of an acute reversible event.
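As a minimal sketch of how the composite event-free survival endpoint described above can be derived from patient-level records, the Python helper below takes the earlier of death and permanent ventilation as the event time and censors at last follow-up otherwise. The function name, argument names, and example values are hypothetical and are not taken from the trial datasets.

```python
def event_free_survival(death_month, ventilation_month, last_followup_month):
    """Return (time, event_observed) for the composite endpoint.

    death_month / ventilation_month are None when the event did not occur.
    The event time is whichever of the two occurred first; a child with
    neither event is censored at the last follow-up.
    """
    event_times = [t for t in (death_month, ventilation_month) if t is not None]
    if event_times:
        return min(event_times), True
    return last_followup_month, False


# Hypothetical records: (death, permanent ventilation, last follow-up), in months.
records = [(30.0, 12.0, 30.0), (None, None, 36.0), (8.0, None, 8.0)]
efs = [event_free_survival(*r) for r in records]
```

Overall survival follows the same pattern with the ventilation argument left as None, so a single helper can serve both survival endpoints.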

Motor function outcomes included in the analyses were time to achievement of a Hammersmith Infant Neurological Examination, Module 2 (HINE-2) response [defined as a ≥ 2-point increase in the ability to kick (or achievement of the maximal score in that category), or a ≥ 1-point increase in head control, rolling, sitting, crawling, standing, or walking] and time to achievement of a Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders (CHOP-INTEND) response (defined as a ≥ 4-point improvement in CHOP-INTEND score from baseline).

Relative safety was evaluated according to the time of reporting the first serious adverse event (SAE), defined as any adverse event that is fatal, life-threatening, requires or prolongs inpatient hospitalization, results in persistent or significant disability/incapacity, or is a significant medical event in the investigator’s judgment.

Endpoint definitions in FIREFISH and SHINE-ENDEAR are provided in Table S2.

Statistical Analysis Methods

The statistical analysis plan (SAP) is provided in the Supplementary SAP.

Primary Analysis: MAIC

Because FIREFISH and SHINE-ENDEAR lacked a common comparator and IPD were available from FIREFISH, an unanchored MAIC was performed using well-accepted methodology [29, 30, 39].

Unanchored MAICs (referred to as MAIC for the remainder of this paper) attempt to balance prognostic factors (i.e., factors that affect the natural course of a disease) and effect modifiers (i.e., factors that affect the efficacy of a given treatment) to enable unbiased estimation of comparative effectiveness [40]. Prognostic factors and effect modifiers for type 1 SMA were identified from a previous literature review [28]. Matching baseline characteristics were determined on the basis of these identified prognostic factors and effect modifiers, the baseline characteristics available for comparison across FIREFISH and SHINE-ENDEAR, and feedback from medical experts.

Baseline characteristics selected as adjustment factors were mean age at first dose, disease duration at baseline, and baseline CHOP-INTEND score as an indicator of motor function.

Weighting of the IPD from FIREFISH was based on the baseline characteristics of the treated population in ENDEAR (N = 80) [35], because the SHINE cohort (N = 81) included one additional child who was never dosed in ENDEAR but participated in SHINE, and baseline data were not available for this individual. Weights were estimated via a logistic regression model of the probability of being enrolled into FIREFISH given the covariates of SHINE-ENDEAR [40], so that after weighting the populations were balanced on the adjustment factors. The MAIC was bootstrapped (N = 1000 samples), and the uncertainty in the estimated weights was reflected in 95% confidence intervals (CIs). Outcomes from the FIREFISH data were then recalculated using the estimated weights. Hazard ratios (HRs) were estimated using Cox proportional-hazards models, which accommodate differences in follow-up time across studies and allow comparisons over time, and adjusted Kaplan–Meier curves were generated. In these comparisons, 95% CIs that did not span 1 indicated a statistically significant difference.
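The weighting step can be illustrated with a short Python sketch. It assumes the method-of-moments formulation that is commonly used for MAIC weights, in which each weight is exp(z·β) and z is a patient's covariate vector centered on the comparator's reported means; this may differ in detail from the implementation used in this study. The eight patients and the target means below are invented for illustration only.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def maic_weights(ipd, target_means, max_iter=100, tol=1e-12):
    """Weights w_i = exp(z_i . beta) whose weighted covariate means hit the targets."""
    k = len(target_means)
    # Center every covariate on the comparator's reported mean.
    z = [[row[j] - target_means[j] for j in range(k)] for row in ipd]

    def weights(beta):
        return [math.exp(sum(b * v for b, v in zip(beta, row))) for row in z]

    beta = [0.0] * k
    for _ in range(max_iter):
        w = weights(beta)
        obj = sum(w)  # convex objective; its gradient is zero when means match
        grad = [sum(wi * row[j] for wi, row in zip(w, z)) for j in range(k)]
        hess = [[sum(wi * row[p] * row[q] for wi, row in zip(w, z))
                 for q in range(k)] for p in range(k)]
        step = solve(hess, grad)
        scale = 1.0  # damped Newton: halve the step if the objective increases
        while scale > 1e-8 and sum(
                weights([b - scale * s for b, s in zip(beta, step)])) > obj * (1 + 1e-12):
            scale /= 2
        beta = [b - scale * s for b, s in zip(beta, step)]
        if max(abs(scale * s) for s in step) < tol:
            break
    return weights(beta)


# Invented IPD rows: (age at first dose, disease duration, CHOP-INTEND score).
ipd = [(3.0, 1.5, 28.0), (4.0, 2.0, 20.0), (5.0, 3.0, 24.0), (5.5, 2.5, 26.0),
       (6.0, 4.0, 30.0), (6.5, 3.5, 18.0), (7.0, 5.0, 22.0), (4.5, 1.0, 32.0)]
targets = (5.3, 3.0, 26.0)  # invented aggregate means for the comparator
w = maic_weights(ipd, targets)
weighted_means = [sum(wi * row[j] for wi, row in zip(w, ipd)) / sum(w) for j in range(3)]
```

After weighting, `weighted_means` reproduces `targets` up to numerical tolerance, which is the balance property the paragraph above describes. Bootstrapping would repeat this fit on resampled IPD rows; that step is omitted here for brevity.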

Proportional hazard assumption testing was also performed independently for each time-to-event analysis.

Unadjusted comparisons were also conducted.
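To make the relationship between the adjusted and unadjusted Kaplan–Meier curves concrete, the following sketch computes a weighted product-limit estimate in plain Python. It is a simplified illustration, not the study code: the function name and the toy data are hypothetical, and passing no weights gives the ordinary unadjusted curve.

```python
def weighted_km(times, events, weights=None):
    """Weighted Kaplan-Meier curve as a list of (time, survival) pairs.

    times   : follow-up time for each patient
    events  : True if the event was observed, False if censored
    weights : per-patient weights (e.g. MAIC weights); unit weights if None
    """
    if weights is None:
        weights = [1.0] * len(times)
    curve = []
    surv = 1.0
    # Survival only changes at times where at least one event was observed.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        failed = sum(w for ti, e, w in zip(times, events, weights) if ti == t and e)
        surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


# Toy data in months; the second patient is censored at month 3.
times = [2, 3, 3, 5]
events = [True, False, True, True]
unadjusted = weighted_km(times, events)
adjusted = weighted_km(times, events, weights=[0.5, 1.5, 1.0, 1.0])
```

With unit weights the sums reduce to simple counts of patients at risk and events, so the same routine covers both the unadjusted and the weight-adjusted comparison.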

Scenario Analyses: MAIC Including Additional Adjustment Factors

Analyses were conducted to evaluate the extent to which baseline characteristics other than those included in the primary analysis affected the FIREFISH effective sample size (ESS) and the HR estimates of the outcomes. These scenario analyses included, in addition to mean age at first dose, disease duration, and CHOP-INTEND score, the following adjustment factors (mean values at baseline): ventilatory support, nutritional support, both ventilatory and nutritional support, gender, HINE-2 score, and both gender and HINE-2 score.

Assessment Schedule Matching

In FIREFISH and SHINE-ENDEAR, motor function (HINE-2 and CHOP-INTEND) was evaluated only at scheduled visits. In the context of ITCs, this may lead to bias in favor of the trial with earlier or more frequently scheduled visits, as improvements could be documented earlier. SHINE-ENDEAR had earlier and more frequent HINE-2 assessments than FIREFISH, and FIREFISH had more frequent CHOP-INTEND assessments than SHINE-ENDEAR (Fig. S2).

Differences in the timing of scheduled visits for the assessment of motor function outcomes were adjusted for using assessment schedule matching (ASM) methodology [41]. This enabled the redistribution of IPD from FIREFISH for alignment with the scheduled visits of SHINE-ENDEAR. The adjustment of the FIREFISH visits is described in Fig. S2. Full details of the ASM methodology are provided in the Supplementary SAP.
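The idea behind aligning assessment schedules can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the ASM method of reference [41] or the Supplementary SAP: it only re-times each observed response to the first comparator visit on or after it, treating responses after the last comparator visit as unobserved. The function name and the visit days are hypothetical.

```python
from bisect import bisect_left


def align_to_schedule(observed_days, comparator_visit_days):
    """Re-time each observed response to the first comparator visit on or after it.

    Returns None for a response that falls after the last comparator visit,
    meaning it could not have been recorded on that schedule.
    """
    visits = sorted(comparator_visit_days)
    aligned = []
    for day in observed_days:
        i = bisect_left(visits, day)  # index of first visit >= day
        aligned.append(visits[i] if i < len(visits) else None)
    return aligned


# Hypothetical response days in one trial and visit days in the comparator trial.
aligned = align_to_schedule([50, 183, 200, 400], [64, 183, 302])
```

Under this simplification, a response seen early at a frequently scheduled visit is credited no earlier than it could have been documented on the comparator's schedule, which is the source of bias the paragraph above describes.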

Curve-fitting analyses were conducted to define the best-fit model of the relationship between scheduled visits for HINE-2 and CHOP-INTEND assessments across the two trials (Fig. S3). A log-normal model was selected for use in the HINE-2 and CHOP-INTEND analyses. MAICs in combination with ASM were then conducted to determine the extent to which differences in scheduled visit timings between FIREFISH and SHINE-ENDEAR affected the MAIC results for motor function assessments.

This methodology was enhanced compared with that of the 12- and 24-month analyses [32, 42], which were conducted only on contemporaneous visits that could be aligned between FIREFISH and SHINE-ENDEAR.

Ethical Approval

This study did not require ethical approval as data were obtained from published studies [23, 24, 37]. FIREFISH and SHINE-ENDEAR received institutional review board approval and were conducted in accordance with the principles of the Declaration of Helsinki. Informed consent was obtained for patient participation in these studies. The data published in this article are owned by F. Hoffmann-La Roche. Roche has the necessary permissions to share them with readers upon reasonable request.

Results

IPD from ≥ 36 months of follow-up were available from a pooled dataset of 58 children from FIREFISH who received the pivotal dose of risdiplam (Part 1, n = 17; Part 2, n = 41). Aggregate data from ≥ 43 months of follow-up were available from 81 children in SHINE-ENDEAR who had received nusinersen [37].

Pre-matching Characteristics of FIREFISH and SHINE-ENDEAR Populations

FIREFISH and SHINE-ENDEAR had similar eligibility criteria (Table S3) and thus enrolled patients with relatively comparable disease burden, including pulmonary burden; neither trial would have included patients with invasive ventilation, awake non-invasive ventilation, hypoxemia, or active pulmonary hospitalization/infection at baseline. Pre-matching, the FIREFISH and SHINE-ENDEAR populations were relatively similar in mean age at first dose, disease duration, age at symptom onset, and age at diagnosis (Table 1). However, children in FIREFISH had on average lower baseline HINE-2 (0.9 vs. 1.3) and CHOP-INTEND scores (22.5 vs. 26.6). The need for ventilatory support was slightly higher in FIREFISH (29.3%) compared with SHINE-ENDEAR (26.0%), but the level of nutritional support was almost identical (8.6% and 8.8%, respectively).

Table 1 Comparison of baseline characteristics of FIREFISH and SHINE-ENDEAR both pre- and post-matching

Post-matching Baseline Characteristics

Matching of the FIREFISH IPD to the SHINE-ENDEAR aggregate data was successful. The distribution of rescaled weights is shown in Fig. S4 and Table S4, and the post-matching distributions of rescaled weights for the adjustment factors in FIREFISH are shown in Fig. S5. The analysis of rescaled weights highlighted large weights associated with a small number of children, and small weights (approaching zero) associated with even fewer children, which indicated a favorable overall matching process.

Post-matching, mean age at first dose, disease duration, and CHOP-INTEND score were balanced (Table 1). FIREFISH ESS was 40.6 (30% lower than the population enrolled in FIREFISH [n = 58]).
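The effective sample size can be illustrated with the formula conventionally used for weighted samples, ESS = (Σw)² / Σw². The source does not state which formula was applied, so the sketch below is an assumption; the function name is hypothetical, and the only study values used are the reported ESS of 40.6 and the enrolled population of 58.

```python
def effective_sample_size(weights):
    """Conventional ESS for a weighted sample: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Equal weights leave the sample size unchanged; unequal weights reduce it.
ess_equal = effective_sample_size([1.0] * 58)
ess_unequal = effective_sample_size([2.0, 1.0, 1.0])

# Relative reduction implied by the reported figures (ESS 40.6 versus n = 58).
reduction_pct = 100.0 * (1.0 - 40.6 / 58)
```

The ESS equals the number of patients only when all weights are equal, so the gap between 58 and 40.6 reflects how unevenly the weights are spread across children.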

When additional adjustment factors were included, the post-matching distributions of weights for the adjustment factors in FIREFISH (Table S4) and the FIREFISH ESS (Table 2) remained similar to those of the primary analysis.

Table 2 Baseline characteristics of FIREFISH post-matching with SHINE-ENDEAR: unadjusted, primary, and scenario analyses results

MAIC (Primary Analysis)

Survival and Permanent Ventilation

The numbers of deaths and of death or permanent ventilation events are provided in Table S5. MAIC suggested that children treated with risdiplam had a 78% reduction in the rate of death compared with children treated with nusinersen (MAIC HR for overall survival [95% CI], 0.22 [0.04–0.47]; unadjusted HR [95% CI], 0.35 [0.13–0.95]; Fig. 1). Furthermore, results suggested that children treated with risdiplam had an 81% reduction in the rate of death or permanent ventilation compared with those treated with nusinersen (MAIC HR for event-free survival [95% CI], 0.19 [0.07–0.35]; unadjusted HR [95% CI], 0.24 [0.12–0.49]; Fig. 2).
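The percentages quoted alongside each hazard ratio follow from simple arithmetic: the relative change in the event rate is (HR − 1) × 100%, which is negative for a reduction and positive for an increase. The short Python sketch below applies this to the MAIC HRs reported in this article; the helper name is hypothetical.

```python
def percent_change_in_rate(hazard_ratio):
    """Relative change in the event rate implied by a hazard ratio, in percent."""
    return (hazard_ratio - 1.0) * 100.0


# MAIC hazard ratios reported in this article (risdiplam versus nusinersen).
reported_hrs = {
    "death": 0.22,
    "death or permanent ventilation": 0.19,
    "HINE-2 response": 1.45,
    "CHOP-INTEND response": 2.86,
    "first SAE": 0.43,
}
changes = {name: round(percent_change_in_rate(hr)) for name, hr in reported_hrs.items()}
```

For the survival and safety endpoints an HR below 1 favors risdiplam (fewer events), whereas for the motor function endpoints an HR above 1 favors risdiplam (responses achieved at a higher rate).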

Fig. 1

Overall survival in patients with type 1 SMA treated with risdiplam and nusinersen (primary analysis). Unadjusted survival data from the pooled FIREFISH cohort (orange line) were plotted alongside the data from SHINE-ENDEAR (blue line). Patient baseline characteristics in FIREFISH were matched to the mean values of the baseline characteristics of the nusinersen arm in SHINE-ENDEAR using MAIC methodology, thus generating risdiplam-adjusted data (purple line). Characteristics that were used as adjustment factors in the primary analysis were age at first dose, disease duration, and baseline CHOP-INTEND score. PH test: p = 0.893, not rejecting the null hypothesis. CHOP-INTEND Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders, MAIC matching-adjusted indirect comparison, PH proportional hazard, SMA spinal muscular atrophy

Fig. 2

Event-free survival in patients with type 1 SMA treated with nusinersen and risdiplam (primary analysis). Event-free survival is defined as alive without the need for permanent ventilation. In FIREFISH, permanent ventilation was defined as ≥ 16 h of NIV/day or intubation for > 21 consecutive days in the absence of, or following the resolution of, an acute reversible event or tracheostomy. In SHINE-ENDEAR, permanent ventilation was defined as either tracheostomy or ≥ 16 h of ventilation/day continuously for > 21 days in the absence of an acute reversible event. Unadjusted survival data from the pooled FIREFISH cohort (orange line) were plotted alongside the data from SHINE-ENDEAR (blue line). Patient baseline characteristics in FIREFISH were matched to the mean values of the baseline characteristics of the nusinersen arm in SHINE-ENDEAR using MAIC methodology, thus generating risdiplam-adjusted data (purple line). Characteristics that were used as adjustment factors in the primary analysis were age at first dose, disease duration, and baseline CHOP-INTEND score. PH test: p = 0.415, not rejecting the null hypothesis. CHOP-INTEND Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders, MAIC matching-adjusted indirect comparison, NIV non-invasive ventilation, PH proportional hazard, SMA spinal muscular atrophy

Motor Function Outcomes

MAIC suggested that, compared with children treated with nusinersen, children treated with risdiplam had a 45% higher rate of achieving a HINE-2 motor milestone response (MAIC HR [95% CI], 1.45 [1.21–1.80]; unadjusted HR [95% CI], 1.15 [0.79–1.69]; Fig. 3) and an approximately threefold (186% higher) rate of achieving a ≥ 4-point improvement on the CHOP-INTEND (MAIC HR [95% CI], 2.86 [2.18–4.48]; unadjusted HR [95% CI], 2.80 [1.92–4.07]; Fig. 4).

Fig. 3

Time to HINE-2 motor milestone response in patients with type 1 SMA treated with risdiplam and nusinersen (primary analysis). Children were classed as HINE-2 responders if more motor milestones showed improvement than worsening. Improvement was defined as a ≥ 2-point increase in the ability to kick (or maximal score) or a ≥ 1-point increase in head control, rolling, sitting, crawling, standing, or walking. Worsening was defined as a ≥ 2-point decrease in the ability to kick (or lowest score) or a ≥ 1-point decrease in head control, rolling, sitting, crawling, standing, or walking. Unadjusted survival data from the pooled FIREFISH cohort (orange line) were plotted alongside the data from SHINE-ENDEAR (blue line). Patient baseline characteristics in FIREFISH were matched to the mean values of the baseline characteristics of the nusinersen arm in SHINE-ENDEAR using MAIC methodology, thus generating risdiplam-adjusted data (purple line). Characteristics that were used as adjustment factors in the primary analysis were age at first dose, disease duration, and baseline CHOP-INTEND score. PH test: p = 0.003, rejecting the null hypothesis. CHOP-INTEND Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders, HINE-2 Hammersmith Infant Neurological Examination, Module 2, MAIC matching-adjusted indirect comparison, PH proportional hazard, SMA spinal muscular atrophy

Fig. 4

Time to CHOP-INTEND response in patients with type 1 SMA treated with risdiplam and nusinersen (primary analysis). A CHOP-INTEND response was defined as a ≥ 4-point improvement in CHOP-INTEND score from baseline. Unadjusted survival data from the pooled FIREFISH cohort (orange line) were plotted alongside the data from SHINE-ENDEAR (blue line). Patient baseline characteristics in FIREFISH were matched to the mean values of the baseline characteristics of the nusinersen arm in SHINE-ENDEAR using MAIC methodology, thus generating risdiplam-adjusted data (purple line). Characteristics that were used as adjustment factors in the primary analysis were age at first dose, disease duration, and baseline CHOP-INTEND score. PH test: p = 0.020, rejecting the null hypothesis. CHOP-INTEND Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders, MAIC matching-adjusted indirect comparison, PH proportional hazard, SMA spinal muscular atrophy

SAEs

MAIC suggested that children treated with risdiplam had a 57% reduction in the rate of SAEs compared with children treated with nusinersen (MAIC HR [95% CI], 0.43 [0.30–0.59]; unadjusted HR [95% CI], 0.45 [0.31–0.66]; Fig. 5).

Fig. 5

Time to first SAE in patients with type 1 SMA treated with nusinersen and risdiplam (primary analysis). Unadjusted survival data from the pooled FIREFISH cohort (orange line) were plotted alongside the data from SHINE-ENDEAR (blue line). Patient baseline characteristics in FIREFISH were matched to the mean values of the baseline characteristics of the nusinersen arm in SHINE-ENDEAR using MAIC methodology, thus generating risdiplam-adjusted data (purple line). Characteristics that were used as adjustment factors in the primary analysis were age at first dose, disease duration, and baseline CHOP-INTEND score. PH test: p = 0.263, not rejecting the null hypothesis. CHOP-INTEND Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders, MAIC matching-adjusted indirect comparison, PH proportional hazard, SAE serious adverse event, SMA spinal muscular atrophy

MAIC (Scenario Analyses)

MAIC including additional adjustment factors provided results consistent with those of the primary analysis, with similar FIREFISH ESS and HR estimates across all the scenarios explored (Tables 2 and 3). Alignment with the primary analysis indicated that the post-matching discrepancies in baseline characteristics between FIREFISH and SHINE-ENDEAR may be attributed to the inherent variability in CIs (Table 3).

Table 3 HRs (95% CI) for the probability of each endpoint of interest in FIREFISH pre- and post-matching, including the primary and scenario analyses

Assessment Schedule Matching

Results from MAIC with ASM continued to support the suggestion that children treated with risdiplam had a higher rate of achieving a response on the HINE-2 and CHOP-INTEND compared with children who received nusinersen (Figs. S6 and S7, respectively). Compared with MAIC without ASM, the HR for the achievement of a HINE-2 motor milestone response increased to 1.58 (95% CI 1.28–2.06) following MAIC with ASM. The HR for the time to achievement of a ≥ 4-point improvement in CHOP-INTEND score decreased to 2.43 (95% CI 1.90–3.61), but the CIs overlapped, and the result was still suggestive of the superiority of risdiplam over nusinersen (p < 0.05).
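The two checks implied by this comparison can be written out explicitly. The Python sketch below uses only the intervals reported in this article and applies the rule stated in the methods, that a 95% CI not spanning 1 indicates a statistically significant difference; the helper names are hypothetical.

```python
def excludes_one(lower, upper):
    """True when a hazard-ratio CI lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0


def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Reported 95% CIs for the motor function HRs, without and with ASM.
hine2 = {"maic": (1.21, 1.80), "maic_asm": (1.28, 2.06)}
chop = {"maic": (2.18, 4.48), "maic_asm": (1.90, 3.61)}

significant_with_asm = [excludes_one(*hine2["maic_asm"]), excludes_one(*chop["maic_asm"])]
consistent = [intervals_overlap(hine2["maic"], hine2["maic_asm"]),
              intervals_overlap(chop["maic"], chop["maic_asm"])]
```

Both ASM-adjusted intervals lie above 1, and each overlaps its counterpart from the analysis without ASM, which is the sense in which the ASM results are consistent with the primary MAIC.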

Discussion

Understanding potential differences in the long-term efficacy and safety of DMTs is important for patients, physicians, healthcare, and reimbursement authorities, and is a key requirement to optimize SMA treatment decision-making. In the absence of head-to-head trial comparisons, this study used MAIC methodology to conduct a balanced comparison of risdiplam and nusinersen in terms of survival, motor function, and the time to experiencing a first SAE. This methodology accounted for differences in baseline characteristics between trial populations.

Outcomes in type 1 SMA were included in this ITC, which compares the longest possible follow-up data that have been published (at least 36 months) from three robust clinical trials (FIREFISH, ENDEAR, and SHINE).

MAIC suggested that children treated with risdiplam had prolonged survival and survival free of permanent ventilation compared with children treated with nusinersen over 36 months, and they had a higher rate of achieving HINE-2 and CHOP-INTEND responses. Children treated with risdiplam had a longer time to experiencing a first SAE when compared with nusinersen. These results were consistent with previous MAICs conducted at 12 and 24 months [32, 33].

Differences between risdiplam and nusinersen in their mode of administration may have contributed to the superiority of risdiplam for survival, motor function, and SAE outcomes suggested by our results. Risdiplam is an oral treatment, which enables systemic distribution throughout the bloodstream [43]. This has been shown to increase levels of functional SMN protein in both the central nervous system and peripheral tissues in animals [43, 44], and it is expected to do the same in humans. In addition, risdiplam crosses the blood–brain barrier [43], which is expected to lead to wide, homogeneous distribution of the drug along the spinal axis, particularly in areas innervating the upper limbs and respiratory muscles. In contrast, nusinersen is intrathecally administered [45], which is expected to lead to uneven drug distribution. Indeed, higher concentrations of nusinersen have been reported in the lumbar and thoracic regions of the spinal cord [46, 47], which may potentially limit the clinical benefits of nusinersen in the upper limbs, and respiratory and bulbar muscles.

In the present study, the FIREFISH and SHINE-ENDEAR populations were effectively matched. MAIC suggested a substantial overlap between the two trial populations, with the FIREFISH ESS reduced by 30% (from 58 to 40.6).

When selecting adjustment factors, we considered the CHOP-INTEND score to be more relevant than the HINE-2 score to indicate differences in baseline motor function in the very young and severely affected SMA population used in this study. This is because CHOP-INTEND was developed specifically for children with type 1 SMA and is more granular in its scoring than HINE-2 [48, 49]. When the HINE-2 score was included as an additional adjustment factor, the resulting FIREFISH ESS was consistent with primary analysis results, indicating that the CHOP-INTEND score was sufficient to provide baseline assessment of motor function.

Although use of ventilatory (not “permanent ventilation”) and nutritional support are considered to have prognostic/predictive value for treatment outcomes in SMA [28], there were differences in how these were determined or reported across the trials. When these factors were included as additional adjustment factors in scenario analyses, results were consistent with the primary analysis results, indicating that imbalances in these factors did not affect MAIC outcomes.

Gender-related effects on SMA severity have recently been reported [50, 51] but no studies have yet examined the prognostic/predictive value of gender on motor function in type 1 SMA [28]. Although the primary analysis revealed an imbalance in the percentage of female participants, the inclusion of gender as an adjustment factor yielded a FIREFISH ESS consistent with the primary analysis results, indicating that gender-based differences did not affect MAIC outcomes.

Taken together, results of the scenario analyses demonstrated that the mean age at first dose, disease duration, and CHOP-INTEND score were the main factors contributing to differences between the FIREFISH and SHINE-ENDEAR populations at baseline and that these were sufficient for use as adjustment factors in the population adjustment.

Mean age at symptom onset was excluded as an adjustment factor since it is a function of mean age at first dose and disease duration (already included in the matching algorithm). Furthermore, as both FIREFISH and SHINE-ENDEAR recruited only patients with two SMN2 copies, it was not necessary to include copy number as an adjustment factor.

This study improved upon the methodology used in previous ITCs of risdiplam versus nusinersen [32, 33] by combining MAIC with ASM. This reduced potential biases from between-trial differences in the timing of scheduled visits for motor function assessments. ASM demonstrated that our MAIC findings on motor function were consistent regardless of the differing assessment schedules across FIREFISH and SHINE-ENDEAR.

The results from this study are applicable to patients with type 1 SMA and may not be generalizable across the SMA disease spectrum. Long-term data comparisons of risdiplam against nusinersen in other SMA types have not yet been conducted. Similarly, long-term data comparisons of risdiplam with other SMA treatments are not available to date. To our knowledge, only one other ITC has compared risdiplam with onasemnogene abeparvovec [52]; it reported increased motor function outcomes with onasemnogene abeparvovec relative to risdiplam; however, this was conducted after 8 months of follow-up and without adjustment for known prognostic/predictive factors. In a previous MAIC conducted after 12 months of follow-up [32], differences in study characteristics could not be sufficiently controlled, making MAIC of risdiplam and onasemnogene abeparvovec unfeasible. No new published data were available, so comparison against onasemnogene abeparvovec was not feasible at 36 months.

A recent review by Jiang et al. [53] sought to provide perspectives on MAICs in SMA; we agree with the MAIC best practices highlighted by the authors, and thus we continue to apply the same practices throughout our works. Specifically, in the previous 12-month MAIC [32] and in the present study, comparisons were made for the treated populations in SHINE-ENDEAR and FIREFISH upon evaluation of inclusion/exclusion criteria and baseline characteristic data, demonstrating that both trials enrolled patients of comparable disease burden, including pulmonary burden. Post-matching for documented prognostic/predictive factors, these populations were even more similar.

ITCs and external comparisons are considered useful tools for treatment decision-making and are used by regulatory agencies and reimbursement authorities in SMA and other neuromuscular diseases [27, 40, 54,55,56,57,58,59,60,61]. We acknowledge that ITCs carry their own strengths and limitations and are not a replacement for high-quality, randomized clinical trials. Although Jiang et al. caution against the use of MAIC for drawing conclusions in health technology assessment (HTA) appraisals [53], the method has been used to assess the risk–benefit balance of leukemia treatment for marketing authorization [62], showing that it is accepted among healthcare decision-makers for the evaluation of treatment efficacy and safety.

HTA authorities commented explicitly on the previous 12-month MAIC [32], which was conducted in the same patient population as this 36-month analysis. While considering the limitations of the 12-month MAIC, HTA authorities stated that the study was justified in the absence of head-to-head trials [63], the propensity score matching resulted in reasonably balanced baseline characteristics [64], and the overall MAIC results were acceptable [55]. In this revised analysis, steps have been taken to address previous criticism. Baseline characteristics of children in FIREFISH (risdiplam) were matched to those of children from the nusinersen arm of SHINE-ENDEAR, excluding the best supportive care population. Furthermore, we conducted a series of sensitivity analyses on alternative adjustment factors to test matching stability and compare with the primary analysis results. ASM was also conducted to minimize potential bias from differences in the timing of scheduled assessments.

Study Limitations

Our comparisons were limited by the scope of the publicly available data. IPD were not available for nusinersen; access to them would have increased the robustness of our comparisons, as would the inclusion of additional treatment outcomes (e.g., swallowing, fatigue, or caregiver-reported outcomes). Safety outcomes other than time to first SAE were not considered in this study because of a lack of comparable data across trials.

Whilst our population adjustment was based on known prognostic/predictive factors [28], the choice of adjustment factors was limited to the baseline characteristics reported in both studies. In addition, results may be confounded (in any direction) by unadjusted baseline differences arising from unreported prognostic factors or effect modifiers.

ASM accounted for differences in assessment schedules between FIREFISH and SHINE-ENDEAR; however, this involved simplifications and assumptions [41].

The sample sizes of the patient populations compared in our study were small. This was especially the case for FIREFISH, where the pre-matching population of 58 patients was further reduced to an effective sample size (ESS) of 40.6 post-matching. Larger sample sizes would have allowed for more robust comparisons; however, it must be noted that these sample sizes are not unusual in clinical trials investigating a rare disease.
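
For orientation, the post-matching ESS quoted above can be related to the matching weights by the approximation conventionally used in MAIC. The expression below is a sketch under that assumption; the symbols w_i (the weight assigned to the i-th FIREFISH patient) and n (the pre-matching sample size, 58) are introduced here for illustration only, and the individual weights are not reported in this section.

```latex
% Conventional effective sample size for a weighted sample (assumed definition).
% w_i : matching weight of patient i (illustrative symbol)
% n   : number of patients before matching (n = 58 for FIREFISH)
\[
  \mathrm{ESS} \;=\; \frac{\left(\sum_{i=1}^{n} w_i\right)^{2}}{\sum_{i=1}^{n} w_i^{2}}
\]
% Equal weights give ESS = n; more uneven weights give a smaller ESS.
% With the reported values, ESS / n = 40.6 / 58 = 0.70.
```

Under this definition, an ESS of 40.6 corresponds to retaining approximately 70% of the original 58 patients' worth of information after weighting.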

There may have been heterogeneity in standard of care (SoC) across clinical sites and over time. However, the two trials were largely contemporaneous: patients treated with nusinersen were first enrolled in ENDEAR in August 2014 [65] and moved into SHINE-ENDEAR in November 2016 [36], with FIREFISH starting in December 2016 [34]. In addition, both trials were conducted globally at many of the same clinical sites [24, 66]; specifically, 7/13 countries (Belgium, France, Italy, Japan, Spain, Turkey, and the USA) participated in both SHINE-ENDEAR and FIREFISH. Both trials also had a protocol that encouraged site investigators to meet local/national healthcare considerations, especially those focused on the respiratory, gastrointestinal/nutritional, and physical therapy management of study participants [67,68,69]. Therefore, any differences we observed in the present study are unlikely to be attributable to differences in SoC.

Conclusions

MAIC analyses suggested that children with type 1 SMA treated with risdiplam had improved survival and greater motor function responses and experienced a longer time to their first SAE compared with children treated with nusinersen. These results may assist patients, physicians, regulatory agencies, and payers when deciding on the optimal SMA treatment for patients with type 1 SMA.

Public disclosure of follow-up data beyond 36 months from FIREFISH and SHINE-ENDEAR would allow longer-term comparisons (including later-onset SMA), thus providing further conclusions on the efficacy and safety of these SMA treatments. Future ITCs would greatly benefit from the inclusion of safety data on risdiplam and nusinersen that are comparable across trials (beyond time to first SAE), enabling a more extensive evaluation of safety outcomes.