Quantitative magnetic resonance imaging measures as biomarkers of disease progression in boys with Duchenne muscular dystrophy: a phase 2 trial of domagrozumab

Duchenne muscular dystrophy (DMD) is a progressive neuromuscular disorder caused by mutations in the DMD gene that result in a lack of functional dystrophin protein. Herein, we report the use of quantitative magnetic resonance imaging (MRI) measures as biomarkers in the context of a multicenter phase 2, randomized, placebo-controlled clinical trial evaluating the myostatin inhibitor domagrozumab in ambulatory boys with DMD (n = 120 aged 6 to < 16 years). MRI scans of the thigh to measure muscle volume, muscle volume index (MVI), fat fraction, and T2 relaxation time were obtained at baseline and at weeks 17, 33, 49, and 97 as per protocol. These quantitative MRI measurements appeared to be sensitive and objective biomarkers for evaluating disease progression, with significant changes observed in muscle volume, MVI, and T2 mapping measures over time. To further explore the utility of quantitative MRI measures as biomarkers to inform longer term functional changes in this cohort, a regression analysis was performed and demonstrated that muscle volume, MVI, T2 mapping measures, and fat fraction assessment were significantly correlated with longer term changes in four-stair climb times and North Star Ambulatory Assessment functional scores. Finally, less favorable baseline measures of MVI, fat fraction of the muscle bundle, and fat fraction of lean muscle were significant risk factors for loss of ambulation over a 2-year monitoring period. These analyses suggest that MRI can be a valuable tool for use in clinical trials and may help inform future functional changes in DMD. Trial registration: ClinicalTrials.gov identifier, NCT02310763; registered December 2014. Supplementary Information The online version contains supplementary material available at 10.1007/s00415-022-11084-0.


Introduction
Duchenne muscular dystrophy (DMD), the most common type of muscular dystrophy in childhood, has an estimated prevalence of 15.9 cases per 100,000 live male births in the United States [4,5,27]. DMD is an X-linked recessive disorder caused by mutations in the DMD gene and subsequent deficiency in the dystrophin protein. DMD is characterized by progressive muscle weakness and wasting, loss of ambulation, impaired airway clearance/ventilation, cardiomyopathy, and premature death [21,22]. Current therapeutic options for the amelioration of signs and symptoms of DMD include physical therapy and treatment with corticosteroids. Other disease-modifying treatments that target specific dystrophin mutations and may be suitable in a small subset of eligible subjects include ataluren, which is only available in the European Union; eteplirsen, golodirsen, and casimersen, which are only available in the United States; and viltolarsen, which is available in the United States and Japan [9,11,13,18,23,30,34]. Despite the development of new therapeutic options for DMD, the lack of robust biomarkers, the clinical heterogeneity of DMD, and, to a lesser extent, a lack of objective outcome measures that are sensitive to detecting disease progression and treatment effects, remain major challenges in clinical trials and drug development in DMD.

The North Star Ambulatory Assessment (NSAA) and functional tests with a time element such as the four-stair climb (4SC) and 6-min walk distance [15,20,28] are commonly used and validated functional assessments. However, functional changes as measured by these assessments develop slowly, which does not permit early detection of treatment-related changes. To address these challenges, several quantitative magnetic resonance imaging (MRI) measures have been proposed as biomarkers of disease progression for use in DMD clinical trials. These include MRI-derived measures of muscle volume, fat fraction, and T2 relaxation time; the latter is known to increase with fatty infiltration, inflammation, and edema [1,2,14,16,26]. Quantitative MRI measures offer the ability to characterize different aspects of the disease process over shorter time intervals, are well tolerated by most subjects without the need for sedation, and provide objective assessment of disease status that does not rely on subject performance.
We recently reported the results from a phase 2, randomized, double-blind trial to evaluate domagrozumab vs. placebo as a potential therapy for DMD [32,33]. Domagrozumab is a humanized recombinant monoclonal immunoglobulin antibody subclass 1 (IgG1) that targets myostatin (GDF-8), a growth factor shown to negatively regulate skeletal muscle mass [3,10]. In the mdx mouse model of DMD, inhibition of myostatin with RK35, a murine antibody equivalent of domagrozumab, led to increased muscle mass and strength, with decreased fat substitution and fibrosis [6,17,31]. The phase 2 study of domagrozumab did not meet its primary efficacy endpoint of mean change from baseline (CFB) in 4SC time at week 49 [32,33]. However, there were favorable effects of domagrozumab vs. placebo on the mean percent change from baseline (%CFB) in thigh muscle volume and in muscle volume index (MVI) as detected by MRI, suggestive of target engagement [32,33]. Here, we present an analysis of thigh MRI parameters collected during the phase 2 trial of domagrozumab and assess the potential for quantitative muscle MRI measures to serve as predictive biomarkers for concomitant functional changes in DMD.

Study design
This was an analysis of data from a phase 2, randomized, two-period (48 weeks each), double-blind, placebo-controlled, multiple ascending dose (5, 20, and 40 mg/kg) trial of intravenous domagrozumab in ambulatory boys with DMD (ClinicalTrials.gov: NCT02310763). A detailed description of the study design and inclusion/exclusion criteria has been reported previously [33]. In summary, participants aged 6 to < 16 years with clinically and genetically confirmed DMD, who performed the 4SC in ≥ 2.5 but ≤ 12 s at screening and were receiving a stable dose of corticosteroids for at least 6 months prior to screening, were eligible for the study. Participants were enrolled at 31 sites in 8 countries. Study endpoints included safety and tolerability of domagrozumab and mean CFB in 4SC time and NSAA total score at week 49 for domagrozumab vs. placebo. MRI measures were secondary (muscle volume measures) and exploratory (T2 mapping and fat fraction measures) study objectives.

Site setup and image acquisition
All imaging sites were trained on the study protocol and image acquisition procedures prior to the initial participant visits. Sites were provided with a scanning guide that covered all aspects of participant positioning, image acquisition, and quality controls. An MRI technologist from the central review facility (BioTelemetry Research, Rochester, NY) traveled to imaging centers to ensure consistency in imaging setup across all sites. Imaging facilities were required to submit phantom and healthy volunteer scans to demonstrate proper scanner setup and acquisition technique. Phantom scans using a Uniformity and Linearity (UAL) phantom, or an American College of Radiology (ACR) phantom, were performed to confirm that minimal spatial distortion occurred during the implementation of the MRI scanning protocol. The volunteer scans were inspected centrally to ensure compliance with the acquisition protocols and image quality requirements. After the volunteer test scans were approved the scanner and site MRI technologists were considered qualified to scan study participants. After scanners were qualified, quarterly scans of the UAL or ACR phantom were reviewed centrally to ensure no spatial distortion was introduced during the multi-year study.
Imaging protocols were developed to harmonize image acquisition processes across all imaging facilities using both 1.5 T and 3 T scanners. Scanning sequences and imaging protocols were designed and optimized to enable measurement of whole thigh muscle volume, whole thigh fat fraction imaging via Dixon imaging, and proximal thigh mean T2 relaxation time via T2 mapping. During image acquisition, unsedated participants were placed in a supine position inside the scanner, with the target leg supported off the table surface to avoid compression effects on muscle volume measurements. The same target leg of each participant (typically the right leg) was evaluated at each imaging visit (baseline and weeks 17, 33, 49, and 97). Additional details on the MRI setup and scanning protocol have been described previously [29].
For muscle volume measures, sites were instructed to acquire a proton density weighted axial fast spin echo, turbo spin echo, or spin-echo sequences. Images were acquired covering the thigh area from the acetabulum to the bottom of the patella using 5 mm slices with a 0 mm gap.
For Dixon imaging, sites were instructed to acquire axial 2- or 3-point Dixon scans using manufacturer-provided sequences with the inherent body coil to allow acquisition of whole thigh images. Only sites that had manufacturer-supplied fat/water imaging sequences were required to submit Dixon scans for analysis. As in the proton density scan, images were acquired covering from the acetabulum to the bottom of the patella using 5-10 mm slices with a 0 mm gap.
For T2 mapping, scan acquisitions started from the top of the lesser trochanter and included ~ 7-10 cm of the proximal thigh. T2 mapping scans were acquired using a body/torso array coil and sites were instructed to acquire axial spin-echo scans with ~ 5 echoes. Copper sulfate belt phantoms were included in the field of view to allow quality control assessment of reconstructed T2 maps.
Imaging sites were instructed to use the same approved MRI scanner for all imaging time points for each patient. To account for minor changes in acquisition protocols between scanners, data were evaluated to look at change from baseline for each individual subject.

Image analysis
All images were evaluated at a central facility (BioTelemetry Research, Rochester, NY) as described previously [29]. Upon receipt, all images went through a quality inspection process to ensure images were compliant with the scanning guide, were high-quality acquisitions suitable for assessment (e.g., no significant artifacts), and that T2 phantoms were within the expected T2 relaxation ranges.
Segmentation algorithms based on fuzzy clustering and active contours [8,25,36] were used to initially segment images into bone, muscle bundle, and subcutaneous fat. The muscle bundle, or the cross-sectional region of the thigh excluding bone and subcutaneous fat, was further segmented into lean muscle and inter/intramuscular fat regions on the proton density weighted scans. Following automated segmentation, region of interest determinations were reviewed and adjusted by a study-trained, blinded imaging technologist. After the technologist review, segmented images were reviewed and adjusted by a study-trained, blinded, independent radiologist. The radiologist made the final determination on image readability and maintained responsibility for image segmentation quality and accuracy [8,25,36]. Examples of segmented images across a treatment period are shown in Fig. 1.
In addition to evaluating muscle volume, MVI was calculated. As previously described, MVI measures the fraction of the total thigh that is lean muscle [14] and is calculated as follows:

$$\mathrm{MVI} = \frac{\text{muscle volume}}{\text{muscle volume} + \text{inter/intramuscular fat volume}} \times 100 \qquad (1)$$

The T2 relaxation time was calculated on a pixel-by-pixel basis from the multi-echo T2 mapping scan using a non-linear curve fit, based on a mono-exponential decay model [1,12,29,35]. The initial echo time was not included in the curve fit to minimize the effect of stimulated echoes on T2 calculation [12]. T2 maps were evaluated by both T2 relaxation time and percent of non-elevated voxels. The percent of non-elevated voxels was the percent of total voxels in the T2 mapping acquisition with a relaxation time < 55 ms [29]. Voxels with an elevated T2 relaxation time are more likely to represent fatty or inflamed tissue [1,35]. The 55 ms threshold is elevated above what is observed in healthy muscle tissue [1,35], which is consistent with its application here.
Water-fat images acquired by the Dixon imaging protocol were used to generate fat fraction maps. Mean values from T2 maps and fat fraction maps were evaluated in two regions of interest. The first region included the entire muscle bundle, whereas the second was limited to lean muscle.

[Fig. 1: example proton density weighted scans and segmented images; recoverable panel labels: Proton Density Weighted Scan, Segmented Image, Week 49]

Table 1 notes: For the domagrozumab group: n = 78 for muscle volume, MVI and inter/intramuscular fat volume; n = 75 for mean T2 measures and percent non-elevated voxels; n = 65 for mean fat fraction measures. For the placebo group: n = 40 for muscle volume, MVI and inter/intramuscular fat volume; n = 39 for T2 measures and percent non-elevated voxels; n = 32 for mean fat fraction measures. Age and race information were from the screening visit, and weight and height information were from the baseline visit. CI confidence interval, MRI magnetic resonance imaging, MVI muscle volume index, SD standard deviation. a Only subjects with baseline and at least one post-baseline value are included.
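The image-derived quantities described in this section can be illustrated with a minimal sketch. It is not the central facility's pipeline: the function names and the example numbers are hypothetical, the T2 estimate uses a log-linear least-squares fit as a simple stand-in for the non-linear curve fit named in the text, and the fat fraction function assumes the conventional Dixon definition, fat / (fat + water) × 100, rather than a formula stated in this text.

```python
import math


def muscle_volume_index(muscle_volume, fat_volume):
    """MVI (%) = muscle / (muscle + inter/intramuscular fat) x 100, as in Eq. 1."""
    return 100.0 * muscle_volume / (muscle_volume + fat_volume)


def fit_t2_loglinear(echo_times_ms, signals, skip_first_echo=True):
    """Estimate T2 (ms) for one voxel under S(TE) = S0 * exp(-TE / T2).

    Assumption: a straight-line least-squares fit through (TE, ln S) replaces
    the non-linear fit described in the text; the slope equals -1 / T2.
    The first echo is dropped by default, mirroring the text's exclusion of
    the initial echo time. Signals must be strictly positive.
    """
    start = 1 if skip_first_echo else 0
    te = list(echo_times_ms[start:])
    log_s = [math.log(s) for s in signals[start:]]
    n = len(te)
    mean_te = sum(te) / n
    mean_log_s = sum(log_s) / n
    sxx = sum((t - mean_te) ** 2 for t in te)
    sxy = sum((t - mean_te) * (y - mean_log_s) for t, y in zip(te, log_s))
    slope = sxy / sxx
    return -1.0 / slope


def percent_non_elevated(t2_values_ms, threshold_ms=55.0):
    """Percent of voxels whose T2 is below the threshold (55 ms in the text)."""
    below = sum(1 for t in t2_values_ms if t < threshold_ms)
    return 100.0 * below / len(t2_values_ms)


def fat_fraction_percent(fat_signal, water_signal):
    """Assumed conventional Dixon fat fraction: fat / (fat + water) x 100."""
    return 100.0 * fat_signal / (fat_signal + water_signal)


if __name__ == "__main__":
    # Hypothetical five-echo voxel with a true T2 of 40 ms; the first echo is
    # deliberately perturbed to mimic a stimulated-echo artifact.
    echo_times = [20.0, 40.0, 60.0, 80.0, 100.0]
    signals = [1000.0 * math.exp(-te / 40.0) for te in echo_times]
    signals[0] *= 1.2
    print(fit_t2_loglinear(echo_times, signals))
    print(muscle_volume_index(300.0, 100.0))
    print(percent_non_elevated([30.0, 40.0, 60.0, 80.0]))
```

Dropping the first echo means the perturbed point does not influence the estimate, which is the motivation the text gives for excluding the initial echo time.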

Statistical analysis
A mixed model for repeated measures analysis with terms for stratification factor, baseline MRI result, treatment, time, and treatment by time interaction as fixed effects, and participants as a random effect, was used to assess the difference in MRI measurements between domagrozumab and placebo groups at week 49. Change and percent change from baseline for each visit were attributed to the last dose received at the previous visit. Baseline was defined as the last pre-dose assessment collected at the screening visit. Unscheduled and early termination readings were excluded.
The relationships between MRI endpoints assessed at week 49 (%CFB in thigh muscle volume, thigh MVI, and inter/intramuscular fat volume; and CFB in mean T2 relaxation time of the muscle bundle, mean T2 relaxation time of the lean muscle, percent non-elevated voxels of the muscle bundle, mean fat fraction of the muscle bundle, and mean fat fraction of lean muscle) and functional endpoints (4SC and NSAA) assessed at week 97 were evaluated using simple linear regression. For these analyses, all participants from each treatment sequence were combined and only those participants with a week 97 functional assessment were included.
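As an illustration of the analysis just described, the following minimal sketch computes percent change from baseline and fits a simple linear regression of a week 97 functional change on a week 49 MRI change. The function names and all numbers are hypothetical and chosen only to show the arithmetic; the sketch returns the slope, intercept, and Pearson correlation, and does not compute the P values that the reported analysis relied on.

```python
import math


def percent_change_from_baseline(baseline, follow_up):
    """%CFB = (follow-up - baseline) / baseline x 100."""
    return 100.0 * (follow_up - baseline) / baseline


def simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical data: week 49 %CFB in muscle volume (x) and
    # week 97 CFB in NSAA total score (y) for six participants.
    mri_pct_cfb_wk49 = [-6.0, -2.0, 1.5, 4.0, 8.0, 12.0]
    nsaa_cfb_wk97 = [-11.0, -9.0, -7.0, -6.0, -3.0, -1.0]
    slope, intercept, r = simple_linear_regression(mri_pct_cfb_wk49, nsaa_cfb_wk97)
    print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A positive slope in such a fit would correspond to the pattern reported for muscle volume, where larger week 49 gains accompany smaller week 97 declines in NSAA.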
Additional analyses were performed to assess the relationship between MRI endpoints at week 49 and functional endpoints at week 97 using regression tree methods [7]. For each pairwise comparison (MRI vs. functional endpoint), a regression tree was constructed to "split" the MRI endpoint into two subgroups that yielded the smallest level of variability on the functional endpoint within each subgroup. Regression trees were also utilized to assess the relationship of both muscle volume and muscle quality together at week 49 vs. each of the functional endpoints at week 97.
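The single "split" described above can be sketched as an exhaustive search for the cutpoint that minimizes the combined within-subgroup sum of squared deviations of the functional endpoint. This is a generic one-level regression tree under that assumption; the software, stopping rules, and minimum subgroup size used in the actual analysis are not stated in the text, so `best_split`, `min_size`, and the example data are hypothetical.

```python
def _sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y, min_size=1):
    """Find the cutpoint on x that minimizes within-subgroup variability in y.

    Candidate cutpoints are midpoints between consecutive distinct x values.
    Returns (cutpoint, mean_y_below, mean_y_above), or None when no valid
    split exists (for example, all x values are identical).
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_size, len(pairs) - min_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cutpoint can separate tied x values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        total = _sse(left) + _sse(right)
        if best is None or total < best[0]:
            cutpoint = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (total, cutpoint, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        return None
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Hypothetical data: week 49 %CFB in an MRI endpoint (x) and
    # week 97 CFB in NSAA (y).
    mri_change = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    nsaa_change = [-9.0, -8.0, -10.0, -1.0, -2.0, -1.0]
    print(best_split(mri_change, nsaa_change))
```

The difference between the two subgroup means returned by the search is the quantity summarized in the results as the separation between subgroups on each functional endpoint.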
Time to loss of ambulation was defined as the number of days on study until the first onset of an adverse event recorded as "Gait Inability." Cox proportional hazards regression models were used to compare the time to loss of ambulation between participants above and below the median baseline value for each MRI parameter. Additional analyses assessed the time to loss of ambulation with a Cox model using each MRI parameter as a time-varying covariate. For all Cox models, age was included as an additional covariate. All P values are presented nominally without an adjustment for multiplicity, and therefore should be interpreted as exploratory analyses.

Participants
Of the 120 participants who were included for analysis in the phase 2 study, all had at least one MRI assessment. In total, 118 boys had scans to measure change in muscle volume and muscle volume index, 114 boys had scans to evaluate T2 mapping changes, and 97 had Dixon scans to evaluate changes in fat fraction. The demographics and MRI characteristics of participants at baseline were generally balanced between the domagrozumab and placebo arms (Table 1).

T2 mapping measures
There were significant differences between the domagrozumab and placebo groups at weeks 33 … (Table 2 and Fig. 2f).

Correlative analysis of key MRI endpoints with change in 4SC time and NSAA score
All key MRI endpoints at week 49 were significantly correlated with 4SC time at week 97. All key MRI endpoints at week 49 were also significantly correlated with NSAA at week 97 (Table 4). The %CFB in muscle volume and MVI, and change in percent non-elevated voxels, at week 49, were positively correlated with CFB in NSAA at week 97. By comparison, %CFB in inter/intramuscular fat volume, and CFB in mean T2 lean muscle, mean T2 muscle bundle, mean fat fraction of the muscle bundle, and mean fat fraction of lean muscle were negatively correlated with CFB in NSAA at week 97. Participants who had a large percent change in muscle volume at week 49 (> 7.92 %CFB) performed better on NSAA at week 97 (− 1.37 CFB) compared with participants who had a small percent change in muscle volume (− 8.17 CFB on NSAA). Regression tree analyses on all MRI endpoints yielded a difference of at least 5 points on NSAA CFB to week 97 between the two subgroups split by the associated MRI endpoint.

Time to loss of ambulation
Twenty-two participants lost ambulation during the study. All baseline MRI parameters, when stratified by their median value, showed that a less favorable MRI value at baseline was associated with a higher probability of loss of ambulation (Fig. 3). Hazard ratios ranged from 1.5 to 6.3, and MVI, mean fat fraction of muscle bundle, and mean fat fraction of lean muscle were statistically significant (all P < 0.05). Time-varying covariate analyses yielded a statistically significant relationship between MRI over time and time to loss of ambulation, except for mean T2 muscle bundle and inter/intramuscular fat volume.

Bivariate analysis of the relationship between the CFB in MRI parameters at week 49 and the CFB in NSAA at week 97
Participants with a greater percent increase in muscle volume at week 49 (> 7.92%) reported a smaller reduction in NSAA after 97 weeks (− 1.37 CFB) irrespective of change in mean fat fraction of lean muscle at week 49 or change in mean T2 lean muscle (Fig. 4). Participants who had smaller increases in muscle volume (< 7.92%), or those who lost muscle volume over 49 weeks, had better functional outcomes if they had preserved muscle quality as indicated by lower change in fat fraction (− 5.42 CFB in NSAA compared with − 10.11) or lower change in mean T2 values (− 6.35 CFB in NSAA compared with − 9.58).

MRI parameters vs. age
To demonstrate the expected change over different age ranges, all MRI parameters were plotted vs. participant age (Supplementary Fig. 1). These plots do not distinguish treatment effects and show only the change in the participant cohort over the 48-week treatment period. All imaging measures, except for muscle volume, show a declining trend with participants' advancing age.

Discussion
Following 48 weeks of treatment, MRI measures detected significant changes in muscle volume, MVI, and T2 mapping measures in boys treated with domagrozumab vs. placebo. Muscle volume measures demonstrated an anabolic effect of treatment with domagrozumab, consistent with the expected mechanism of action of a myostatin inhibitor and with preclinical studies in the mdx mouse. Muscle volume index, reflecting the percent of total thigh tissue that was lean muscle, also demonstrated a treatment effect with a higher MVI measure in boys treated with domagrozumab for 48 weeks.
T2 mapping measures that evaluated mean T2 relaxation times in lean muscle and the thigh muscle bundle indicated that treatment with domagrozumab reduced the average T2 relaxation time. Lower T2 relaxation times, reflecting decreased fat infiltration, edema, and/or inflammation, suggest that domagrozumab may have helped reduce muscle damage. This finding is further supported by the higher percent non-elevated voxels values observed in boys treated with domagrozumab. As shown previously, the distribution of T2 mapping values spreads with disease progression [16]. The reduction in elevated voxels supports the pharmacodynamic effect observed with domagrozumab treatment.
Overall, domagrozumab appeared to slow muscle degeneration and fatty infiltration as evaluated using T2 mapping and fat fraction analysis, while increasing volume of lean muscle tissue. Although differences in mean CFB of fat fraction measures were not statistically significant for domagrozumab vs. placebo at week 49, there were directionally favorable changes. Differences between treatment groups were detected over the 49-week treatment period; however, dose-dependent differences were not noted following the dose escalation at weeks 17 and 33. Despite muscle MRI measures demonstrating that domagrozumab had an effect on delaying disease progression, the phase 2 trial did not report any statistically significant differences between domagrozumab and placebo in functional changes as evaluated by 4SC (primary efficacy endpoint) or NSAA (secondary efficacy endpoint) at week 49 [32,33]. As a result of the lack of clear functional benefit following treatment with domagrozumab, the trial was subsequently discontinued early.
It has been suggested that subtle changes in muscle may precede functional changes in boys with DMD. This is supported by recent studies demonstrating that MRI measures may be a useful tool to inform longer term functional changes [2,24]. Linear regression analyses were performed to investigate the relationship of MRI changes at week 49 vs. 4SC and NSAA changes at week 97. These analyses demonstrated that thigh muscle volume, MVI, inter/intramuscular fat volume, T2 mapping measures, and thigh fat fraction measures were significantly correlated with 4SC and NSAA measures after 97 weeks. This finding supports the concept that MRI changes may be observed in advance of functional changes.
In addition to assessing the correlations between week 49 MRI changes and week 97 functional changes, optimal cutpoints, which separated each biomarker into two subgroups, were identified using regression tree methods. The optimal cutpoint provides a threshold for the given imaging biomarker, which maximizes the difference between functional outcomes after 97 weeks. These optimal cutpoints yielded an average difference of 67% between subgroups on 4SC and at least a 5-point difference on NSAA. The regression tree analysis further supports the use of quantitative MRI measures as biomarkers for detecting early treatment effects in DMD.
Despite the small number of participants who experienced loss of ambulation during the study, hazard ratios indicated that participants with a more favorable baseline MRI disposition are less likely to experience loss of ambulation over a 2-year period. The analyses of time to loss of ambulation using MRI parameters as time-varying covariates suggest that unfavorable changes in MRI parameters over time are associated with an increased risk of losing ambulation.
Using bivariate analysis, we conducted a proof-of-concept analysis to investigate whether the relationship between imaging and functional assessments can be strengthened by combining multiple imaging biomarkers, with the CFB in NSAA after 97 weeks used as the measure of function. Muscle volume was combined with either fat fraction of lean muscle or mean T2 of lean muscle to examine sensitivity by combining measures of lean muscle volume and lean muscle quality (as evaluated by T2 or fat fraction measures). The result of this analysis suggests that increases in muscle volume > 7.92% after 49 weeks lead to better preservation of NSAA scores after 97 weeks. For boys who had muscle volume increases of < 7.92%, muscle quality had to be preserved to maintain function. In other words, boys with low muscle volume and poor muscle quality tended to have the most significant declines in physical function as assessed by NSAA after 97 weeks.
To apply MRI biomarkers in clinical trials, it will be important to understand the anticipated change over time within a group of subjects. To this end, we plotted participant age vs. all imaging biomarkers. Although treatment effects were not considered in these plots, the expected change in different biomarkers across age groups can be inferred. Muscle volume appeared to be relatively independent of participants' age. This finding was initially suggested by looking at baseline correlations between participants' age and muscle volume [29], and is further supported by looking at longitudinal measures. This finding reflects two concomitant processes in boys with DMD, namely muscle growth together with muscle degeneration and fatty infiltration as the disease progresses. Despite increasing thigh length over 49 weeks, the increasing fatty infiltration and muscle wasting lead to a relatively constant lean muscle volume across the age range studied. MVI and fat fraction measures demonstrated a more "sigmoidal" shape, indicating that MRI measures in boys aged 8-10 years may progress faster than in younger or older boys. This is consistent with previous natural history studies evaluating boys with DMD over multiple years [2,24].
This study demonstrates that quantitative MRI measures can be objective biomarkers that can be included in large, multicenter, international clinical trials. Standardization across MRI equipment at clinical trial sites is feasible and image analysis can be scaled for use in clinical trials. Furthermore, the study demonstrates the successful use of MRI analyses in pediatric participants (6 to < 16 years of age) without the need for sedation, removing a problematic feature when planning MRI assessments as part of pediatric clinical trials [19].
Despite the multicenter design, the young population (mean age < 9 years), and the potential for cognitive impairment and behavioral comorbidities affecting participant cooperation, over 97% of the MRI scans received were evaluable. The most common reason for a scan being non-evaluable was participant motion. The original clinical trial was not specifically designed to demonstrate imaging endpoints as predictive biomarkers, nor was it a prespecified goal of the study design, limiting its generalizability. To truly establish MRI measures as predictors of functional changes in subjects with DMD, additional studies would need to be performed with this explicit goal in mind. The crossover design was particularly limiting in fully establishing quantitative MRI measures as predictive biomarkers, as there could have been slight alterations to functional performance after participants altered their therapeutic course. An additional limitation is that not all participants were followed until their week 97 visit. Although the study was terminated early for lack of efficacy, a few participants may not have had long-term functional assessments due to loss of ambulation or early discontinuation, both of which may introduce some bias into the assessment of week 97 outcomes.
Overall, this study demonstrates that MRI-based biomarkers can detect small changes in muscle volume and quality and can be incorporated into multicenter trials. The exact threshold of change needed to induce a functional benefit is still under investigation; however, these preliminary analyses suggest a relationship between changes in MRI-based biomarkers after 49 weeks and functional changes (NSAA and 4SC) after 97 weeks. These results also suggest that MRI-based biomarkers at baseline can be used to identify participants at higher risk of loss of ambulation over a clinical trial monitoring period.

Conclusions
In DMD, quantitative MRI measures can be viable biomarkers to help inform clinical trials and have the potential to predict future functional changes. The standardized acquisition methods used were scalable in a multicenter international study and may guide future clinical trials to enable the detection of subtle changes in muscle. Despite the imaging results reported in this analysis, at the time of primary study completion, the totality of evidence did not support clear clinical benefit with domagrozumab in DMD.

Ethics approval
The original phase 2 trial was conducted in accordance with legal and regulatory requirements, as well as the general principles set forth in the International Ethical Guidelines for Biomedical Research Involving Human Subjects, guidelines for Good Clinical Practice, and the Declaration of Helsinki. The protocols, any amendments, and informed consent/assent documents were approved by the institutional review board or ethics committee at each study center.

Consent to participate
A parent or legal guardian provided written informed consent prior to any study-specific activity that was performed.

Consent to publish
Not applicable.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.