Introduction

Gait abnormalities are a hallmark of multiple sclerosis (MS) and a major cause of disability and reduced health-related quality of life. People with multiple sclerosis (pwMS) walk more slowly than age-matched controls, with a lower cadence and increased stride-to-stride variability [1, 2]. Intervention trials have shown that treadmill-training improves gait speed, endurance, and balance in pwMS [3, 4]. However, MS difficulties extend beyond these motor problems. Cognitive dysfunction affects 40–70% of pwMS [5] and frequently leads to impaired information processing speed and deficits in verbal and visual memory [6, 7]. The cognitive deficits also negatively impact many aspects of quality of life [8] and walking during everyday activities [9], especially during challenging conditions like dual-tasking [10, 11]. These deficits in gait, cognitive function, and dual-tasking are associated with an increased risk of falls [12, 13] and disability [14, 15] and suggest the need for combined motor-cognitive rehabilitation approaches that address the multiple symptoms and challenges that are faced by pwMS [16].

To meet this need, several recent studies among pwMS combined motor and cognitive training into one treatment program [17, 18]. Progressive, integrated dual-tasking training led to significantly greater improvement in dual-tasking gait compared to single-task training [17]. Exercises delivered via virtual reality (VR) and conventional balance exercises improved balance and mobility in pwMS, while VR-based training also improved cognitive–motor performance [18]. One advantage of VR technology is that it can provide multidimensional, personalized rehabilitation while mimicking real-world environments in a controlled, game-like manner. Several studies reported that VR-based training improves mobility in older adults and people with neurological impairments [19,20,21,22]. Two systematic reviews also reported the potential of VR-based training to improve impaired cognitive domains and reduce depressive symptoms and anxiety in healthy older populations and patients with neurological impairment [23, 24]. Furthermore, in a small, open-label pilot study among pwMS, a 6-week VR-based treadmill training intervention enhanced dual-task walking, and the effects were maintained 1 month later [25]. These studies suggest that adding VR-based challenges that incorporate cognitive as well as motor demands to treadmill training may enhance dual-task walking abilities and certain aspects of cognitive function. Nonetheless, among pwMS, evidence from well-powered randomized controlled trials regarding the added value of treadmill training with VR, compared to treadmill training alone, is lacking [16], and there is a need for evidence-based, clinically feasible interventions to treat cognitive deficits [7].

The goal of the present work was to address this gap. We identified three specific aims: (1) to assess the benefits of treadmill training augmented with virtual reality (TT + VR), compared to treadmill training (TT) alone, on key MS symptoms, specifically dual-tasking gait speed and cognitive processing speed (the two primary outcomes); (2) to evaluate the transfer of the training effects to non-trained tasks and other MS symptoms (e.g., depressive symptoms); and (3) to assess retention effects of TT + VR, compared to TT alone, at 3 months post-training.

Methods

Study design

This multi-site study was a single-blinded, 2-arm RCT that took place at four clinical sites: Tel Aviv Sourasky Medical Center (TASMC), the University of Illinois at Urbana-Champaign (UIUC), University of Kansas Medical Center (UKMC), and NeuroCure Clinical Research Center (NCRC), Charité-Universitätsmedizin in Berlin. The study was registered at clinicaltrials.gov (NCT02427997). The first patient entered the study on August 5, 2016, enrollment finished on January 15, 2021, and the last patient was assessed on June 4, 2021; the decision to stop the study was made after weighing COVID-19 and funding considerations and becoming sufficiently close to the proposed sample size. The study protocol, hypotheses, sample size justification (the original target was to enroll 144 participants), and methods were published previously [26]. Here, we briefly summarize key elements.

Participants

The key criteria for inclusion were: (a) physician-confirmed diagnosis of relapsing–remitting MS (RRMS) following the McDonald criteria, (b) relapse-free in the last 30 days, (c) between 20 and 65 years of age, (d) gait limitations due to MS (based on the first question of the Multiple Sclerosis Walking Scale), and (e) score between 2.0 and 6.0 on the Expanded Disability Status Scale (EDSS). Key exclusion criteria included: (a) inability to walk unassisted (excluding cane) for 5 min or follow safety or training instructions, (b) another neurological disorder, (c) any cardiovascular, orthopedic, or other problems that may interfere with walking, or diagnosed active psychiatric problems (e.g., severe depression), and (d) currently participating in an intensive exercise program.

Procedures

After providing informed written consent, subjects underwent a baseline assessment. Then, participants were randomized to either a treadmill training (TT) alone group (active-control comparison group) or treadmill training with VR (TT + VR; i.e., the experimental) group; training took place at one of four clinical sites, as mentioned above. The subjects were informed that the purpose of the study was to compare the impact of two forms of treadmill training on symptoms; in that sense, they were blinded to the specific study hypotheses, and this can, to some degree, be considered a double-blind study.

Randomization was performed by center and by gender, in blocks of four, to ensure gender balance across the two study arms, with the goal of 1:1 allocation. The randomization sequence was generated in Matlab® by EG and was only shared with the trainers on a rolling, as-needed basis. The trainers accessed the randomization file one subject at a time; trainers or study coordinators informed the participants about the intervention arm. The assessors did not have access to this file.
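The allocation scheme described above (strata defined by center and gender, blocks of four, 1:1 allocation) can be illustrated with a minimal sketch. The study's sequence was generated in Matlab; the Python below is not the authors' code, and the class name, function names, and seed are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict

ARMS = ("TT", "TT+VR")
BLOCK_SIZE = 4  # blocks of four, as described in the text


def make_block(rng):
    """Return one shuffled block holding equal numbers of each arm (1:1)."""
    block = list(ARMS) * (BLOCK_SIZE // len(ARMS))
    rng.shuffle(block)
    return block


class StratifiedBlockRandomizer:
    """Hypothetical helper: one allocation at a time per (site, gender) stratum."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        # stratum -> allocations still unused in that stratum's current block
        self._pending = defaultdict(list)

    def assign(self, site, gender):
        stratum = (site, gender)
        if not self._pending[stratum]:
            # Start a new block only when the previous one is exhausted.
            self._pending[stratum] = make_block(self._rng)
        return self._pending[stratum].pop()
```

Because each stratum draws from its own block, every completed block of four within a site-by-gender stratum contains two TT and two TT + VR allocations, which keeps the arms balanced by gender within each center.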

All randomized subjects received 13–18 (maximum) sessions (3/week × 6 weeks) of training. The VR system (GaitBetter, Ramat Gan, Israel) and the training protocol were described previously [21, 26]; the protocol paper [26] provides further detail. All study sites used the same software, and the treadmill, computer screen, and safety harness were similar at all sites, as described previously [21, 26]. Briefly, the treadmill speed was set based on the participant’s over-ground walking speed at baseline, and it was then measured weekly. The treadmill speed and the session duration increased gradually, and hand support decreased. This was similar in both groups. Subjects in the TT + VR group navigated through a virtual environment projected on a TV screen while walking on the treadmill and receiving feedback from the system in real-time (Fig. 1). Motor and cognitive challenges in both groups were individualized to the participant’s level of performance. Training protocols within each group were specified in advance. After the 6-week intervention, participants returned within 10 days for a post-assessment identical to the baseline assessment and completed an identical follow-up assessment 3 months later. To isolate the effects of the interventions, participants were instructed not to start any new activity or receive additional therapy during the training period.

Fig. 1

Illustration of the training setup. a The study training arms: treadmill training alone (TT) on the left vs. TT with the addition of virtual reality (TT + VR) on the right. All participants wore a harness during the training. The VR is projected on the TV screen in front of the participant in the TT + VR group. Markers attached to each foot are captured by the camera, which is located below the TV screen. b Visualization of the VR environment, including movement of the participant’s virtual feet as they negotiate virtual obstacles, with positive (green) and negative (red) feedback. The VR setup is similar to that described in the V-Time study (Mirelman et al. [20])

Outcome measures

All outcome measures were evaluated by treatment-blinded assessors who were experts in collecting mobility and cognitive data in pwMS. The two primary outcome measures were: (1) dual-tasking gait speed, as measured when the subject walked while performing a word list generation task; (2) scores on the Symbol Digit Modalities Test (SDMT), a widely used measure that reflects information processing speed and attention [27] as evaluated after completion of the 6 weeks of the intervention. The number of correctly matched symbols within 90 s was the outcome score for the SDMT. A change of 4 points is considered clinically meaningful among pwMS [28].

Secondary outcome measures included functional gait tests, cognitive function, daily-living mobility, and patient-reported questionnaires. Functional tests included the Timed 25-Foot Walk (T25FW) and the 6-Minute Walk test (6MWT) [29]. Gait under usual-walking (UW) and dual-task walking (DTW) conditions was quantified while participants walked back and forth over a 12-m walkway for 1 min wearing sensors (Opals, APDM, Portland, OR) placed on each wrist, on the lateral malleolus of each leg, and on the lower back. Only straight-line segments were studied. Extracted parameters included gait speed, cadence, and step regularity [30]. Step regularity reflects the consistency of the stepping pattern between consecutive steps. Low values indicate inconsistent stepping or asymmetry between the left and right legs; higher values indicate greater symmetry. To evaluate the impacts of TT and TT + VR training on daily-living physical function and daily-living gait, a small, light-weight, water-proof tri-axial accelerometer (Axivity AX3, York, UK; 23.0 × 32.5 × 7.6 mm; weight: 11 g; 100 Hz sampling rate) was worn on the back (at the level of the fifth lumbar vertebra) for 7 days at each assessment time point [31]. Participants were asked to wear the device continuously for 1 week and to continue their daily activities as usual. Upon completion of the 1-week recording, participants removed the device and sent it back to the local clinical site. Data were included in the analysis if the recording was longer than three days. For gait quality, walking bouts that were at least 30 s long were evaluated. The extracted measures included the amount of walking (step counts per day), total daily-living physical activity (sum of signal vector magnitude, SVM, during the day), and three measures (cadence, gait speed, and regularity) that reflect the quality of daily-living gait, each summarized as a “typical” value (i.e., the subject’s median value across the week) and a “best” value (i.e., the subject’s 90th percentile value across the week) [31].
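The “typical” and “best” daily-living summaries can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes that one gait-speed value is available per walking bout, that values are pooled across the week, and that the 90th percentile uses linear interpolation. The function names and the input layout are hypothetical.

```python
from statistics import median

MIN_BOUT_SECONDS = 30    # bouts of at least 30 s, per the text
MIN_RECORDING_DAYS = 3   # recording must be longer than three days, per the text


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def summarize_week(bouts, recording_days):
    """Summarize one week of walking bouts.

    bouts: list of (duration_s, gait_speed_cm_s) tuples (assumed layout).
    Returns None when the recording is too short or no bout qualifies.
    """
    if recording_days <= MIN_RECORDING_DAYS:
        return None
    speeds = [speed for duration, speed in bouts if duration >= MIN_BOUT_SECONDS]
    if not speeds:
        return None
    return {
        "typical": median(speeds),        # median across the week
        "best": percentile(speeds, 90),   # 90th percentile across the week
    }
```

The same function applies unchanged to cadence or step regularity if those per-bout values are passed in place of gait speed.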

A standard battery of neuropsychological tests, the Brief International Cognitive Assessment for Multiple Sclerosis (BICAMS), comprising the California Verbal Learning Test-II (CVLT-II), the revised Brief Visuospatial Memory Test (BVMTR), and the Word List Generation (WLG) tests, evaluated cognitive outcomes [27]. The Trail Making Test (TMT) A and B evaluated attention and executive function, respectively [32].

Patient-reported outcomes included: (a) The Multiple Sclerosis Walking Scale-12 (MSWS-12), a reflection of perceived walking impairment [33]; (b) Subscales of quality of life (e.g., physical function, emotional well-being) and overall quality of life via the MS QOL-54 (MSQOL54) [34]; and (c) The Patient Health Questionnaire-9 (PHQ-9), a measure of depressive symptoms [35, 36].

Statistical analyses

Demographic and clinical characteristics were summarized as mean and standard deviation (SD) or median and interquartile range (IQR) for continuous variables and as frequencies (percentage) for categorical variables. The Kolmogorov–Smirnov/Shapiro–Wilk test was used to test for normal distributions. Differences in demographic and clinical characteristics among the groups were evaluated using t tests and General Linear Models (GLM) with adjustment for age and gender, or Mann–Whitney tests, and Pearson’s Chi-square for dichotomous variables (i.e., gender). Significant outliers (median ≥ 3 SDs) were removed before the repeated-measures procedure. The effects of time (baseline versus post-training assessment and baseline versus 3-month follow-up) and treatment were evaluated using repeated-measures (RM) ANOVA and the Wilcoxon test for repeated measures (for categorical and binomial variables). We also examined the role of the study site (center) as a covariate (TASMC n = 49; UIUC n = 29; UKMC n = 17; NCRC n = 9); it was not significant in any of the analyses, and hence, the results are reported without adjustment for this covariate. The level of significance was set at p ≤ 0.05. Post hoc comparisons with Bonferroni corrections were performed for within-group pairwise comparisons (adjusting for the number of tests within a domain). Mean differences and 95% confidence intervals (CI) are also reported for normally distributed outcome measures. Partial eta-squared (η2) is used to estimate effect sizes (values ≥ 0.06 and ≥ 0.14 are considered medium and large, respectively). Statistical analyses were performed using SPSS (Version 27.0 for Windows, SPSS Inc., Chicago, IL).

For the primary outcomes, we report the results based on a “per protocol” analysis, using the complete case analysis (CCA) method (i.e., subjects with pre-post assessments), and on a modified intention-to-treat analysis (using an imputation approach that replaces missing post-training assessment values with an individual’s baseline assessment values, thereby assuming no training effect for dropouts). For the other outcomes, results are reported without imputation using the available data. In addition, we conducted a sensitivity analysis that compared, using t tests, the baseline values of those who finished the intervention period and completed the post-assessment with the baseline values of dropouts.
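The difference between the complete-case and modified intention-to-treat approaches can be illustrated with a short sketch. The analyses in the study were run in SPSS; the Python below only mirrors the imputation rule stated above (a missing post-training value is replaced by that individual's baseline value). The record layout, function names, and example scores are hypothetical.

```python
def complete_cases(records):
    """Keep only subjects with both baseline and post-training values."""
    return [(r["baseline"], r["post"]) for r in records if r["post"] is not None]


def impute_post_with_baseline(records):
    """Replace a missing post-training value with the subject's baseline,
    i.e., assume no training effect for dropouts."""
    return [
        (r["baseline"], r["baseline"] if r["post"] is None else r["post"])
        for r in records
    ]


def mean_change(pairs):
    """Mean post-minus-baseline change over (baseline, post) pairs."""
    return sum(post - base for base, post in pairs) / len(pairs)


# Hypothetical SDMT-like scores; the second subject dropped out.
records = [
    {"baseline": 50, "post": 54},
    {"baseline": 48, "post": None},
    {"baseline": 52, "post": 56},
]

cca_change = mean_change(complete_cases(records))              # dropouts excluded
mitt_change = mean_change(impute_post_with_baseline(records))  # dropouts count as zero change
```

Because each dropout contributes a change of zero, this imputation can only pull the estimated mean change toward zero, which makes it a conservative check on the per-protocol result.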

Results

As summarized in Fig. 2, 139 participants were enrolled in the study, 124 participants were randomly assigned to one of the training groups, and 108 completed the training program and the post-training assessment (52 in the TT and 56 in the TT + VR groups). Eighty-four participants completed the 3-month follow-up assessment. Four more subjects were excluded from the analysis. Reasons for exclusions and dropouts are described in Fig. 2. Thus, per-protocol analysis was based on 104 individuals who completed the 6-week intervention and participated in the before and after assessments. These subjects were 49.0 ± 9.8 years old, with a median EDSS score of 3.5 (2–6), and 72.2% were female. Participants who were allocated into the two training groups had similar age, gender, height, weight, and EDSS at baseline (Table 1), and also showed similar performance in the primary outcomes at baseline: SDMT scores (p = 0.751) and dual-tasking gait speed (p = 0.616) (see also below). Forty-eight percent of the subjects completed all 18 sessions (53% in the TT and 44% in the TT + VR groups). The number of training sessions that were completed was similar (p = 0.771) in the TT and TT + VR groups. The median number of sessions in those trained with TT was 18 (range: 13–18), and in those trained with TT + VR, it was 17 (range: 15–18).

Fig. 2

Study CONSORT Flow

Table 1 Subject characteristics in the two training groups at baseline

Training effects on the primary outcomes

In per-protocol analyses, SDMT scores improved in both training groups (time effect: p = 0.004, η2 = 0.108) after the intervention. A time X treatment-group interaction (p = 0.027, η2 = 0.051) revealed that subjects in the TT + VR group improved more than those in the TT group. Within-group analysis showed that the participants who trained with TT improved their SDMT score by 0.8 points (95% CI −0.9 to 2.5 points, p = 0.358, η2 = 0.037). In contrast, subjects who trained with TT + VR improved by 4.4 points (CI 1.9–6.8, p = 0.001, η2 = 0.105) (Fig. 3). Three months post-training, SDMT values were 3 points higher than the baseline values in the TT + VR group (p = 0.005) and 1.8 points higher than baseline values in the TT group (p = 0.06).

Fig. 3

The primary outcomes at baseline, post-training, and 3-month follow-up. a SDMT (cognitive processing speed); b gait speed under the dual-task condition. *Significant main time effect (p < 0.05). #Significant time X training interaction. $Significant within-group time effect for the TT + VR group only

Dual-tasking gait speed increased (time effect: p = 0.001, η2 = 0.219) in both groups. It increased by 10.4% (10.7 cm/s, CI 4.5–17.0 cm/s, p = 0.001, η2 = 0.219) in the TT group and by 8.7% (10.5 cm/s, CI 4.4–16.7 cm/s, p = 0.001, η2 = 0.227) in the TT + VR group. The time X treatment-group interaction effect was not significant (p = 0.670), indicating that the improvements were similar in the two training groups. Dual-tasking gait speed remained higher than baseline 3 months after the intervention (time effect: p = 0.008) (Fig. 3). The within-group effect at this time point was only significant in the TT + VR group (8.4 cm/s; CI 7.0–10.0 cm/s; p = 0.035) (Table 6).

In modified intention-to-treat analyses, the results for dual-tasking gait speed and SDMT were similar to those for the per-protocol analyses described above. For dual-tasking gait speed, a significant time effect was observed (8.0 cm/s, CI 4.7–11.3 cm/s, p < 0.001), regardless of intervention (time X treatment-group interaction effect: p = 0.956). For the SDMT scores, a time effect was observed (p = 0.007), and, as before, a time X treatment-group interaction effect (p = 0.010) was also seen. The TT group improved by 0.7 points (CI −0.7 to 2.0 points, p = 0.358), and the TT + VR group improved by 4.0 points (CI 1.8–6.3 points, p = 0.001).

Secondary outcomes

Gait

Other aspects of gait significantly improved after training in both groups (Table 2). For example, usual-walking gait speed significantly increased in both groups following the intervention (p < 0.001). This improvement persisted 3 months later. Similarly, cadence and dual-tasking step regularity increased post-training (time effect: p < 0.001). Within-group analysis showed that dual-tasking step regularity improved only in the TT group (adjusted), while usual and dual-tasking cadence improved in the TT + VR group (adjusted). However, these improvements did not persist at the 3-month follow-up. Additional gait and cognitive outcome measures at the 3-month follow-up are presented in Table 6.

Table 2 Gait before and after training in the TT and TT + VR groups

Functional tests of mobility

Time to complete the T25FW decreased by 2.9% (95% CI −7.6 to 1.7%) in the TT group and by 7.7% (95% CI −11.5 to −3.8%) in the TT + VR group following the interventions (Fig. 4). This time effect was statistically significant in the TT + VR group (p < 0.001) but not in the TT group (p = 0.176). Endurance, i.e., the distance walked in 6 min, increased in both groups (time effect: p = 0.005), with no significant time X treatment-group interaction effect (p = 0.620). However, a significant within-group time effect was found in the TT group (20.7 m, 95% CI 3.2–38.2 m, p = 0.021), and only a trend was seen in the TT + VR group (14.4 m, 95% CI −2.5 to 31.4 m, p = 0.093) (Fig. 4).

Fig. 4

Functional tests at baseline and post-training. a 25-foot walk at fast speed; b maximal distance covered in 6 min. *Significant pre–post change in the TT + VR group according to the Wilcoxon signed-rank test. #Significant pairwise time effect according to RM ANOVA

Cognitive function

Both groups showed significant improvements on the CVLT tests. In the TT + VR group, the time to complete TMT A was significantly better, as were word list generation and the number of words generated during the dual-task gait test, while those who trained with TT alone showed only trends for these cognitive outcomes (Table 3). Moreover, several cognitive improvements were preserved after 3 months in the TT + VR group (see also Table 6).

Table 3 Secondary cognitive outcomes before and after training in the TT and TT+VR groups

Patient-reported outcomes

Compared to baseline values, scores on self-report of physical quality of life (MSQOL-54) were higher after training (Table 4) in both groups, with a stronger trend seen in the TT group (p = 0.031 unadjusted). TT + VR subjects reported improvements (p = 0.010) in mental components of quality of life (MSQOL-54) and fewer (p = 0.003) depressive symptoms (PHQ-9); these changes were not seen in the TT group (p > 0.25). Conversely, patients who trained with TT reported fewer walking problems according to MS Walking Scale-12 scores (p = 0.017 unadjusted), while those who trained with TT + VR did not (see Table 4). After the adjustment for multiple comparisons, these TT group advantages did not persist.

Table 4 Patient-reported outcomes: quality of life, depression, and mobility disability based on self-report at baseline and post-training in the TT and TT+VR groups

Daily-living physical activity

Total daily-living physical activity did not change in either group (time effect: p = 0.520; time X treatment-group effect: p = 0.124). Step count also did not change after training (time effect: p = 0.326) (Table 5). Conversely, certain measures of the quality of daily-living walking increased, reflecting an improvement after training. “Best” daily-living gait speed improved (time effect: p = 0.032) in both groups (time X treatment-group interaction effect: p = 0.890), as did usual (time effect: p = 0.012) and best daily-living step regularity (time effect: p = 0.002). This change was more pronounced in the TT + VR group (p = 0.016) than in the TT group (p = 0.066).

Table 5 Metrics of daily-living activity and daily-living gait at baseline and post-training

Training effects as a function of disease stage

To evaluate the role of disease severity, in an exploratory analysis, we divided the subjects into those with mild-moderate disease (EDSS < 4) and those with more advanced disease (EDSS ≥ 4). The time X EDSS group X treatment-group interaction was significant (p = 0.023) for SDMT scores. When exploring each EDSS group separately (adjusted for age and gender), the time effect on SDMT scores persisted within both the lower EDSS (p = 0.046) and the higher EDSS (p = 0.009) subgroups, with an advantage for TT + VR training. Among those with lower EDSS, SDMT scores increased by 0.5 points (95% CI −2.3 to 3.3) in the TT group and by 4.1 points (95% CI 1.3–6.9) in the TT + VR group. Among those with higher EDSS, SDMT scores increased by 1.3 points (95% CI −1.9 to 4.5) in the TT group versus 4.6 points (95% CI 1.5–7.6) in the TT + VR group. For dual-tasking gait speed, a significant time effect was also observed among both low (p = 0.005) and high (p < 0.001) EDSS levels regardless of training group (interaction effect: p = 0.350). In the lower EDSS level, dual-tasking gait speed increased by 12.4 cm/s (95% CI 4.0–20.0 cm/s) in the TT group and by 8.1 cm/s (95% CI −1.4 to 17.7 cm/s) in the TT + VR group. In the higher EDSS level, dual-tasking gait speed increased by 7.0 cm/s (95% CI −2.5 to 16.5 cm/s) in the TT group and by 9.7 cm/s (95% CI 0.7–18.6 cm/s) in the TT + VR group.

Dropouts and loss to follow up

Fourteen subjects who were assessed at baseline were not assessed after the interventions (recall Fig. 2). The subjects who were and were not evaluated post-training were similar in the two primary outcomes at baseline. No significant differences were found in SDMT scores (in the TT group, p = 0.742 and the TT + VR group, p = 0.368) or in dual-tasking walking speed (in the TT group, p = 0.367 and the TT + VR group, p = 0.634). This suggests that there were no systematic differences among those who were and were not assessed after the intervention and that the data may be considered missing at random (MAR).

Retention effects

The effects of the interventions at the follow-up (secondary time point) are summarized in Table 6. As described above, several benefits persisted at this time point, while others were weakened or vanished.

Table 6 Study outcomes at baseline and at the 3-month follow-up in both groups

Discussion

This RCT evaluated the impact of cognitive–motor virtual reality training on cognitive processing speed, dual-tasking gait speed, and other aspects of mobility, cognition, patient-reported outcomes, real world, daily-living activity, and quality of life, and compared its effects to those of an active-control intervention: conventional treadmill training alone. The impact on cognitive processing speed was greater after TT + VR training than after TT alone, consistent with our hypothesis. In contrast, dual-tasking gait speed improved similarly after both TT and TT + VR. We observed specific benefits to TT alone (MSWS-12 and MSQOL physical functioning) and multiple benefits specifically to TT + VR for several aspects of cognitive function, quality of life, and depressive symptoms, in addition to improvements in the SDMT. This positive impact on cognitive function addresses an important, previously identified gap regarding treating cognitive deficits in pwMS [7].

In contrast to an earlier RCT that reported no improvement in SDMT scores after 12 weeks of stepping training (in place) among 44 pwMS [37], cognitive processing speed became faster in both training groups following the training. The slight improvement (< 1 point) that was observed in the TT group could be a result of a practice effect [38]. Conversely, the TT + VR subjects showed clinically meaningful improvements [28] (mean increase of 4.4 points). Attention and verbal fluency (better scores on TMT A and the WLG test), cognitive functions critical for everyday life, also improved specifically in the TT + VR group, consistent with the SDMT findings. These improvements emphasize the added value of augmenting treadmill training with VR to enhance cognitive function in pwMS. In general, these findings extend previous work, which recommends VR to target cognitive function among older adults and patients with neurological impairments [19, 22, 23]; here, we provide novel RCT evidence among pwMS.

Several aspects of gait improved following the intervention, under usual and dual-task conditions, some after TT, some after TT + VR, and some after both treatments (e.g., gait speed). These findings are consistent with previous studies that used treadmill-based rehabilitation in pwMS with moderate EDSS [3, 25]. Similar to those previous reports [3, 25], in our study, gait speed improved by about 10% following treadmill training, and dual-tasking gait speed increased by about 10 cm/s, often considered the minimum clinically important difference for gait speed [39, 40]. However, in contrast to our expectations, the addition of VR to the treadmill training did not further enhance dual-tasking gait speed, beyond that seen with TT alone. Perhaps, subjects in the TT group spent less cognitive effort focusing on the motor task after the intervention, allowing for more cognitive resources to be available to attend to the motor task. Regardless of the exact mechanism, this finding is similar to that of a recent RCT in 39 pwMS, which showed that both conventional balance training and balance training with VR improved balance and mobility in pwMS, although each showed unique advantages [18]. However, to the best of our knowledge, our study is the first and the largest RCT in pwMS to examine the impact of adding VR to treadmill training on cognitive functioning and on gait measured over-ground in a laboratory setting and during daily living.

Following the intervention, subjects from both groups showed improvement in several aspects of daily-living mobility measures, e.g., “best” gait speed and daily-living step regularity [41]. These findings suggest that the treadmill training benefits transferred to certain aspects of over-ground walking in real-world, everyday settings. Nonetheless, the interventions did not change the subjects’ “typical” values of these measures during daily-living or daily-living step counts, perhaps because they are influenced by other behavioral and environmental factors, while the “best” performance during daily-living did improve. This “best” performance may reflect capacity, suggesting that this ability improved, consistent with the in-lab results. TT + VR tended to have a better effect on step regularity (greater symmetry) during daily-living. This might be a reflection of the multi-modal intervention. Treadmill training with VR can positively impact certain aspects of daily-living ambulation, dual-tasking, and cognitive function. Thus, in the future, it would be interesting to examine if this intervention reduces the risk of falls, a common problem among pwMS that has been related to cognitive and motor deficits, and disability [12,13,14,15]. In addition, the results suggest that to change everyday behavior and increase daily-living physical activity, treadmill training should, perhaps, be combined with some form of behavioral intervention that promotes physical activity.

Three months after training, SDMT scores tended to be higher than the baseline values in both groups (recall Fig. 3). However, only the TT + VR group maintained the improvement in most of the cognitive tests. These findings align with recent studies showing the potential of VR to promote cognitive rehabilitation among older adults [42] and executive and visual–spatial abilities in pwMS [43]. Furthermore, previous studies that used motor–cognitive training in pwMS reported improvement in functional tests [44]. Consistent with that, we observed greater improvement in the functional short, fast walk test (T25FW) in the combined training group (TT + VR), which was retained after the follow-up period, apparently slowing the reported natural annual deterioration of 1.8 s on the T25FW [45], at least for 3 months. Yet, the improvements achieved in the T25FW and the 6MWT cannot be considered clinically meaningful [46], suggesting perhaps that treadmill training improved gait quality, but that other approaches are needed to significantly impact cardiovascular function and endurance.

Consistent with other work [47], treadmill training alone—but not TT + VR—slightly improved self-perceptions of mobility (MSWS-12). Perhaps, focusing on the gait pattern (without the VR interference) translated to a better physical self-perception. Nonetheless, the TT + VR group showed improvements in the mental component of QOL and a substantial reduction in depressive symptoms; these changes were not seen after TT alone. These positive effects of VR training on mood and depression are supported by previous studies in older adults [42], in people with neurological disorders (e.g., stroke) [48], and, recently, in pwMS [43]. Our results underscore the links between physical function, cognition, and mental aspects of quality of life.

Interestingly, subjects with mild disease and those with more advanced disease both benefited from the training. This question should be further explored in larger cohorts of pwMS with more advanced disease. The disparate effects of TT and TT + VR on SDMT suggest that the improvement in cognitive processing speed among those who trained with TT + VR was not simply related to dual-task gait improvements. This parallels cross-sectional findings [49]. In that regard, it would be interesting to perform mediation analyses to examine the relationships between improvements in SDMT, gait, and other outcomes like mental health.

This study has several limitations. For example, we attempted to match the training in both groups, except for the VR addition; however, during the TT sessions, the participants in the TT group had more time to concentrate on their gait in the absence of the VR component. This may have masked some of the VR advantages. The emergence of the COVID-19 pandemic made recruitment and training challenging, causing dropouts and limited retention for follow-up. Moreover, the pandemic likely affected daily-living activity and may have reduced the effects of the interventions, especially on real-world ambulation. Hence, the impact of the treadmill training with and without VR addition on daily-living gait in pwMS should be further explored. In the future, it may also be informative to evaluate the effects of treadmill training with and without VR on brain structure and function and other markers of disease burden in pwMS and to assess whether the impact of TT and TT + VR on dual-tasking gait speed depends on the specifics of the dual-task [49]. The present work sets the stage for these future studies.

In summary, to the best of our knowledge, this is the largest RCT among pwMS to demonstrate that treadmill training with or without the addition of a VR component positively affects several aspects of gait and mobility. The results of this RCT support the idea that treadmill training, with or without VR, should be recommended to people with relapsing–remitting MS, across a range of disability levels to improve their gait. Further, the addition of VR appears to confer important benefits for cognitive function and mental health. Thus, the present RCT results underscore the idea that both motor and cognitive function can be improved among pwMS, even among patients with relatively advanced disease. Future studies should further explore the long-term effects of this combined motor-cognitive training, perhaps as a disease-modifying therapy [50], and the possibility of slowing down or reversing the natural course of the disease in mild and more advanced patients.